Feature #13040
closedgrep can leverage bmg in more cases
100%
Description
In the internal implementation, there are basically two engines that are used by grep:
1. regcomp(3C) and regexec(3C)
2. Boyer-Moore-Gosper algorithm (bmg)
Right now the use of the bmg algorithm is limited to fgrep with certain constraints, such as a single pattern, not using the -i
or -v
flag, etc. However, there are a large number of grep and egrep patterns that don't have any special characters that imply things in a regular expression. As such, if we encounter a pattern that doesn't use any of the special characters, then this dramatically reduces the cost of grep.
For example, if you take a look at some of the cases of #12825, there are a few pessimal cases. While the current bmg algorithm doesn't handle -v, if we use the two examples provided by Jonathan and Periklis, we can see some dramatic improvements. In this case the ./grep
invocation is the one that has enabled this.
$ ptime grep -c aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa testfile 200000 real 1.921856648 user 1.909033754 sys 0.011928132 $ ptime ./grep -c aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa testfile 200000 real 0.046147721 user 0.034433071 sys 0.011271350
This doesn't help all of the cases that #12825 brings up, but it at least improves things a bit.
Related issues
Updated by Robert Mustacchi almost 2 years ago
- Related to Bug #12825: grep -x is very slow added
Updated by Robert Mustacchi almost 2 years ago
I tested this with the updated grep test suite, test run below. I also spot checked the performance of a number of different simple patters as included above. Finally, I ran a wsdiff that compared this workspace being built on an onu'd platform with all of the grep changes to one without. The only differences were limited to the known set related to DOF and a few timestamps that have snuck in over the years.
rm@beowulf:~$ pfexec /opt/util-tests/bin/utiltest Test: /opt/util-tests/tests/allowed-ips (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/chown_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/date_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/find/findtest (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/grep_test (run as root) [00:03] [PASS] Test: /opt/util-tests/tests/libjedec_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libsff/libsff (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/make_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/mdb/mdbtest (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/mergeq/mqt (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/mergeq/wqt (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/printf_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/set-linkprop (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/sleep/sleeptest (run as root) [00:22] [PASS] Test: /opt/util-tests/tests/smbios (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/xargs_test (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/awk/runtests.sh (run as nobody) [02:43] [PASS] Test: /opt/util-tests/tests/ctf/precheck (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/ctf/ctftest (run as root) [00:06] [PASS] Test: /opt/util-tests/tests/demangle/afl-fast (run as root) [00:01] [PASS] Test: /opt/util-tests/tests/demangle/gcc-libstdc++ (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/demangle/llvm-stdcxxabi (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libcustr/custr_remove (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libcustr/custr_trunc (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_00_blank (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_01_boolean (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_02_numbers (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_03_empty_arrays (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_04_number_arrays (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_05_strings (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_06_nested (run as root) [00:00] [PASS] Test: /opt/util-tests/tests/libnvpair_json/json_07_nested_arrays (run as root) [00:00] [PASS] Results Summary PASS 32 Running Time: 00:03:21 Percent passed: 100.0% Log directory: /var/tmp/test_results/20200819T141521
Updated by Electric Monk almost 2 years ago
- Status changed from New to Closed
- % Done changed from 90 to 100
git commit dbe930bf51e0d7458b24d30e9f25756c5da54ddf
commit dbe930bf51e0d7458b24d30e9f25756c5da54ddf Author: Robert Mustacchi <rm@fingolfin.org> Date: 2020-08-25T23:00:10.000Z 13040 grep can leverage bmg in more cases Reviewed by: Andy Fiddaman <andy@omniosce.org> Reviewed by: Toomas Soome <tsoome@me.com> Approved by: Gordon Ross <gordon.w.ross@gmail.com>