[Bug 10891] New: grep extremely slow

bugzilla at busybox.net bugzilla at busybox.net
Fri Mar 23 14:00:08 UTC 2018


https://bugs.busybox.net/show_bug.cgi?id=10891

            Bug ID: 10891
           Summary: grep extremely slow
           Product: Busybox
           Version: 1.27.x
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Other
          Assignee: unassigned at busybox.net
          Reporter: tim.ruehsen at gmx.de
                CC: busybox-cvs at busybox.net
  Target Milestone: ---

Busybox's grep performs very slowly during a 'make syntax-check' run.

I tracked it down to a single grep invocation that takes 0.28s with GNU grep
and 47s with busybox's grep.

There are 770 patterns, all of the form
^ *# *(define|undef)  *AI_ADDRCONFIG\>

What changes from pattern to pattern is only the 'AI_ADDRCONFIG' part.
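
For anyone who wants to reproduce this, a pattern list of the same shape can
be generated with a few lines of Python. This is only a sketch: the names
below are placeholders, since the 770 real identifiers are not listed in this
report, and make_patterns() is a helper name made up for the example.

```python
# Placeholder identifiers; the report's 770 real names are not given here.
names = ["AI_ADDRCONFIG", "AI_ALL", "AI_CANONNAME"]


def make_patterns(names):
    """Build one ERE per name, in the form quoted in the report."""
    return [r"^ *# *(define|undef)  *" + name + r"\>" for name in names]


patterns = make_patterns(names)
# One pattern per line, suitable as input for 'grep -E -f -'.
print("\n".join(patterns))
```

The printed lines can be piped into the grep command in place of the pattern
file.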

The number of files doesn't matter. When concatenating all 5700 files into one
file (~2M, mostly C sources), the run time stays as long as with all 5700
separate files.

The grep command line is like 'cat patterns|grep -E -f - file(s)'.

I assume GNU grep applies a very simple optimization: if all patterns begin
with the same sequence and the rest is a simple string, it reduces them to one
pattern plus a list of memcmp() calls. So the extra code shouldn't be too much,
I guess.
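
To illustrate the idea (this is not GNU grep's actual implementation, which I
have not checked), here is a minimal Python sketch. It matches the shared
prefix once per line and then looks the identifier up in a set, which plays
the role of the memcmp() list. It assumes every variable part is a plain C
identifier; matching_lines() and the sample lines are made up for the example.

```python
import re

# Shared part of every pattern in the report: '^ *# *(define|undef)  *'.
# The group captures the whole identifier, which stands in for the
# trailing '\>' (end-of-word) anchor of the original patterns.
PREFIX = re.compile(r"^ *# *(?:define|undef) +([A-Za-z_][A-Za-z0-9_]*)")


def matching_lines(lines, names):
    """Return the lines that define or undefine one of `names`."""
    wanted = set(names)  # one set lookup instead of one regex run per name
    hits = []
    for line in lines:
        m = PREFIX.match(line)
        if m and m.group(1) in wanted:
            hits.append(line)
    return hits


# Made-up sample input, for illustration only.
sample = [
    "#define AI_ADDRCONFIG 0x0400",
    "  # undef  AI_ALL",
    "#define AI_ADDRCONFIG_EXTRA 1",
    "int x = AI_ADDRCONFIG;",
]
result = matching_lines(sample, ["AI_ADDRCONFIG", "AI_ALL"])
```

With this approach the cost per line no longer grows with the number of
patterns, as long as they all share the same prefix.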

Would be lovely to see this built in.


