ksh93 builtin *grep -v mishandles blank lines, blows up libgcrypt-config
I'll start with the reduced test case. Note that blank lines matter in the following:
$ type grep
grep is a shell builtin version of /usr/xpg4/bin/grep
$ echo "" > /tmp/oneline
$ grep -v aa /tmp/oneline
$ grep -v aaa /tmp/oneline
$ /usr/bin/grep -v aaa /tmp/oneline
We create a file consisting of a single blank line. grep -v is supposed to print all lines not matching a pattern. It behaves correctly when the pattern is "aa", but fails to print the blank line when the pattern is "aaa".
This is clearly erroneous - for any input, "all lines not matching aaa" should be a superset of "all lines not matching aa".
I haven't fully characterized the boundary of where it's broken but it looks like in some cases it loses one blank line in the output if an inverted literal pattern is more than two characters.
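A minimal sketch for probing that boundary, assuming a POSIX sh with mktemp available. The function name probe_grep_v and the list of test patterns are made up for illustration and are not part of any existing tool. It prints the pattern length followed by the number of lines grep -v produced for a file containing one blank line:

```shell
#!/bin/sh
# Hypothetical helper: for each literal pattern, count the lines that
# "grep -v PATTERN" prints when the input is a single blank line.
# A correct grep should report 1 line for every pattern listed here.
probe_grep_v() {
    grep_cmd=${1:-grep}              # which grep to test, e.g. /usr/bin/grep
    tmpfile=$(mktemp) || return 1
    echo "" > "$tmpfile"             # same input as /tmp/oneline above
    for pat in a aa aaa aaaa aaaaa; do
        count=$("$grep_cmd" -v "$pat" "$tmpfile" | wc -l | tr -d ' ')
        echo "${#pat} $count"        # pattern length, lines printed
    done
    rm -f "$tmpfile"
}

probe_grep_v grep
```

Running it once with plain grep and once with /usr/bin/grep as the argument should make any difference between the builtin and the external binary visible in the second column.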
Where the bug probably lives:
- The code for the builtin greps lives in usr/src/lib/libcmd/common/grep.c, using the regexp library in usr/src/lib/libast
- Configuration for which programs get silently replaced by shell builtins is in usr/src/lib/libshell/common/data/solaris_cmdlist.h
How did I notice this:
On openindiana-hipster, if /usr/xpg4/bin is in your path ahead of /usr/bin, the command "libgcrypt-config --libs" produces empty output (when it should have printed "-lgcrypt -lgpg-error").
libgcrypt-config contains several instances of code which looks like this:
for i in $libdirs $libs_final; do
if echo "$tmp" | fgrep -v -- "$i" >/dev/null; then
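which attempts to remove duplicates from its output while otherwise maintaining the original order. Presumably $tmp starts out empty, so the first iteration feeds a single blank line to fgrep -v, which is exactly the reduced test case above: the buggy builtin prints nothing, the test fails, and nothing is ever added to the output. I say "attempts" because this code also has a bug of its own: consider the case where libs_final contains "-laaaa -laa"; it will print only "-laaaa", because "-laa" is a substring of "-laaaa" and is therefore treated as already present. In practice it doesn't matter for libgcrypt-config, but who knows where this code was cut & pasted from or will be cut & pasted to...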
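For comparison, here is a rough sketch of an order-preserving de-duplication that compares whole words instead of substrings and does not call grep at all. This is not what libgcrypt-config does; the function name dedup_flags is hypothetical, and the sketch assumes the flags contain no embedded spaces:

```shell
#!/bin/sh
# Hypothetical helper: print the arguments with duplicates removed,
# keeping the first occurrence of each and preserving order.
dedup_flags() {
    result=""
    for flag in "$@"; do
        # Pad both sides with spaces so only whole-word matches count;
        # "-laa" is then not considered present just because "-laaaa" is.
        case " $result " in
            *" $flag "*) ;;                              # already present
            *) result="${result:+$result }$flag" ;;      # append new flag
        esac
    done
    printf '%s\n' "$result"
}

dedup_flags -laaaa -laa -laaaa
# prints: -laaaa -laa
```

Because the comparison happens in a case statement, the result does not depend on which grep the shell happens to pick up.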
And I was looking at that because I was investigating why the netatalk I built blew up with:
uam_load(uams_dhx2.so): failed to load: ld.so.1: afpd: fatal: relocation error: file .../uams_dhx2.so: symbol gcry_mpi_release: referenced symbol not found
I first discovered that changing the libgcrypt-config script to run with bash instead of sh caused it to behave properly, which showed that the problem was specific to the shell.
I later discovered that I was not the first person down this particular path. See https://github.com/joyent/pkgsrc-legacy/issues/21 which is clearly the exact same bug in joyent's pkgsrc builds, found for a similar reason (a program was built missing a library dependency).
The submitter of that bug went down a few blind alleys and probably misunderstood shell quoting and/or the intent of the code, but did include another clue: removing /usr/xpg4/bin from PATH makes the problem go away.
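As a stopgap, that clue can be turned into a small sketch that drops one directory from a PATH-style string. The function name path_without is hypothetical, and this is only a workaround under the assumption that nothing else in the build needs /usr/xpg4/bin; it does not fix the builtin. A leading empty PATH entry would be lost, which is acceptable for a sketch:

```shell
#!/bin/sh
# Hypothetical helper: print a colon-separated path list with every
# occurrence of one directory removed. The body runs in a subshell
# (note the parentheses), so the IFS and "set -f" changes do not leak.
path_without() (
    dir=$1
    path=${2-$PATH}        # defaults to the current PATH
    set -f                 # no globbing while splitting
    IFS=:
    new=""
    for entry in $path; do
        [ "$entry" = "$dir" ] && continue
        new="${new:+$new:}$entry"
    done
    printf '%s\n' "$new"
)

path_without /usr/xpg4/bin /usr/xpg4/bin:/usr/bin:/usr/sbin
# prints: /usr/bin:/usr/sbin
```

A single command could then be run as PATH=$(path_without /usr/xpg4/bin) libgcrypt-config --libs without changing the PATH of the calling shell.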
See also issue #3754 which is a more general complaint about surprising behavior of ksh93 shell builtins.
And https://xkcd.com/979/ for approximately how I felt halfway through digging through this.