sh: mishandles backslash as last character of a block of input

Added by Zack Weinberg 15 days ago. Updated 1 day ago.

Some versions of Illumos /bin/sh read script files in 8192-byte chunks. If the last character of one of these chunks is a backslash, it may be mishandled.
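
To check whether a given script is exposed to this, one can look for backslashes sitting on the last byte of a chunk. The following is a minimal sketch using only POSIX od, tr, and awk. The function name boundary_backslashes is hypothetical, and the 8192-byte default is taken from the description above.

```shell
# Print the zero-based offset of every backslash that is the last byte of a
# chunk in the named file. The chunk size defaults to 8192 bytes.
# 92 is the ASCII code for backslash.
boundary_backslashes() {
    od -A n -v -t u1 "$1" |
        tr -s ' ' '\n' |
        awk -v chunk="${2:-8192}" '
            NF { n++; if (n % chunk == 0 && $1 == 92) print n - 1 }'
}
```

Calling boundary_backslashes on a script prints one offset per affected boundary and nothing at all if no chunk ends in a backslash. An empty result only rules out this particular layout; it says nothing about other input-handling problems.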

The specific bug I know about involves here-documents. If the last line of a here-document straddles a chunk boundary, the last character of the first chunk is a backslash, the first character of the second chunk is a dollar sign, and the line itself does not end with a backslash, then the here-document is written out without its final newline. (I'm not sure whether the boundary has to fall within the last line of the here-document, or whether the first character of the second chunk has to be a dollar sign. I do know that \# at the boundary does not trigger the bug, though.)
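
The layout described above can be sketched with a small generator. This is not the attached generator: the output file name, the CHUNKTEST_OUT variable, the single-comment-line padding, and the body of the generated test are all assumptions, and the result has not been checked against an affected shell. It only guarantees that the backslash lands on the last byte of the first 8192-byte chunk and the dollar sign on the first byte of the second.

```shell
# Hypothetical generator sketch (not the attached program).
chunk=8192                                    # chunk size from this report
out=${CHUNKTEST_OUT:-./chunk-boundary-test.sh}

# Code that precedes the here-document body. Command substitution strips
# the trailing newline, so it is added back (and counted) below.
prologue=$(cat <<'END'
a=$(mktemp) b=$(mktemp)
trap 'rm -f "$a" "$b"' 0
printf '$\n' > "$a"
cat > "$b" <<EOF
END
)

# Bytes of padding needed so that exactly chunk-1 bytes precede the backslash.
pad=$((chunk - 1 - ${#prologue} - 1))

{
    # One long comment line: '#', then pad-2 filler characters, then newline.
    printf '#'
    printf "%$((pad - 2))s" '' | tr ' ' 'x'
    # Prologue, its newline, then the backslash at offset chunk-1.
    printf '\n%s\n\\' "$prologue"
    # The dollar sign is the first byte of the second chunk.
    cat <<'END'
$
EOF
cmp "$a" "$b" || { od -t x1 "$a"; od -t x1 "$b"; exit 1; }
END
} > "$out"
```

On a shell without the bug, the generated script should write the same two bytes ($ and newline) to both temporary files and exit 0. Changing chunk, or the character that follows the backslash, is an easy way to probe the open questions in the parenthetical above.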

Because this appears to be a problem with low-level input handling, it's possible that this is not the only way backslash at a chunk boundary can cause a shell script to malfunction.

The attached file is a self-contained reproducer for the here-document bug. If you uncompress it and run it with a shell that has this bug, it will print something like this:

cmp: EOF on /tmp/tmp.LCyimJxG8v after byte 1, in line 1
+ xxd /tmp/tmp.q5iPccsLOn
00000000: 240a                                     $.
+ xxd /tmp/tmp.LCyimJxG8v
00000000: 24                                       $
+ exit 1
+ rm -f /tmp/tmp.q5iPccsLOn /tmp/tmp.LCyimJxG8v

Note that the script does not have a #! line. Also, you need to have xxd for it to work as intended, but both invocations of this utility are after the critical offset in the file, so you can safely change them to any other hex-dump program you may happen to have around without breaking the test. I have also attached the program I used to generate the test script, which may be easier to tinker with. The name t-001fa2 refers to the number of padding characters in the script, before the actual code.
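
If xxd is not installed, POSIX od is one possible substitute for the two hex-dump calls. The sketch below wraps it in a function; the name hexdump_file is hypothetical, and the output format differs from xxd (no ASCII column).

```shell
# Stand-in for xxd: hexadecimal offsets, one hex byte per column.
hexdump_file() {
    od -A x -t x1 "$1"
}
```

For a file containing a dollar sign and a newline, the dump shows the bytes 24 0a. For the truncated file produced by the bug, it shows only 24.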

This bug was originally reported as a problem running the Autoconf test suite on OmniOS. See for further discussion.

I've confirmed that this issue is resolved by the upgraded version of ksh that I am working on in #13405.

% ksh ~/ksh/chunktest
cmp: EOF on /tmp/tmp_12.tl9
+ xxd /tmp/tmp_1u.end
00000000: 240a                                     $.
+ xxd /tmp/tmp_12.tl9
00000000: 24                                       $
+ exit 1
+ rm -f /tmp/tmp_1u.end /tmp/tmp_12.tl9

% LD_LIBRARY_PATH=lib:usr/lib usr/bin/i86/ksh ~/ksh/old/test
% LD_LIBRARY_PATH=lib/64:usr/lib/64 usr/bin/amd64/ksh ~/ksh/old/test

