Bug #7718
openfprintf() output gets truncated when combined with setlinebuf()
0%
Description
fprintf()
output gets truncated when combined with setlinebuf()
. See the attached test, it gives me the following output:
$ gcc -std=c99 fprintf_truncation-2.c $ ./a.out verbose: length=240 234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 verbose: length=241 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 verbose: length=242 23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901 verbose: length=243 234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012 verbose: length=244 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 verbose: length=245 23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234 verbose: length=246 234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345 verbose: length=247 234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345 verbose: length=248 23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567 verbose: length=249 234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
Notice how the output of length=247
is the same as length=246
.
Disclaimer: I don't have an illumos installation, but I reproduced and researched this on Solaris 11. I couldn't find any better place to report this to.
Files
Updated by Dimitrios Apostolou over 6 years ago
truss
is showing that all write()
system calls are chunked in up to 128-byte chunks, but after the overflow has occured, some chunks are 129 bytes.
verbose: length=240 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 123) = 123 verbose: length=241 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 124) = 124 verbose: length=242 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 8901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 125) = 125 verbose: length=243 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 126) = 126 verbose: length=244 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 127) = 127 verbose: length=245 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 8901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234 write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128) = 128 verbose: length=246 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128) = 128 write(1, "\n", 1) = 1 verbose: length=247 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128) = 128 write(1, "\n", 1) = 1 verbose: length=248 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 129) = 129 7 write(1, " 7\n", 2) = 2 verbose: length=249 2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, " v e r b o s e : l e".., 128) = 128 890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 129) = 129 78 write(1, " 7 8\n", 3) = 3
Updated by Robert Mustacchi over 6 years ago
Thanks for writing this up. I've spent some time going through this and I think
I understand what's happening here. I've focused primarily on the case in your
program where length=247 is incorrect. Though I'm not 100% sure what the
right fix is at this time.
The first thing I went and verified was that we had the right data being passed
to fprintf and that it made it to _dowrite
and _fwrite_unlocked
correctly.
The good news, is that it does as we'd expect.
From here, I used a little bit of mdb single-stepping with some breakpoints
focusing on how we traverse through the writing of that string to the iob and
from there how it gets written out.
First some background (focusing on the 64-bit code path, though the same
behavior should be true for the 32-bit):
1. We enter fprintf()
2. We process the %s and call _dowrite()
with our entire buffer
3. That gets translated into a single _fwrite_unlocked
call.
4. We'll put the first 128 bytes into the iob buffer, at which point
we'll call _flsbuf()
. Because we've written the entire buffer,
we'll flush it out via _xlsbuf()
, and then recursively call
_flsbuf()
.
5. The _xlsbuf()
call mentioned above writes out all the first 128
characters as expected.
6. The recursive _flsbuf()
call is where things get interesting and
start going off the rails. We hit the case where we insert that value
into the buffer; however, we then just zero out _cnt.
7. We then writeout the remaining characters to the buffer. Now, the
interesting thing is that we reset _cnt to zero, but we have a character
in the buffer. This basically breaks the invariant of what it's
supposed to mean for _fwrite_unlocked()
. This means that when we
finish our call note what we end up with for data:
> 411b40::print rm_file_t { _ptr = 0x412555 _base = 0x4124d4 _end = 0x412554 _cnt = 0xffffffffffffff80 _file = 0x1 _flag = 0x14a ... > 0xffffffffffffff80=D -128
So, _ptr is beyond _end! Normally this would be a memory buffer
violation, but we've allocated some extra memory to align things, so
ironically, we're actually okay and not writing to garbage (I think).
However, this is the start of where everything goes wrong.
8. We come back out of _dowrite()
and are in the main _ndoprnt
()
loop. At which point we process the '\n' case. This calls _dowrite()
first.
9. The _dowrite()
call ends up calling _bufsync
, which notices that
_ptr is beyond _end and sets _ptr to _end and it resets _cnt in the
process.
10. This ends up having us hit the same loop that we hit in 4. Because
we're at the end of the buffer, we end up going into the flush and
because it's a \n, we make sure that we flush everything. This is where
everthing goes wrong and we end up eliding the last character.
I think the key here is that we end up having enough characters to write
such that we have a 128 byte and 127 byte write followed by the 1
character new line flush. The fact that we set the _cnt to zero in_flsbuf()
is rather suspicious. There are two approaches that I see
here, though I'm not sure which would be better:
1. Set the value of _cnt after the _flsbuf()
writing of the character
to mimic what _fwrite_unlocked()
would expect.
2. Re-initialize the value of _cnt in _fwrite_unlocked()
after a call
to _flsbuf()
, basically saying that we don't trust it.
Updated by Dimitrios Apostolou over 6 years ago
I appreciate the investigation! Where can I see the source code for the functions you are mentioning? Is there a git repository web frontend?