Bug #7718

fprintf() output gets truncated when combined with setlinebuf()

Added by Dimitrios Apostolou over 1 year ago. Updated over 1 year ago.

Status:NewStart date:2017-01-02
Priority:NormalDue date:
Assignee:-% Done:

0%

Category:lib - userland libraries
Target version:-
Difficulty:Medium Tags:needs-triage

Description

fprintf() output gets truncated when combined with setlinebuf(). See the attached test, it gives me the following output:

$ gcc -std=c99 fprintf_truncation-2.c
$ ./a.out 
 verbose: length=240  234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 verbose: length=241  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
 verbose: length=242  23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901
 verbose: length=243  234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
 verbose: length=244  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
 verbose: length=245  23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234
 verbose: length=246  234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345
 verbose: length=247  234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345
 verbose: length=248  23456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567
 verbose: length=249  234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678

Notice how the output of length=247 is the same as length=246.

Disclaimer: I don't have an illumos installation, but I reproduced and researched this on Solaris 11. I couldn't find any better place to report this to.

fprintf_truncation-2.c Magnifier (934 Bytes) Dimitrios Apostolou, 2017-01-02 04:55 PM

History

#1 Updated by Dimitrios Apostolou over 1 year ago

truss is showing that all write() system calls are chunked in up to 128-byte chunks, but after the overflow has occured, some chunks are 129 bytes.

 verbose: length=240  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 123)     = 123
 verbose: length=241  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 124)     = 124
 verbose: length=242  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
8901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 125)     = 125
 verbose: length=243  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 126)     = 126
 verbose: length=244  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 127)     = 127
 verbose: length=245  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
8901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234
write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128)     = 128
 verbose: length=246  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128)     = 128

write(1, "\n", 1)                               = 1
 verbose: length=247  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
89012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 128)     = 128

write(1, "\n", 1)                               = 1
 verbose: length=248  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 129)    = 129
7
write(1, " 7\n", 2)                             = 2
 verbose: length=249  2345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567write(1, "   v e r b o s e :   l e".., 128)     = 128
890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456write(1, " 8 9 0 1 2 3 4 5 6 7 8 9".., 129)    = 129
78
write(1, " 7 8\n", 3)                           = 3

#2 Updated by Robert Mustacchi over 1 year ago

Thanks for writing this up. I've spent some time going through this and I think
I understand what's happening here. I've focused primarily on the case in your
program where length=247 is incorrect. Though I'm not 100% sure what the
right fix is at this time.

The first thing I went and verified was that we had the right data being passed
to fprintf and that it made it to _dowrite and _fwrite_unlocked correctly.
The good news, is that it does as we'd expect.

From here, I used a little bit of mdb single-stepping with some breakpoints
focusing on how we traverse through the writing of that string to the iob and
from there how it gets written out.

First some background (focusing on the 64-bit code path, though the same
behavior should be true for the 32-bit):

1. We enter fprintf()
2. We process the %s and call _dowrite() with our entire buffer
3. That gets translated into a single _fwrite_unlocked call.
4. We'll put the first 128 bytes into the iob buffer, at which point
we'll call _flsbuf(). Because we've written the entire buffer,
we'll flush it out via _xlsbuf(), and then recursively call
_flsbuf().
5. The _xlsbuf() call mentioned above writes out all the first 128
characters as expected.
6. The recursive _flsbuf() call is where things get interesting and
start going off the rails. We hit the case where we insert that value
into the buffer; however, we then just zero out _cnt.
7. We then writeout the remaining characters to the buffer. Now, the
interesting thing is that we reset _cnt to zero, but we have a character
in the buffer. This basically breaks the invariant of what it's
supposed to mean for _fwrite_unlocked(). This means that when we
finish our call note what we end up with for data:

> 411b40::print rm_file_t
{
    _ptr = 0x412555
    _base = 0x4124d4
    _end = 0x412554
    _cnt = 0xffffffffffffff80
    _file = 0x1
    _flag = 0x14a
...
> 0xffffffffffffff80=D
                -128            

So, _ptr is beyond _end! Normally this would be a memory buffer
violation, but we've allocated some extra memory to align things, so
ironically, we're actually okay and not writing to garbage (I think).
However, this is the start of where everything goes wrong.

8. We come back out of _dowrite() and are in the main _ndoprnt()
loop. At which point we process the '\n' case. This calls _dowrite()
first.

9. The _dowrite() call ends up calling _bufsync, which notices that
_ptr is beyond _end and sets _ptr to _end and it resets _cnt in the
process.

10. This ends up having us hit the same loop that we hit in 4. Because
we're at the end of the buffer, we end up going into the flush and
because it's a \n, we make sure that we flush everything. This is where
everthing goes wrong and we end up eliding the last character.

I think the key here is that we end up having enough characters to write
such that we have a 128 byte and 127 byte write followed by the 1
character new line flush. The fact that we set the _cnt to zero in
_flsbuf() is rather suspicious. There are two approaches that I see
here, though I'm not sure which would be better:

1. Set the value of _cnt after the _flsbuf() writing of the character
to mimic what _fwrite_unlocked() would expect.

2. Re-initialize the value of _cnt in _fwrite_unlocked() after a call
to _flsbuf(), basically saying that we don't trust it.

#3 Updated by Dimitrios Apostolou over 1 year ago

I appreciate the investigation! Where can I see the source code for the functions you are mentioning? Is there a git repository web frontend?

Also available in: Atom