Bug #5366
closedstrcoll_l may destroy its arguments, then crash
100%
Description
#include <stdio.h> #include <locale.h> /* * LC_ALL=en_US.UTF-8 ./foo * * ... * * status: process terminated by SIGSEGV (Segmentation Fault), addr=0 * > $C * $C * 080478f8 libc.so.1`strcmp+0xe7(0, 8050e28, a, 80479b0, fee01a00, fefcebe9) * 080479f8 libc.so.1`strcoll_l+0x208(0, 8050e28, fee01a00, fee01a20, 8047a48) * 08047a18 libc.so.1`strcoll+0x34(8050e2f, 8050e25) * 08047a48 main+0x40(1, 8047a70, 8047a78, 8050b8e, 8050e00, 0) * 08047a64 _start+0x83(1, 8047b58, 0, 8047b5e, 8047b9f, 8047c24) */ int main(int argc, char **argv) { setlocale(LC_ALL, ""); printf("%d\\n", strcoll("wcswidth", "loc\\341ltime")); }
This crashes because strcoll_l (which strcoll uses) passes its arguments directly to mbsrtowcs_l, which modifies its argument. This means in the error case, we pass NULL (and a partial string) to strcmp(), and crash.
Even if we didn't crash, modifying the arguments given to strcoll_l is bad. We should give it a variable to modify (which we then ignore), rather than passing the address of our parameters:
like:
diff --git a/usr/src/lib/libc/port/locale/strcoll.c b/usr/src/lib/libc/port/locale/strcoll.c index 55cf459..0b2a893 100644 --- a/usr/src/lib/libc/port/locale/strcoll.c +++ b/usr/src/lib/libc/port/locale/strcoll.c @@ -54,6 +54,9 @@ strcoll_l(const char *s1, const char *s2, locale_t loc) wchar_t *w1 = NULL, *w2 = NULL; size_t sz1, sz2; const struct lc_collate *lcc = loc->collate; + /* Copies that mbsrtocws_l can't trash */ + const char *ss1 = s1; + const char *ss2 = s2; mbstate_t mbs1 = { 0 }; /* initial states */ mbstate_t mbs2 = { 0 }; @@ -89,10 +92,10 @@ strcoll_l(const char *s1, const char *s2, locale_t loc) goto error; } - if ((mbsrtowcs_l(w1, &s1, sz1, &mbs1, loc)) == (size_t)-1) + if ((mbsrtowcs_l(w1, &ss1, sz1, &mbs1, loc)) == (size_t)-1) goto error; - if ((mbsrtowcs_l(w2, &s2, sz2, &mbs2, loc)) == (size_t)-1) + if ((mbsrtowcs_l(w2, &ss2, sz2, &mbs2, loc)) == (size_t)-1) goto error; ret = wcscoll_l(w1, w2, loc);
Updated by Rich Lowe over 8 years ago
If a guess on jeffpc's part is correct, and it seems likely: we should just use mbstowcs_l, the difference being the 'r' meaning 'restartable' and implying the argument stomping.
diff --git a/usr/src/lib/libc/port/locale/strcoll.c b/usr/src/lib/libc/port/locale/strcoll.c index 55cf459..1aaace6 100644 --- a/usr/src/lib/libc/port/locale/strcoll.c +++ b/usr/src/lib/libc/port/locale/strcoll.c @@ -55,9 +55,6 @@ strcoll_l(const char *s1, const char *s2, locale_t loc) size_t sz1, sz2; const struct lc_collate *lcc = loc->collate; - mbstate_t mbs1 = { 0 }; /* initial states */ - mbstate_t mbs2 = { 0 }; - if (lcc->lc_is_posix) return (strcmp(s1, s2)); @@ -89,10 +86,10 @@ strcoll_l(const char *s1, const char *s2, locale_t loc) goto error; } - if ((mbsrtowcs_l(w1, &s1, sz1, &mbs1, loc)) == (size_t)-1) + if ((mbstowcs_l(w1, s1, sz1, loc)) == (size_t)-1) goto error; - if ((mbsrtowcs_l(w2, &s2, sz2, &mbs2, loc)) == (size_t)-1) + if ((mbstowcs_l(w2, s2, sz2, loc)) == (size_t)-1) goto error; ret = wcscoll_l(w1, w2, loc);
Updated by Garrett D'Amore over 8 years ago
Hmm... this uncovers another problem or two, which is that mbstowcs() uses a static mbstate_t, and doesn't reinitialize it. That's a bug! (Some functions are supposed to use global state and preserve across calls. mbstowcs -- at least according to POSIX -- is not one of them -- this is probably an error on my part.)
(I also noticed that apparently the man page for mbstowcs points at mbsrtowcs, but lacks any documentation for the non-reentrant version. Ooops!)
If I read this properly, I think yes, strcoll() could use mbstowcs() (and strcoll_l() could use mbstowcs_l()) -- provided that the mbstate_t bug in mbstowcs is fixed.
(Probably few people notice this problem, because we don't really update the mbstate_t structures for most encodings.)
Thanks for finding this problem -- I have to take complete blame for screwing this up, as I wrote the faulty code.
I'm happy to take this bug and fix, or to review work by others.
Updated by Rich Lowe over 8 years ago
I'm not seeing the mbstowcs() bug you mean. mbstowcs() just calls the _l variant, and neither 'initial' nor 'mbs' in the _l variant are pointers, so that's structure assignment and a copy.
That said, that also means that's a bit useless, so we shouldn't do it either way.
Updated by Garrett D'Amore over 8 years ago
Ah, I looked again -- the mbs is assigned to from initial (a copy is made), so its ok. Thanks for clarifying that.
Updated by Garrett D'Amore over 8 years ago
And, just for the record, your change looks good.
Updated by Electric Monk over 8 years ago
- Status changed from New to Closed
- % Done changed from 40 to 100
git commit 83ff55dcd7fc7c3356d7b2d3f67ec99970728d9b
commit 83ff55dcd7fc7c3356d7b2d3f67ec99970728d9b Author: Richard Lowe <richlowe@richlowe.net> Date: 2014-12-03T20:13:41.000Z 5366 strcoll_l may destroy its arguments, then crash Reviewed by: Garrett D'Amore <garrett@damore.org> Reviewed by: Robert Mustacchi <rm@joyent.com> Approved by: Dan McDonald <danmcd@omniti.com>