Project

General

Profile

Bug #5366

strcoll_l may destroy its arguments, then crash

Added by Rich Lowe over 5 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
lib - userland libraries
Start date:
2014-11-21
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
Gerrit CR:

Description

#include <stdio.h>
#include <locale.h>

/*
 * LC_ALL=en_US.UTF-8 ./foo
 *
 * ...
 *
 * status: process terminated by SIGSEGV (Segmentation Fault), addr=0
 * > $C
 * $C
 * 080478f8 libc.so.1`strcmp+0xe7(0, 8050e28, a, 80479b0, fee01a00, fefcebe9)
 * 080479f8 libc.so.1`strcoll_l+0x208(0, 8050e28, fee01a00, fee01a20, 8047a48)
 * 08047a18 libc.so.1`strcoll+0x34(8050e2f, 8050e25)
 * 08047a48 main+0x40(1, 8047a70, 8047a78, 8050b8e, 8050e00, 0)
 * 08047a64 _start+0x83(1, 8047b58, 0, 8047b5e, 8047b9f, 8047c24)
 */

int
main(int argc, char **argv)
{
    setlocale(LC_ALL, "");
    printf("%d\\n", strcoll("wcswidth", "loc\\341ltime"));
}

This crashes because strcoll_l (which strcoll uses) passes its arguments directly to mbsrtowcs_l, which modifies its argument. This means in the error case, we pass NULL (and a partial string) to strcmp(), and crash.

Even if we didn't crash, modifying the arguments given to strcoll_l is bad. We should give it a variable to modify (which we then ignore), rather than passing the address of our parameters:

like:

diff --git a/usr/src/lib/libc/port/locale/strcoll.c b/usr/src/lib/libc/port/locale/strcoll.c
index 55cf459..0b2a893 100644
--- a/usr/src/lib/libc/port/locale/strcoll.c
+++ b/usr/src/lib/libc/port/locale/strcoll.c
@@ -54,6 +54,9 @@ strcoll_l(const char *s1, const char *s2, locale_t loc)
     wchar_t *w1 = NULL, *w2 = NULL;
     size_t sz1, sz2;
     const struct lc_collate *lcc = loc->collate;
+    /* Copies that mbsrtocws_l can't trash */
+    const char *ss1 = s1;
+    const char *ss2 = s2;

     mbstate_t mbs1 = { 0 };    /* initial states */
     mbstate_t mbs2 = { 0 };
@@ -89,10 +92,10 @@ strcoll_l(const char *s1, const char *s2, locale_t loc)
             goto error;
     }

-    if ((mbsrtowcs_l(w1, &s1, sz1, &mbs1, loc)) == (size_t)-1)
+    if ((mbsrtowcs_l(w1, &ss1, sz1, &mbs1, loc)) == (size_t)-1)
         goto error;

-    if ((mbsrtowcs_l(w2, &s2, sz2, &mbs2, loc)) == (size_t)-1)
+    if ((mbsrtowcs_l(w2, &ss2, sz2, &mbs2, loc)) == (size_t)-1)
         goto error;

     ret = wcscoll_l(w1, w2, loc);

History

#1

Updated by Rich Lowe over 5 years ago

If a guess on jeffpc's part is correct, and it seems likely: we should just use mbstowcs_l, the difference being the 'r' meaning 'restartable' and implying the argument stomping.

diff --git a/usr/src/lib/libc/port/locale/strcoll.c b/usr/src/lib/libc/port/locale/strcoll.c
index 55cf459..1aaace6 100644
--- a/usr/src/lib/libc/port/locale/strcoll.c
+++ b/usr/src/lib/libc/port/locale/strcoll.c
@@ -55,9 +55,6 @@ strcoll_l(const char *s1, const char *s2, locale_t loc)
     size_t sz1, sz2;
     const struct lc_collate *lcc = loc->collate;

-    mbstate_t mbs1 = { 0 };    /* initial states */
-    mbstate_t mbs2 = { 0 };
-
     if (lcc->lc_is_posix)
         return (strcmp(s1, s2));

@@ -89,10 +86,10 @@ strcoll_l(const char *s1, const char *s2, locale_t loc)
             goto error;
     }

-    if ((mbsrtowcs_l(w1, &s1, sz1, &mbs1, loc)) == (size_t)-1)
+    if ((mbstowcs_l(w1, s1, sz1, loc)) == (size_t)-1)
         goto error;

-    if ((mbsrtowcs_l(w2, &s2, sz2, &mbs2, loc)) == (size_t)-1)
+    if ((mbstowcs_l(w2, s2, sz2, loc)) == (size_t)-1)
         goto error;

     ret = wcscoll_l(w1, w2, loc);
#2

Updated by Garrett D'Amore over 5 years ago

Hmm... this uncovers another problem or two, which is that mbstowcs() uses a static mbstate_t, and doesn't reinitialize it. That's a bug! (Some functions are supposed to use global state and preserve across calls. mbstowcs -- at least according to POSIX -- is not one of them -- this is probably an error on my part.)

(I also noticed that apparently the man page for mbstowcs points at mbsrtowcs, but lacks any documentation for the non-reentrant version. Ooops!)

If I read this properly, I think yes, strcoll() could use mbstowcs() (and strcoll_l() could use mbstowcs_l()) -- provided that the mbstate_t bug in mbstowcs is fixed.

(Probably few people notice this problem, because we don't really update the mbstate_t structures for most encodings.)

Thanks for finding this problem -- I have to take complete blame for screwing this up, as I wrote the faulty code.

I'm happy to take this bug and fix, or to review work by others.

#3

Updated by Rich Lowe over 5 years ago

I'm not seeing the mbstowcs() bug you mean. mbstowcs() just calls the _l variant, and neither 'initial' nor 'mbs' in the _l variant are pointers, so that's structure assignment and a copy.

That said, that also means that's a bit useless, so we shouldn't do it either way.

#4

Updated by Garrett D'Amore over 5 years ago

Ah, I looked again -- the mbs is assigned to from initial (a copy is made), so its ok. Thanks for clarifying that.

#5

Updated by Garrett D'Amore over 5 years ago

And, just for the record, your change looks good.

#6

Updated by Electric Monk over 5 years ago

  • Status changed from New to Closed
  • % Done changed from 40 to 100

git commit 83ff55dcd7fc7c3356d7b2d3f67ec99970728d9b

commit  83ff55dcd7fc7c3356d7b2d3f67ec99970728d9b
Author: Richard Lowe <richlowe@richlowe.net>
Date:   2014-12-03T20:13:41.000Z

    5366 strcoll_l may destroy its arguments, then crash
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Reviewed by: Robert Mustacchi <rm@joyent.com>
    Approved by: Dan McDonald <danmcd@omniti.com>

Also available in: Atom PDF