Bug #5158


sed dumps core in new multibyte code

Added by Dan McDonald over 7 years ago. Updated over 7 years ago.

cmd - userland programs
Start date:
Due date:
% Done:


Estimated time:
Gerrit CR:


Try this:

echo "x" | /usr/bin/sed 'y%,%,%'

on any recent illumos or derivative (so far, OmniOS bloody, and OI hipster are affected), and you get a seg fault, with this stack:

core 'core' of 1739:    /usr/bin/sed y%,%,%
feeb68f9 _UTF8_mbsnrtowcs (0, 8046e5c, ffffffff, 0, 0, 3) + 2d
feea8e9d mbsrtowcs_l (0, 8046e5c, 0, 0, fee01a00, 8067d66) + 35
feea8ee1 mbsrtowcs (0, 8046e5c, 0, 0, 0, 0) + 42
08053f2e compile_tr (8067d61, 806a274, 0, fefcac27, fee10048, 38) + 132
0805493c compile_stream (806a260, 804776c, 8047738, 80531aa, 1, 80478c9) + 7b8
08054af3 compile  (0, 3, 0, 0, 3, feffb0a4) + 10
080531b5 main     (804772c, fef72668, 8047760, 805251b, 2, 804776c) + 1a8
0805251b _start   (2, 80478bc, 80478c9, 0, 80478d0, 80478e2) + 83

Related issues

Has duplicate illumos gate - Bug #5448: sed core dumps on y commandRejected2014-12-19

Actions #1

Updated by Dan McDonald over 7 years ago

The "ps" argument (arg4, aka. the 5th argument) to _UTF8_mbsnrtowcs() is NULL, which causes the coredump.

How it gets to be NULL with all of the trampolining is another question.

It appears to be a problem with the en_US.UTF-8 locale, and probably other non-C ones as well. If the LANG is set to C, sed doesn't dump core.

Actions #2

Updated by Garrett D'Amore over 7 years ago

ps is allowed to be NULL, in which case an "internal" state variable should be used.

This doesn't look like anything "new" in the sense that I think it dates pretty much all the way back to the first version of illumos. The utf8.c code in libc needs to check for ps == null, and if it is so, use an internal static state.

The simplest way to handle this would actually be to change utf8.c to check for ps (us in that case) being NULL, and ignore it in that case.

That said, I'm starting to think my handling of this conversion state is not quite right. I think there may be cases where we need to support leaving behind a conversion state that is not "the initial" state.

I'll think about this and update more info shortly.

Actions #3

Updated by Garrett D'Amore over 7 years ago

Fix is easy:

 mbsrtowcs_l(wchar_t *_RESTRICT_KYWD dst, const char **_RESTRICT_KYWD src,
     size_t len, mbstate_t *_RESTRICT_KYWD ps, locale_t loc)
+    static mbstate_t mbs;
+    if (ps == NULL)
+        ps = &mbs;
     return (loc->ctype->lc_mbsnrtowcs(dst, src, ULONG_MAX, len, ps));
Actions #4

Updated by Electric Monk over 7 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

git commit aa64fa16549f13233681f4b40e50fb5b1c18a97c

commit  aa64fa16549f13233681f4b40e50fb5b1c18a97c
Author: Garrett D'Amore <>
Date:   2014-12-20T04:45:07.000Z

    5158 sed dumps core in new multibyte code
    Reviewed by: Toomas Soome <>
    Reviewed by: Robert Mustacchi <>
    Approved by: Richard Lowe <>


Also available in: Atom PDF