Bug #780


gcc-built localedef(1) seems to not work

Added by Rich Lowe about 10 years ago. Updated about 10 years ago.

cmd - userland programs
Start date:
Due date:
% Done:


Estimated time:
Gerrit CR:


Trying a build using gcc as the primary compiler, localedef fails during the build (complaining about most input files, and eventually dumping core).

To reproduce:

% bldenv
% cd usr/src/cmd/localedef
% export __GNUC="" CW_NO_SHADOW=1
% dmake clobber
% dmake install
/usr/bin/perl data/ data/8859-1.TXT >
./localedef -U -i data/da_DK.UTF-8.src -f locale/da_DK.ISO8859-1
data/da_DK.UTF-8.src: 5790: warning: repeated item in order list (first on 1)

It appears to only need a localedef built with gcc to reproduce this failure (the build under localedef/ works fine if I 'unset __GNUC' in my otherwise gcc-built workspace).

Actions #1

Updated by Rich Lowe about 10 years ago

The crash:

> ::status
debugging core file of localedef (32-bit) from
file: /export/home/richlowe/illumos-gcc/usr/src/cmd/localedef/localedef
initial argv: 
./localedef -U -i data/ru_RU.UTF-8.src -f locale/ru_RU.KOI8-R
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault), addr=81241e4
> $C
08046d28 end_order+0x292(8046d50, fefc0674, fef80508, 170, 8058590, 806fda4)
08046d58 yyparse+0x737(806efc0, 1ed, 8046d98, 80584ec)
08046d98 main+0x217(7, 8046dc4, 8046de4, 8046db8, 805377a, 8059880)
08046db8 _start+0x83(7, 8046fdc, 8046fe8, 8046feb, 8046fee, 8047003)
> end_order+0x292::dis
end_order+0x273:                pushl  %edi
end_order+0x274:                movl   0x806ed10,%eax   <localedef`currundef>
end_order+0x279:                pushl  0x4(%eax,%esi,4)
end_order+0x27d:                call   -0x8de   <set_pri>
end_order+0x282:                addl   $0x10,%esp
end_order+0x285:                jmp    +0x17    <end_order+0x29e>
end_order+0x287:                subl   $0x4,%esp
end_order+0x28a:                pushl  $0x2
end_order+0x28c:                pushl  %ebx
end_order+0x28d:                movl   0x806ed10,%eax   <localedef`currundef>
end_order+0x292:                pushl  0x4(%eax,%esi,4)   <<<<
end_order+0x296:                call   -0x8f7   <set_pri>
end_order+0x29b:                addl   $0x10,%esp
movl   $-0x1,0x806ed60(,%esi,4) <localedef`order_weights>       <0xffffffff>
end_order+0x2a9:                jmp    +0x22    <end_order+0x2cd>
end_order+0x2ab:                subl   $0x4,%esp
end_order+0x2ae:                pushl  $0x284
end_order+0x2b3:                pushl  $0x805c37c
end_order+0x2b8:                pushl  $0x805c386
end_order+0x2bd:                call   -0x13d1  <PLT:gettext>
end_order+0x2c2:                movl   %eax,(%esp)

Looks to be

    case T_SYMBOL:
        if (((ref = order_weights[i]) < 0) ||
            ((p = get_pri(ref)) == NULL) ||
            (p->pri == -1)) {
            set_pri(currundef->ref[i], pri, RESOLVED)
        } else {
            set_pri(currundef->ref[i], ref, REFER);     <<<<
        order_weights[i] = -1;

Note how i isn't initialized in the T_SYMBOL case.

Actions #2

Updated by Rich Lowe about 10 years ago

diff --git a/usr/src/cmd/localedef/collate.c b/usr/src/cmd/localedef/collate.c
index 277967f..9e5536d 100644
--- a/usr/src/cmd/localedef/collate.c
+++ b/usr/src/cmd/localedef/collate.c
@@ -630,6 +630,7 @@ end_order(void)

     case T_SYMBOL:
+                for (i = 0; i < NUM_WT; i++) {
         if (((ref = order_weights[i]) < 0) ||
             ((p = get_pri(ref)) == NULL) ||
             (p->pri == -1)) {
@@ -638,6 +639,7 @@ end_order(void)
             set_pri(currundef->ref[i], ref, REFER);
         order_weights[i] = -1;
+                }


A hacky guess based purely on the other cases, no longer errors nor crashes.

Actions #3

Updated by Rich Lowe about 10 years ago

  • Status changed from New to In Progress
  • Assignee set to Rich Lowe
Actions #4

Updated by Garrett D'Amore about 10 years ago

Going through the code carefully, your fix looks good to me... this is surprising because I can't imagine how I could have botched this up so badly. (Actually, I think I may know how this happened, but its messy. :-)

I've also gone back and examined the fallout from this bug, and I think that in situations where i == 0, the code basically behaves correctly because of the way the grammar works, the symbols are only ever used with a single weight. (In fact, it isn't clear to me that a definition with multiple weights is even legal with a purely symbolic reference. It might be, and the reasonable behavior can be well-defined, but it might also be out of spec for the strict POSIX grammar. In any event, allowing it is harmless.) I think this may be why I didn't plan to use the for loop, but didn't clean up the code all the way.

So go ahead and proceed with this fix, and thanks for cleaning up my mess. :-)

Actions #5

Updated by Garrett D'Amore about 10 years ago

  • % Done changed from 0 to 50
Actions #6

Updated by Rich Lowe about 10 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 50 to 100

Resolved in r13303 commit:5e687b8be92a


Also available in: Atom PDF