Bug #132
closedtr breaks scripts
100%
Description
After building and booting Illumos, I tried starting up NetBeans, only to be greeted by:
russ@castle:~$ /usr/netbeans/bin/netbeans
expr: syntax error
/usr/netbeans/bin/netbeans[138]: [: argument expected
/usr/netbeans/bin/netbeans[140]: [: argument expected
Invalid maximum heap size: -Xmxm
Could not create the Java virtual machine.
russ@castle:~$
The culprit is the call to tr in the NetBeans startup script. If I use the Solaris or GNU versions of tr, the line:
prtconf | grep Memory | tr -dc '[0-9]'
gives me:
4064
with no spaces or newlines. If I use Illumos tr, I get the output:
: 4064
with a space at the beginning and end of the line and a newline afterwards.
Workaround: create a modified startup script that forces the use of GNU tr.
Files
Related issues
Updated by Garrett D'Amore over 11 years ago
- Priority changed from Normal to High
This incorrect behavior is confirmed; legacy Solaris tr does not behave this way, nor does the version in xpg4. This must be fixed.
Updated by Chris Love over 11 years ago
It looks like the FreeBSD version of tr isn't correctly interpreting the [0-9] as a range, so the leading '[' and trailing ']' get added to the character set in addition to the digits 0-9 prior to inverting it. Instead it's falling into some code that would handle sequences of characters: [x*n]. The relevant functions are setup(), next() and bracket().
There's a related case which fails with this version of tr but works with legacy Solaris tr (at least for LANG=C): tr -dc '[0123456789]'
Updated by Chris Love over 11 years ago
- File trtests.sh trtests.sh added
Additional testing with the attached script suggests a lot of other issues lurking.
Updated by John Grafton over 11 years ago
The current version of tr uses a splay tree algorithm to keep track of character modifications. Unfortunately, in porting the FreeBSD version to Illumos, the splay tree algorithm began to behave erratically. I opted to port the significantly simpler NetBSD version of tr and with minor edits, it passes all of the above tests posted by Chris.
I submitted a webrev with the tr modifications.
Updated by Rich Lowe over 11 years ago
I looked into debugging this this morning, as a user complained about netbeans.
It looks like the precedence of the operations in cset_in_hard was screwed up while gagging the compiler.
(diff from original FreeBSD code):
@@ -153,11 +150,11 @@
struct csclass *csc;
for (csc = cs->cs_classes; csc != NULL; csc = csc->csc_next)
- if (csc->csc_invert ^ iswctype(ch, csc->csc_type) != 0)
+ if ((csc->csc_invert ^ iswctype(ch, csc->csc_type)) != 0)
return (cs->cs_invert ^ true);
if (cs->cs_root != NULL) {
cs->cs_root = cset_splay(cs->cs_root, ch);
- return (cs->cs_invert ^ cset_rangecmp(cs->cs_root, ch) == 0);
+ return ((cs->cs_invert ^ cset_rangecmp(cs->cs_root, ch)) == 0);
}
return (cs->cs_invert ^ false);
}
If memory serves, == and != bind more tightly than ^, so the compiler silencing parens here change behaviour for the worse. Placing the parens along precedence lines appears to cause this to behave in a way I believe to be correct:
diff -r e019ccb2cb3b usr/src/cmd/tr/cset.c
--- a/usr/src/cmd/tr/cset.c Fri Sep 03 09:10:51 2010 -0400
+++ b/usr/src/cmd/tr/cset.c Tue Sep 07 08:27:15 2010 -0400
@@ -150,11 +150,11 @@
struct csclass *csc;
for (csc = cs->cs_classes; csc != NULL; csc = csc->csc_next)
- if ((csc->csc_invert ^ iswctype(ch, csc->csc_type)) != 0)
+ if (csc->csc_invert ^ (iswctype(ch, csc->csc_type) != 0))
return (cs->cs_invert ^ true);
if (cs->cs_root != NULL) {
cs->cs_root = cset_splay(cs->cs_root, ch);
- return ((cs->cs_invert ^ cset_rangecmp(cs->cs_root, ch)) == 0);
+ return (cs->cs_invert ^ (cset_rangecmp(cs->cs_root, ch) == 0));
}
return (cs->cs_invert ^ false);
}
% prtconf | grep Memory| ./tr -dc '[0123456789]' 5375%
Updated by Garrett D'Amore over 11 years ago
So I screwed it up. (This is why I hate using a coding style where you have to recall specific operator precedence -- the ()'s should always have been explicit in the first place, IMO!)
Thanks for figuring this out Rich.
Lets get this integrated asap.
Updated by Rich Lowe over 11 years ago
case 'g' in trtests echo "[[[[[[[]]]]]]" | tr -d '[=]=]'
is a separate bug, in bracket().
When it calls strchr to find the end of the class it finds the content, feels it's looking at [=], and that it isn't an equivalence class.
I think that's it being off by one when trying to skip over stuff:
case '=': /* "[=equiv=]" */
if ((p = strchr(s->str + 2, ']')) == NULL)
return (0);
if (*(p - 1) != '=' || p - s->str < 4)
goto repeat;
s->str += 2;
genequiv(s);
return (1);
s->str will be '[=]=]', so I think we'd want +3
on line 2 here so that we skip the content character, too. This issue seems to be in the original FreeBSD code, also, does it work there?
Updated by Rich Lowe over 11 years ago
- Status changed from New to In Progress
- Assignee set to Rich Lowe
Apparently I'm taking this...
Updated by John Grafton over 11 years ago
No, the 'g' test fails in FreeBSD also.
Updated by Rich Lowe over 11 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Should be fixed in r13186 commit:c777be6727c6