Bug #1314

localedef/ctype.sh requires sed -E for no reason

Added by Albert Lee over 2 years ago. Updated over 2 years ago.

Status: Resolved
Start date: 2011-08-01
Priority: High
Due date:
Assignee: Albert Lee
% Done: 100%
Category: cmd - userland programs
Spent time: -
Target version: -
Difficulty: Bite-size
Tags:

Description

The introduction of ctype.sh in cset 13399:a1d28d03839f added a dependency on illumos sed's -E option for extended regex (inherited from FreeBSD). -E was not previously supported by Solaris sed, and this has led to an additional bootstrapping step on some platforms (announced as a "flag day"). The use of -E seems to have been purely a stylistic decision, as the same sed commands can use basic RE with only minor syntax changes (brackets for char matches, backslash before parens for substrings). The script itself also looks like it could be simplified.
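As a rough illustration of the kind of rewrite meant here, the sketch below pairs a hypothetical extended-RE expression (in a comment only) with the basic-RE expression quoted later in this thread. The function name append_continuation and the sample input are assumptions made for the example, and the ERE form is a guess at what such an expression might look like, not a quote from ctype.sh.

```shell
# Hypothetical ERE form (would need sed -E), shown only for comparison:
#   sed -E 's,(>|\))$,\1;/,'
#
# Basic-RE form as quoted in this thread: the alternation becomes a
# bracket expression and the grouping parens get backslashes, so plain
# POSIX sed is enough.
append_continuation() {
    sed 's,\([>)]\)$,\1;/,'
}

# Illustrative input: lines ending in ">" or ")" get ";/" appended,
# anything else passes through unchanged.
printf '%s\n' '<A>' '(<a>,<A>)' 'upper' | append_continuation
```

Lines that do not end in ">" or ")" are left alone, which is why the bare keyword line in the sample passes through untouched.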

ctype.sh.diff (767 Bytes) Yuri Pankov, 08/03/2011 12:27 pm

ctype.sh.diff (770 Bytes) Yuri Pankov, 08/03/2011 12:42 pm

History

Updated by Yuri Pankov over 2 years ago

My fault.. Diff attached, no differences in the output, not sure about "could be simplified" though..

Updated by Yuri Pankov over 2 years ago

Really "fixed" version...

Updated by Albert Lee over 2 years ago

Thanks, Yuri. Do you want to start an RTI?

Updated by Albert Lee over 2 years ago

As for "simplifying" it, I really just don't understand the overall transformation being done here. The use of the buffer swap and purpose of adding ";/" and removing it in the first expression, and the "s,\([>)]\)$,\1;/," in the second is still mysterious to me. I'm looking at convert_map.pl which does some related parsing...

Updated by Yuri Pankov over 2 years ago

The first sed invocation is for a corner case (currently found only in hy_AM.UTF-8.src): "alpha <ARMENIAN_MODIFIER_LETTER_LEFT_HALF_RING>". As sed can't match both range patterns on the same line, we make it look like:

alpha   <ARMENIAN_MODIFIER_LETTER_LEFT_HALF_RING>;/
      <ARMENIAN_MODIFIER_LETTER_LEFT_HALF_RING>

so it could be parsed by the second sed call...

"s,\([>)]\)$,\1;/," makes all definitions look like <......>;/, so the data gathered from all files looks continuous to localedef...

The last sed call adds the keyword (upper, lower, etc.) to the first line and removes ;/ from the last one, so we get something like the following:

upper <DEFINITION>;/
      <DEFINITION>;/
[....]
      <DEFINITION>;/
      <DEFINITION>
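The final step described above can be sketched as follows. This is a minimal illustration and not the code from ctype.sh: the function name emit_class, the six-space indent, and the sample definitions are all assumptions, and the keyword is assumed to contain no characters special to sed.

```shell
# Hypothetical helper: reads definitions that already end in ";/" on
# stdin and prints them as one continued block for the given keyword.
#   $1 - class keyword (e.g. upper, lower)
emit_class() {
    sed -e "1s,^,$1 ," \
        -e '2,$s,^,      ,' \
        -e '$s,;/$,,'
}

# Illustrative input only.
printf '%s\n' '<A>;/' '<B>;/' '<C>;/' | emit_class upper
```

The first expression prefixes the keyword to line 1, the second indents the continuation lines, and the third strips the trailing ";/" from the last line so the block terminates cleanly.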

Updated by Garrett D'Amore over 2 years ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100
  • Tags deleted (needs-triage)

Fixed in:

changeset: 13419:71db23f03404
tag: tip
user: Yuri Pankov <>
date: Thu Aug 04 17:38:16 2011 -0700
description:
1314 localedef/ctype.sh requires sed -E for no reason
Reviewed by: Albert Lee <>
Reviewed by: Garrett D'Amore <>
Approved by: Garrett D'Amore <>

Updated by Yuri Pankov over 2 years ago

Thanks to Albert for looking at this and sorry for breaking build for such a long time...
