Bug #14000

closed

lorder: replace sequence a-z by [:lower:]

Added by Toomas Soome 4 months ago. Updated 3 months ago.

Status: Closed
Priority: Normal
Assignee:
Category: cmd - userland programs
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:

Description

$ lorder OBJ/i386-sunos5-gcc/macro.o
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
0000002288 t _ZL16init_arch_macrosv
0000002706 t _ZL16init_mach_macrosv
0000009220 t _ZL24expand_value_with_daemonP5_NameP9_PropertyP7_String7BooleanS5_
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
0000002288 t _ZL16init_arch_macrosv
0000002706 t _ZL16init_mach_macrosv
0000009220 t _ZL24expand_value_with_daemonP5_NameP9_PropertyP7_String7BooleanS5_
tsoome@beastie:/code/schilytools-code/schily-2021-07-29/sunpro/Make/lib/mksh/src$ env LC_ALL=C  lorder OBJ/i386-sunos5-gcc/macro.o
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
tsoome@beastie:/code/schilytools-code/schily-2021-07-29/sunpro/Make/lib/mksh/src$ 

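Replacing the range a-z with the character class [:lower:] makes sed match lowercase characters regardless of locale. The problem with the [a-z] range is that in the et_EE locale the collation sequence is .. r s z t u .., so letters that collate after z, such as t, fall outside the range. This is why the lines with symbol type t appear in the output above.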
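
To illustrate why the symbol lines leak through, here is a minimal sketch of a filter that drops nm-style lines whose symbol type is a lowercase letter. It is not the actual lorder source: the function name drop_local_symbols, the sed expression, and the sample "T main" line are assumptions made for illustration only. Because it uses [[:lower:]] instead of a-z, it should not depend on the collation order of the current locale.

```shell
# Hypothetical filter, NOT the real lorder code: delete nm-style lines
# ("address type name") whose type field is a lowercase letter.
# [[:xdigit:]] and [[:lower:]] are character classes, so matching does
# not depend on where z sorts in the current locale.
drop_local_symbols() {
    sed '/^[[:xdigit:]]* [[:lower:]] /d'
}

# The "t" line is dropped, the (made-up) "T" line is kept.
printf '%s\n' \
    '0000002288 t _ZL16init_arch_macrosv' \
    '0000000000 T main' | drop_local_symbols
# prints: 0000000000 T main
```

With a range such as [a-z] in place of [[:lower:]], the same filter would keep the "t" line in a locale where t collates after z.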

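The difference between a character range and a character class comes down to collation order, which should normally be controlled by LC_COLLATE (unless LC_ALL is set). For some reason, sed does not seem to honor LC_COLLATE when ordering a character range. However, setting LC_ALL does work. Note that LC_ALL is already set in the session below, and LC_ALL takes precedence over LC_COLLATE, which may explain why the LC_COLLATE override has no effect: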

root@beastie:~# locale
LANG=et_EE.UTF-8
LC_CTYPE=et_EE.UTF-8
LC_NUMERIC="et_EE.UTF-8" 
LC_TIME="et_EE.UTF-8" 
LC_COLLATE="et_EE.UTF-8" 
LC_MONETARY="et_EE.UTF-8" 
LC_MESSAGES="et_EE.UTF-8" 
LC_ALL=et_EE.UTF-8
root@beastie:~# echo absztuv | sed 's/[a-z]//g'
tuv
root@beastie:~# echo absztuv | env LC_COLLATE=C sed 's/[a-z]//g'
tuv
root@beastie:~# echo absztuv | env LC_COLLATE=C.UTF-8 sed 's/[a-z]//g'
tuv
root@beastie:~# echo absztuv | env LC_ALL=C.UTF-8 sed 's/[a-z]//g'

root@beastie:~#
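
For comparison, the same test string can be run through the character class form. This is a small sketch, and the helper name strip_lower is an assumption for illustration. Since all of a, b, s, z, t, u and v are lowercase letters in any locale, no locale override should be needed to remove them.

```shell
# Hypothetical helper: remove every lowercase letter from its argument
# using the [:lower:] character class instead of the a-z range.
strip_lower() {
    printf '%s\n' "$1" | sed 's/[[:lower:]]//g'
}

# All seven letters are lowercase, so only an empty line is printed.
strip_lower absztuv
```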

And testing with patched lorder:

tsoome@beastie:/code/schilytools-code/schily-2021-07-29/sunpro/Make/lib/mksh/src$ locale
LANG=et_EE.UTF-8
LC_CTYPE=et_EE.UTF-8
LC_NUMERIC="et_EE.UTF-8" 
LC_TIME="et_EE.UTF-8" 
LC_COLLATE="et_EE.UTF-8" 
LC_MONETARY="et_EE.UTF-8" 
LC_MESSAGES="et_EE.UTF-8" 
LC_ALL=et_EE.UTF-8
tsoome@beastie:/code/schilytools-code/schily-2021-07-29/sunpro/Make/lib/mksh/src$ lorder OBJ/i386-sunos5-gcc/macro.o 
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
OBJ/i386-sunos5-gcc/macro.o OBJ/i386-sunos5-gcc/macro.o
tsoome@beastie:/code/schilytools-code/schily-2021-07-29/sunpro/Make/lib/mksh/src$ 

Actions #1

Updated by Toomas Soome 4 months ago

  • Description updated (diff)
Actions #2

Updated by Toomas Soome 4 months ago

  • Description updated (diff)
Actions #3

Updated by Toomas Soome 4 months ago

  • Subject changed from lorder: need to use C locale to lorder: replace sequence a-z by [:lower:]
Actions #4

Updated by Toomas Soome 4 months ago

  • Description updated (diff)
Actions #5

Updated by Toomas Soome 4 months ago

  • Description updated (diff)
Actions #6

Updated by Toomas Soome 3 months ago

  • Description updated (diff)
Actions #7

Updated by Electric Monk 3 months ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

git commit 1b9d0d8668d6fb8bbf6fd07dfeac92665c32895e

commit  1b9d0d8668d6fb8bbf6fd07dfeac92665c32895e
Author: Toomas Soome <tsoome@me.com>
Date:   2021-08-27T08:34:48.000Z

    14000 lorder: replace sequence a-z by [:lower:]
    Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
    Reviewed by: Robert Mustacchi <rm+illumos@fingolfin.org>
