Bug #7962

closed

strxfrm() fails for certain characters

Added by Yuri Pankov about 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: locale - data and messages
Start date: 2017-03-11
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:
External Bug:

Description

We fail to create correct entries for these:

<LATIN_CAPITAL_LETTER_A_WITH_RING_ABOVE>                   <X2700>;"<X05><X99>";"<X8F><X05>";"<A><COMBINING_RING_ABOVE>" 
<ANGSTROM_SIGN>                                            <X2700>;"<X05><X99>";"<X8F><X05>";"<A><COMBINING_RING_ABOVE>" 


Files

strxfrm-7962.c (487 Bytes), Yuri Pankov, 2017-03-11 05:55 PM
Actions #1

Updated by Yuri Pankov about 6 years ago

$ ./strxfrm-7962
strxfrm-7962: strxfrm() failed for wc=0xc5 char='Å': Invalid argument
strxfrm-7962: strxfrm() failed for wc=0x212b char='Å': Invalid argument
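The attached strxfrm-7962.c is not reproduced on this page. As a rough sketch of how such a check could be written (this is not the attachment itself; the helper name `xfrm_wc` and the buffer sizes are made up for illustration), one can convert a single wide character to a multibyte string and look at errno after strxfrm():

```c
#include <errno.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not taken from strxfrm-7962.c.
 * Converts one wide character to a multibyte string in the current
 * locale and runs strxfrm() on it.
 * Returns 0 on success, or an errno value on failure.
 */
int
xfrm_wc(wchar_t wc)
{
	wchar_t ws[2] = { wc, L'\0' };
	char mb[MB_LEN_MAX + 1];	/* one character plus terminator */
	char out[256];			/* arbitrary size, ample for one char */

	if (wcstombs(mb, ws, sizeof (mb)) == (size_t)-1)
		return (EILSEQ);

	/*
	 * strxfrm() has no error return value, so clear errno before
	 * the call and inspect it afterwards.
	 */
	errno = 0;
	(void) strxfrm(out, mb, sizeof (out));
	return (errno);
}
```

Assuming a UTF-8 locale built from the affected collation data has been selected with setlocale(LC_ALL, ""), and that wchar_t values are Unicode code points, `xfrm_wc(0xc5)` and `xfrm_wc(0x212b)` would be expected to return EINVAL, matching the output above.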
Actions #2

Updated by Yuri Pankov almost 6 years ago

Took a look at this.

Those are the only entries whose final weight is defined as <char><char>, which we treat as a substitution. The problem, however, is that while the priority for <A> is well defined, <COMBINING_RING_ABOVE> doesn't show up anywhere in the collation order list. Not sure why these two use what looks like the decomposed form of <LATIN_CAPITAL_LETTER_A_WITH_RING_ABOVE>.
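To illustrate the failure mode described here, below is a toy model only. It is not the illumos localedef or libc code, and every name and priority value in it is hypothetical. It treats a weight as a list of symbols to substitute and fails with EINVAL as soon as one symbol has no priority in the order list:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy collation order list: only these symbols have a priority. */
struct toy_sym {
	const char *name;
	int pri;
};

static const struct toy_sym toy_order[] = {
	{ "A", 100 },	/* hypothetical priority values */
	{ "B", 101 },
};

/* Look up a symbol's priority; returns -1 if it is not in the list. */
static int
toy_priority(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof (toy_order) / sizeof (toy_order[0]); i++) {
		if (strcmp(toy_order[i].name, name) == 0)
			return (toy_order[i].pri);
	}
	return (-1);
}

/*
 * Resolve a substitution weight (a list of n symbols) into priorities.
 * Returns 0 on success, or EINVAL if any symbol has no priority.
 */
int
toy_resolve(const char *const *syms, size_t n, int *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int pri = toy_priority(syms[i]);

		if (pri < 0)
			return (EINVAL);
		out[i] = pri;
	}
	return (0);
}
```

In this model, resolving the pair "A", "COMBINING_RING_ABOVE" returns EINVAL because the second symbol is missing from the order list, while a pair of listed symbols resolves cleanly.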

IMO, this is a bug in neither our localedef nor our *xfrm() functions, and it should be reported upstream; it is probably an artifact of converting the CLDR data to POSIX format.

Actions #4

Updated by Yuri Pankov almost 6 years ago

  • Status changed from New to In Progress
  • Assignee set to Yuri Pankov
  • % Done changed from 0 to 50
  • Difficulty changed from Medium to Bite-size
  • Tags deleted (needs-triage)

Proposing a temporary fix until the upstream data is updated.

Actions #5

Updated by Electric Monk almost 6 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 50 to 100

commit  f862e02cb8d597e430ef5067be483718a65c3370
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Date:   2017-06-28T15:21:31.000Z

    7962 strxfrm() fails for certain characters
    Reviewed by: Toomas Soome <tsoome@me.com>
    Reviewed by: Igor Kozhukhov <igor@dilos.org>
    Reviewed by: Garrett D'Amore <garrett@damore.org>
    Approved by: Dan McDonald <danmcd@joyent.com>

Actions #6

Updated by Yuri Pankov over 5 years ago

Just for the record, this didn't make it into CLDR v32.
