Feature #11660

common.UTF-8.src should be compiled from CLDR data

Added by Yuri Pankov about 2 months ago. Updated about 1 month ago.

Status: Closed
Priority: Normal
Assignee:
Category: locale - data and messages
Start date:
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:

Description

Currently, we use a common.UTF-8.src that was hand-crafted in FreeBSD. While that was a good start, its coverage is severely lacking; it should instead be compiled from the CLDR data we already use to create the POSIX locale sources. While there is no 1:1 translation from UnicodeData.txt to a POSIX charmap, this will provide a much more complete solution.
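To illustrate the general idea (this is not the tooling used in the actual change), here is a minimal Python sketch that reads UnicodeData.txt-style records and groups code points into POSIX character classes by general category. The `CATEGORY_TO_CLASSES` table and the `classify` helper are hypothetical names, and the mapping itself is an assumption chosen for illustration; the lack of a 1:1 translation shows up in the categories the table leaves unmapped.

```python
from collections import defaultdict

# Hypothetical, deliberately incomplete mapping from Unicode general
# categories to POSIX character classes. This is an assumption for
# illustration only, not the mapping used by the real locale tooling.
CATEGORY_TO_CLASSES = {
    "Lu": ("upper", "alpha"),
    "Ll": ("lower", "alpha"),
    "Lo": ("alpha",),
    "Zs": ("space",),
    "Po": ("punct",),
}


def classify(lines):
    """Group code points by POSIX class from UnicodeData.txt-style lines.

    Each record is semicolon-separated; field 0 is the code point in hex
    and field 2 is the general category. Categories missing from the
    table above are left unclassified in this sketch.
    """
    classes = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        codepoint = int(fields[0], 16)
        category = fields[2]
        for cls in CATEGORY_TO_CLASSES.get(category, ()):
            classes[cls].append(codepoint)
    return dict(classes)


# A few records in UnicodeData.txt format, used as sample input.
SAMPLE = [
    "0020;SPACE;Zs;0;WS;;;;;N;;;;;",
    "0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
]

result = classify(SAMPLE)
for cls in sorted(result):
    # Print each class with its members in <UXXXX> symbolic form.
    print(cls, " ".join(f"<U{cp:04X}>" for cp in result[cls]))
```

A real generator would also need to handle code point ranges, case mappings, and classes that do not correspond to a single general category, all of which this sketch omits.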


Files

ctype.txt (648 KB): diff of character classes against previous definitions. Yuri Pankov, 2019-09-16 09:05 PM

History

#1

Updated by Yuri Pankov about 1 month ago

To test this, I'm using the C.UTF-8 locale with this change integrated as the default system locale, and I'm not seeing any issues. I also ran the available test suites, especially libc-tests.

#2

Updated by Yuri Pankov about 1 month ago

#3

Updated by Electric Monk about 1 month ago

  • Status changed from In Progress to Closed
  • % Done changed from 50 to 100

commit  080a98956c44d42821b98e1d0f2ce825925e38db
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Date:   2019-09-15T12:19:51.000Z

    11660 common.UTF-8.src should be compiled from CLDR data
    Reviewed by: Richard Lowe <richlowe@richlowe.net>
    Approved by: Robert Mustacchi <rm@fingolfin.org>
