towlower/towupper still missing some mappings
While reviewing the fix for issue 992, I compared the compiled upper/lower mappings created by that fix to the list of characters in the Unicode case folding data file, and several mappings appear to be missing.
Just spot checking, I noticed these missing:
COPTIC CAPITAL * | COPTIC SMALL *
DESERET CAPITAL * | DESERET SMALL *
GLAGOLITIC CAPITAL * | GLAGOLITIC SMALL *
Here are some handy descriptions of these:
It has been pointed out that case folding is a different thing from toupper/tolower mappings, and I know that. However, they are very similar (case folding is a superset of toupper/tolower) and the above data file is a very handy place to see what all the toupper/tolower mappings are for Unicode.
None of the above are likely to be noticed, but nonetheless, it would be nice if someone could (a) do a more careful comparison of the case folding data with our toupper/tolower results, i.e. with a test program, and then (b) add locale data files to fill in the gaps listed above, plus any others found via comparison.
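A rough sketch of what the test program in (a) might look like follows. It assumes the case folding data uses the semicolon-separated layout "code; status; mapping; # name", that only the simple one-to-one entries (status C or S) are of interest, and that wchar_t values are Unicode code points in the active locale. The function names parse_fold_line and count_mismatches are made up for illustration. Since case folding and towlower are not identical, anything it reports is a candidate for review rather than a confirmed bug.

```c
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Parse one line of case folding data in the assumed layout:
 *   <code>; <status>; <mapping>; # <name>
 * Returns 1 and fills *from / *to for simple one-to-one entries
 * (status C or S). Returns 0 for comments, blank lines, and the
 * full (F) and Turkic (T) entries, which are skipped.
 */
int
parse_fold_line(const char *line, unsigned long *from, unsigned long *to)
{
	char status;

	if (sscanf(line, "%lx; %c; %lx", from, &status, to) != 3)
		return (0);
	return (status == 'C' || status == 'S');
}

/*
 * Read folding data from fp and report every simple entry where
 * towlower() disagrees with the folded value. Returns the number
 * of disagreements. The caller is expected to have selected a
 * UTF-8 locale with setlocale(LC_CTYPE, ...) beforehand.
 */
int
count_mismatches(FILE *fp)
{
	char line[512];
	unsigned long from, to, got;
	int bad = 0;

	while (fgets(line, sizeof (line), fp) != NULL) {
		if (!parse_fold_line(line, &from, &to))
			continue;
		got = (unsigned long)towlower((wint_t)from);
		if (got != to) {
			(void) printf("U+%04lX: folds to U+%04lX, "
			    "towlower gives U+%04lX\n", from, to, got);
			bad++;
		}
	}
	return (bad);
}
```

A driver would call setlocale(LC_CTYPE, "") with a UTF-8 locale in the environment, open the data file, and pass it to count_mismatches(). A similar loop using towupper() on the folded value could cover the other direction, with the caveat that several code points fold to the same lowercase letter and so will not round-trip.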
Updated by Garrett D'Amore about 10 years ago
- Status changed from New to Closed
So, having looked over these, the only ones missing are for alphabets that are not currently in use. I'm not, for example, interested in figuring out how to support Egyptian or the experimental Deseret alphabet. I'm going to punt on this -- because if these alphabets were important, then there would be CLDR data for them and we could import them correctly. As there isn't, they don't matter.
We won't fix. So closing.