Bug #330

isprint() returns false for "legacy" code, results in bad prompt

Added by Garrett D'Amore about 10 years ago. Updated about 10 years ago.

Status: Resolved
Priority: High
Category: lib - userland libraries
Start date: 2010-10-10
Due date:
% Done: 0%
Estimated time:
Difficulty:
Tags:
Gerrit CR:

Description

After my mondo locale push, I noticed that sftp and several other programs were printing \076 (the octal escape for '>') at the end of the prompt instead of '>'. Basically, all libtecla consumers are affected.

Upon analysis, the problem is that isprint() returns false for a variety of characters in non-C locales.
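
A quick way to see whether a given locale is affected is to count the ASCII punctuation characters that isprint() rejects. The following is a minimal sketch and is not the attached istype.c; the function name count_unprintable_punct is made up for illustration, and the character list is the one used in the fix.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* The 32 ASCII characters that ispunct() must accept in the POSIX locale. */
static const char ascii_punct[] = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

/*
 * Hypothetical helper: switch LC_CTYPE to the named locale and count the
 * ASCII punctuation characters for which isprint() returns false.
 * Returns -1 if the locale cannot be selected.
 */
static int
count_unprintable_punct(const char *locale_name)
{
	const char *p;
	int bad = 0;

	if (setlocale(LC_CTYPE, locale_name) == NULL)
		return (-1);

	for (p = ascii_punct; *p != '\0'; p++) {
		if (!isprint((unsigned char)*p))
			bad++;
	}
	return (bad);
}
```

Calling count_unprintable_punct("") uses the locale from the environment. A nonzero result means some of these characters would be shown as escapes by code that trusts isprint(). Whether a particular name such as a UTF-8 locale is installed is system dependent.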

The problem stems from the fact that certain legacy code uses a macro version of isprint() that relies on these characters having the ispunct bit set.
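
To illustrate the dependency, here is a toy model of a table-driven isprint() macro. It is only a sketch: the DEMO_* flag names, their values, and the table layout are invented for this example and are not the real header definitions, and the choice of "<=>" as the unclassified symbols is an assumption.

```c
#include <string.h>

/* Hypothetical classification bits, one per character class. */
enum {
	DEMO_U = 0x01,	/* upper case */
	DEMO_L = 0x02,	/* lower case */
	DEMO_N = 0x04,	/* digit */
	DEMO_P = 0x10,	/* punctuation */
	DEMO_B = 0x40	/* blank */
};

/* Legacy-style macro: printable only if one of the listed bits is set. */
#define	DEMO_ISPRINT(tbl, c) \
	(((tbl)[(unsigned char)(c)] & \
	(DEMO_U | DEMO_L | DEMO_N | DEMO_P | DEMO_B)) != 0)

/*
 * Fill a 256-entry table. When symbols_are_punct is zero, the symbol
 * characters get no bit at all, mimicking locale data that does not
 * classify them as punctuation.
 */
static void
demo_fill_table(unsigned char tbl[256], int symbols_are_punct)
{
	const char *syms = "<=>";	/* illustrative subset only */
	int c;

	memset(tbl, 0, 256);
	for (c = 'A'; c <= 'Z'; c++)
		tbl[c] = DEMO_U;
	for (c = 'a'; c <= 'z'; c++)
		tbl[c] = DEMO_L;
	for (c = '0'; c <= '9'; c++)
		tbl[c] = DEMO_N;
	tbl[' '] = DEMO_B;
	if (symbols_are_punct) {
		for (; *syms != '\0'; syms++)
			tbl[(unsigned char)*syms] = DEMO_P;
	}
}
```

With a table built by demo_fill_table(tbl, 0), DEMO_ISPRINT(tbl, '>') is false even though '>' is obviously printable, which is the kind of result that makes a caller emit \076 instead of the character.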

The UTF-8 locale data doesn't set that for these characters because, after all, they are not punctuation characters but other kinds of symbols. (Although notably the POSIX standard insists that ispunct() return true for them in the POSIX locale.)

Note that programs built with -D_XPG4 or that use the functional versions (by #undef ispunct) get the right value.
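
For reference, standard C allows a program to remove a library macro with #undef (or to parenthesize the name) so that the real function is called. A minimal sketch, with demo_functional_isprint as a made-up wrapper name:

```c
#include <ctype.h>

/*
 * Remove any macro definitions so that the calls below reach the
 * library functions rather than a table-lookup macro.
 */
#undef isprint
#undef ispunct

/* Hypothetical wrapper around the functional isprint(). */
static int
demo_functional_isprint(int c)
{
	return (isprint((unsigned char)c) != 0);
}

/* Same effect without #undef: parentheses suppress macro expansion. */
static int
demo_parenthesized_ispunct(int c)
{
	return ((ispunct)((unsigned char)c) != 0);
}
```

This only shows the mechanism for bypassing the macros. It does not by itself say what either version returns in a given locale.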

The simple solution is to force these characters to be punctuation in the same way that we force certain other characters to be digits or other type data.
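
In isolation, the forcing step might look like the sketch below. DEMO_ISPUNCT and demo_force_punct are placeholder names for this example, not the localedef identifiers. Because strchr() also matches the terminating NUL of the search string, the sketch excludes 0 and non-ASCII values explicitly.

```c
#include <string.h>

#define	DEMO_ISPUNCT	0x10	/* hypothetical flag bit */

/* ASCII characters that must always be classified as punctuation. */
static const char demo_ascii_punct[] = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

/*
 * Return ctype with the punctuation bit forced on when wc is one of the
 * ASCII punctuation characters; otherwise return ctype unchanged.
 */
static unsigned int
demo_force_punct(int wc, unsigned int ctype)
{
	if (wc > 0 && wc < 128 && strchr(demo_ascii_punct, wc) != NULL)
		ctype |= DEMO_ISPUNCT;
	return (ctype);
}
```

For example, demo_force_punct('>', 0) yields DEMO_ISPUNCT regardless of what the locale source says about '>', while letters and digits pass through untouched.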

At some point, it would be good if libtecla were compiled with the XPG4 versions of the macros as well, so that it gets more accurate type data. I'll file that as a separate bug.


Files

istype.c (1.94 KB), Garrett D'Amore, 2010-10-10 04:56 PM

History

#1

Updated by Garrett D'Amore about 10 years ago

Here's the fix, and a test program is attached.

diff -r feb49cca530d usr/src/cmd/localedef/ctype.c
--- a/usr/src/cmd/localedef/ctype.c    Sun Oct 10 08:20:45 2010 -0700
+++ b/usr/src/cmd/localedef/ctype.c    Sun Oct 10 16:53:28 2010 -0700
@@ -244,6 +244,20 @@
                 ctn->ctype |= _ISXDIGIT;
             if (strchr(" \t", (char)wc))
                 ctn->ctype |= _ISBLANK;
+
+            /*
+             * Technically these settings are only
+             * required for the C locale.  However, it
+             * turns out that because of the historical
+             * version of isprint(), we need them for all
+             * locales as well.  Note that these are not
+             * necessarily valid punctuation characters in
+             * the current language, but ispunct() needs
+             * to return TRUE for them.
+             */
+            if (strchr("!\"'#$%&()*+,-./:;<=>?@[\\]^_`{|}~",
+                (char)wc))
+                ctn->ctype |= _ISPUNCT;
         }

         /*

#2

Updated by Garrett D'Amore about 10 years ago

#3

Updated by Garrett D'Amore about 10 years ago

  • Category set to lib - userland libraries
#4

Updated by Garrett D'Amore about 10 years ago

  • Status changed from New to Resolved

Fix integrated.

#5

Updated by Albert Lee about 10 years ago

  • Status changed from Resolved to Closed
#6

Updated by Garrett D'Amore about 10 years ago

  • Status changed from Closed to Resolved

This needs to stay resolved, not closed.
