Bug #1030

strftime "%x" should generate something strptime can parse

Added by Gordon Ross over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: locale - data and messages
Start date: 2011-05-13
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

LANG=en_US.UTF-8 date '+%x'
has wrong output in the first 9 days of the month:

$ date '+%x'
05/ 6/11

(Note the space before the numeral six)

This breaks programs. For example, it prevents Evolution from
creating a calendar appointment.

The problem is that strftime generates something that strptime cannot parse.


Files

strptime.c (1.16 KB) Gordon Ross, 2011-05-13 08:55 PM
illumos1030.patch (578 Bytes) Gordon Ross, 2011-06-12 04:48 PM

History

#1

Updated by Gordon Ross over 8 years ago

The obvious fix for this would be:

diff -r 52f259cd114f usr/src/cmd/localedef/data/en_US.UTF-8.src
--- a/usr/src/cmd/localedef/data/en_US.UTF-8.src    Mon May 09 22:31:13 2011 -0400
+++ b/usr/src/cmd/localedef/data/en_US.UTF-8.src    Fri May 13 12:28:02 2011 -0400
@@ -7117,7 +7117,7 @@
         "<N><o><v><e><m><b><e><r>";/
         "<D><e><c><e><m><b><e><r>" 

-d_fmt    "%m<SOLIDUS>%e<SOLIDUS>%y" 
+d_fmt    "%m<SOLIDUS>%d<SOLIDUS>%y" 

 t_fmt    "%I:%M:%S %p" 

However, several localedef data files appear to use %e where %d
might have been expected. These all need further review.
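
To make the difference between %e and %d concrete, here is a minimal sketch that formats the same date both ways. It spells the format out by hand rather than relying on the locale's d_fmt, so it behaves the same in any locale. format_date() is a hypothetical helper written for this illustration, not something from libc or from the attached files.

```c
#include <string.h>
#include <time.h>

/*
 * Format year/month/day with the given strftime format.
 * format_date() is a hypothetical helper for illustration only.
 * Returns the number of bytes written, or 0 on failure.
 */
static size_t
format_date(char *buf, size_t len, const char *fmt, int year, int mon,
    int mday)
{
	struct tm tm;

	memset(&tm, 0, sizeof (tm));
	tm.tm_year = year - 1900;	/* years since 1900 */
	tm.tm_mon = mon - 1;		/* months are 0-based */
	tm.tm_mday = mday;
	return (strftime(buf, len, fmt, &tm));
}
```

Calling format_date(buf, sizeof (buf), "%m/%e/%y", 2011, 5, 6) yields "05/ 6/11", while the same call with "%m/%d/%y" yields "05/06/11". Only the %e form puts a space inside the string.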

#2

Updated by Garrett D'Amore over 8 years ago

I would argue that it is very very wrong for Evolution to be parsing this format, as it is highly locale sensitive.

However, I'm in agreement that %e is probably wrong here... we got this data from Unicode.org, and it seems many locales suffer a similar problem.

I'm not sure what the best resolution here is. We should check the Unicode 6.0 drop and see if it is any better.

#3

Updated by Gordon Ross over 8 years ago

It looks like in this case, the application is using the pair of functions:
strftime(3c), strptime(3c), to implement an editable "time" field, and
quite reasonably expects strptime to be able to parse strftime results.
So maybe it's a strptime bug. Not sure which.
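
The round trip such an application depends on can be sketched as follows. This is not the attached strptime.c, only an illustration of the same idea; roundtrip_ok() is a hypothetical name, and the _XOPEN_SOURCE define is an assumption about what some systems need in order to declare strptime().

```c
#define	_XOPEN_SOURCE 700	/* assumed necessary to expose strptime() */
#include <string.h>
#include <time.h>

/*
 * Format the given date with fmt, then parse the text back using the
 * same fmt.  Returns 1 if strptime() consumes the whole string and
 * recovers the same month and day, 0 otherwise.
 * roundtrip_ok() is a hypothetical helper for illustration only.
 */
static int
roundtrip_ok(const char *fmt, int year, int mon, int mday)
{
	struct tm in, out;
	char buf[64];
	const char *end;

	memset(&in, 0, sizeof (in));
	in.tm_year = year - 1900;
	in.tm_mon = mon - 1;
	in.tm_mday = mday;
	if (strftime(buf, sizeof (buf), fmt, &in) == 0)
		return (0);

	memset(&out, 0, sizeof (out));
	end = strptime(buf, fmt, &out);
	if (end == NULL || *end != '\0')
		return (0);
	return (out.tm_mon == in.tm_mon && out.tm_mday == in.tm_mday);
}
```

With "%m/%d/%y" the round trip should succeed for any day of the month. With "%m/%e/%y" and a single-digit day, the result depends on how the platform's strptime treats the leading space, which is exactly the question in this bug.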

#4

Updated by Gordon Ross over 8 years ago

Attached a test program to demonstrate: strptime.c

#5

Updated by Gordon Ross over 8 years ago

  • Subject changed from date '+%x' has wrong output to strftime "%x" should generate something strptime can parse

#6

Updated by Garrett D'Amore over 8 years ago

So the problem appears to be in strptime's implementation of %e: it does not handle a leading space.
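
For illustration, a day-of-month parser that tolerates the leading space might look like the sketch below. It is a hypothetical stand-in, not the libc code and not the eventual patch; parse_mday() and its return convention are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Parse a day of month as strftime's %e writes it: optional leading
 * white space, then one or two digits in the range 1-31.
 * On success, stores the day in *mday and returns a pointer to the
 * first unconsumed character.  Returns NULL on a parse error.
 * parse_mday() is a hypothetical helper for illustration only.
 */
static const char *
parse_mday(const char *s, int *mday)
{
	int val = 0;
	int ndigits = 0;

	/* Skip the padding that %e emits for single-digit days. */
	while (isspace((unsigned char)*s))
		s++;

	while (ndigits < 2 && isdigit((unsigned char)*s)) {
		val = val * 10 + (*s - '0');
		s++;
		ndigits++;
	}
	if (ndigits == 0 || val < 1 || val > 31)
		return (NULL);

	*mday = val;
	return (s);
}
```

Given " 6/11", this stores 6 and returns a pointer to the "/" that follows, so the rest of the format can continue parsing from there.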

#7

Updated by Gary Mills over 8 years ago

I would say that the bug is in strftime. It should never
produce a condensed date string that contains an internal
space. Here's what I get now:

$ env LC_TIME=en_CA.ISO8859-1 date +%x
21/05/11
$ env LC_TIME=en_US.ISO8859-1 date +%x
05/21/11

If the day of the month in the second example has a
leading space, we have a problem. Now, strftime() has
no option letter to print single-digit day numbers. The only
choices are with a leading zero or leading space. Unless
strftime() can be modified to compress out spaces after
formatting, the only reasonable choice is the leading zero.
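
The "compress out spaces after formatting" alternative could be sketched as a post-processing step on the strftime() result. This is only an illustration of that idea, not a proposed change to strftime(); squeeze_spaces() is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Remove every space character from s in place and return the new
 * length.  squeeze_spaces() is a hypothetical helper for
 * illustration only.
 */
static size_t
squeeze_spaces(char *s)
{
	char *dst = s;
	const char *src;

	for (src = s; *src != '\0'; src++) {
		if (*src != ' ')
			*dst++ = *src;
	}
	*dst = '\0';
	return ((size_t)(dst - s));
}
```

Applied to "05/ 6/11" this leaves "05/6/11". It is only safe for condensed date strings: a format that uses spaces as real separators would be damaged by it, which is one reason the leading zero looks like the simpler choice.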

#8

Updated by Gordon Ross over 8 years ago

It does seem that the "%x" format is more commonly defined as
some permutation of "%Y", "%m", "%d" (and not "%e"), so
the result of %x will usually not contain spaces.

I did a "grep d_fmt" in $SRC/cmd/localedef/data
and the result is an interesting hodge-podge.
In general, the locale-specific aspects of d_fmt
are (1) the order, and (2) the separators.

Beyond that, most (all?) should use %Y rather than %y,
and where the day of the month is somewhere other than
the start of the d_fmt string, probably should be %d so
there will not be spaces within the result. (A space
at the front is probably harmless, so locales that use
d_fmt=%e/%m/%Y (for example) are probably fine.)

I suggest we clean up the d_fmt definitions as described above.
[ Opened new issue for this cleanup: 1198
https://www.illumos.org/issues/1198 ]

(And yes, the parser for %e really should deal with the
leading space, but that does not make the above d_fmt
definitions any less ugly:)

#9

Updated by Gordon Ross over 8 years ago

Attached a suggested fix for review: illumos1030.patch

Before the fix:

root@sun40# LANG=en_US.UTF-8 /home/gwr/strptime
strftime: 05/13/11
strftime: 05/ 6/11
strptime: parse error
root@sun40#

After the fix:

root@sun40z1# LANG=en_US.UTF-8 ./strptime
strftime: 05/13/11
strftime: 05/ 6/11
root@sun40z1# 

Make sure pkg:/locale/en is installed; otherwise the test
falls back to LANG=C and does not reproduce the bug.

#10

Updated by Gordon Ross over 8 years ago

  • Status changed from In Progress to Resolved

changeset:   13398:fa0b6e3a91f5
tag:         tip
user:        Gordon Ross <gwr@nexenta.com>
date:        Mon Jul 11 16:32:19 2011 -0400
description:
    1030 strftime "%x" should generate something strptime can parse
    Reviewed by: Garrett D'Amore <garrett@nexenta.com>
    Reviewed by: Gary Mills <mills@cc.umanitoba.ca>
    Reviewed by: Igor Kozhukhov <ikozhukhov@gmail.com>
    Approved by: Garrett D'Amore <garrett@nexenta.com>
