30 Need iconv

Review Request #225 — Created Oct. 2, 2016 and submitted

gwr
illumos-gate
30
general

30 Need iconv (replace closed source)

See usr/src/test additions.

  • 5
  • 0
  • 21
  • 0
  • 26
Description From Last Updated
cstyle rm rm
cstyle rm rm
cstyle rm rm
cstyle rm rm
It appears that warnings is only ever set? Was it supposed to be used as part of determining some part ... rm rm
rm
  1. Thanks for putting this together, it'll be great to have an OSS version of this tool. While there are some nits, I've tried to distinguish between things that are problems and not by using the issue flag (some of the shell style, for example).

    There's one additional comment: This is missing an entry in the closed-bins exception list.

    1. I think I've handled everything that could be handled quickly.
      There are a few questions for you below.

      Thanks for taking a look!

    2. The follow ups here all look pretty reasonable. Though there are still some outstanding questions in the back and forth that I'd like to make sure are straightened out before I sign off.
  2. usr/src/cmd/iconv/Makefile (Diff revision 1)
     
     
    It might be nice to include Makefile.ctf and build this with ctf support to make debugging easier.
    1. OK. I see now what several commands use that, though also several don't.
      Personally I used dbx or gdb for user-level stuff. Hopefully that still works...

      Not sure if you noticed, but there are some debugging features built-in to dump the AVLs etc.

  3. usr/src/cmd/iconv/charmap.h (Diff revision 1)
     
     
    cstyle looks wrong here.
  4. usr/src/cmd/iconv/charmap.h (Diff revision 1)
     
     

    This doesn't seem to be used, remove it?

  5. usr/src/cmd/iconv/charmap.c (Diff revision 1)
     
     

    Conventionally these should have structure prefixes.

    1. Hm... well much of this was copied from the charmap.c over in localedef, which also didn't use struct member name prefixes.
      That doesn't bother me for a struct that's only used locally like this one.
      Does it really bother you? If not, I'd prefer to just "drop" this issue.

    2. It was already a bit of a pain to go through this, but I guess we can come back to it later.
    3. OK, I added the struct member name prefixes.

  6. usr/src/cmd/iconv/charmap.c (Diff revision 1)
     
     

    Maybe we should use abort instead of assert(0) to make sure that in the case someone ever turns off asserts, it still blows up?

    1. OK. The default case should be impossible to hit, so I don't really care either way.
      I'll change it to abort().

  7. usr/src/cmd/iconv/charmap.c (Diff revision 1)
     
     
    Since where isn't used anywhere in this function, it's allowed to be NULL, so you can optionally drop where.
  8. usr/src/cmd/iconv/charmap.c (Diff revision 1)
     
     

    Can you add a comment about why it's safe to free ssym and esym here? That'll help folks who are going through understand this better.

    At first glance, it's quite weird to see the thing that's not allocating it freeing it, though I suspect that's because we're integrating with yacc here.

    1. Added a comment.

  9. usr/src/cmd/iconv/charmap.c (Diff revision 1)
     
     
    Probably worth a comment here indicating that it's designed to try and find a subset match. At first I was surprised to see the iteration around len.
  10. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    Is there a reason this is harcoded as opposed to doing something like basename(argv[0]) in main() right away?
  11. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     

    It seems worth including the sterror here that indicates why the iconv_open() failed, as otherwise it'll be hard for users to get a sense of what went wrong.

  12. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    I notice that you're using an off64_t, but I don't see anything that indicates that the rest of this is using 64-bit offset capable functions. Should this actually be being built with large file support?
    
    Has this been tested with files that need that off64_t?
    1. The only thing it uses the offset varible for is error reporting. It's never passed to lseek or anything like that.
      Making it fixed size simplified the numeric formatting "%lld". OK if I "drop" this issue?

    2. I'm actually less concerned with the fact that there's an off64_t, but the bigger issue is that it made me realize that this probably won't work with correctly with large files. So I think we need to test it with the classic >2 Gb or >4 Gb file and actually be setting _FILE_OFFSET_BITS and _LARGEFILE_SOURCE.
      
      Presuming that it does work with >4 Gb files, then it's perfectly fine to drop it.
    3. I'm fine with making this large-file aware.

    4. Added file offset bits 64, closing this issue.

  13. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    Can you add a comment regarding why it's safe to write this out before we check errno? It seems like we're relying on the downstream callers to never change oleft at all in error, which I'm not sure is guaranteed by iconv(3C), for example.
  14. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    1. For all of these message strings that are left of the current indent:
      I think it looks worse to break the strings just for indenting,
      so my preference would be to live with the four cstyle complaints.
      You OK with that? Or do you really want all the strings broken onto multiple lines?

  15. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    Was reminder supposed to be remainder?
  16. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    I think we're guaranteed that this doesn't overlap, is that right? Otherwise, we may want to use memmove() to be safe.
    1. I don't think these can overlap, but I'll make it memmove anyway.

  17. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    1. Ditto above (is breaking the lines worse?)

  18. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    1. Ditto above (is breaking the lines worse?)

  19. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    1. Ditto above (is breaking the lines worse?)

  20. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    Isn't this meant to be iconv_errno? Couldn't errno have changed? due to other callers?
  21. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    What's the point of the perror here given the line above? Won't errno also have been potentially clobbered at this point through the fprintf, etc.?
    1. Looks like that was a "left over". Removed.

  22. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    Why do we ignore the fwrite failures here? Shouldn't that be fatal like the ones earlier?
  23. usr/src/cmd/iconv/iconv.c (Diff revision 1)
     
     
    While the script doesn't feel that appropriate (but it's what we have for now), the error handling is inappropriate.
    
    We need to at least indicate errors that happened. Otherwise this will silently fail and exit 0, causing users to think that there's nothing there!
    1. I like the script. It feels to me like the "right tool for the job".
      All it has to do it print out a list of supported conversions.
      ("What can possibly go wrong?" :)

    2. I guess doing this in C wouldn't be too bad. Do you prefer that? If so, why?

    3. What I was most concerned with was making sure that we actually properly reported errors. I personally would have done it in C rather than forking and execing because I find it easier to control that environment, but I wasn't going to push the subject.
    4. I might take a whack at a C version anyway.

    5. I got part way through writing a C version and lost my patience with it.
      I think the script is fine, unless someone else wants to re-write it.

    6. OK, I re-wrote that bit in C. It's three times as long, of course, but avoids the exec of a separate script for that.

  24. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     
    We presumably need to unalias things.
    1. I think we'll still want to be running the unalias command at the start of things to make sure that random system shell rc files don't mess things up.
  25. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     
    General illumos scripting style has traditionally had the do statement on the line with the while/flor.
    1. Never heard of that. OK.

  26. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     
    Looks like this whitespace isn't consistent with the rest.
    1. fixed, I think.

  27. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     
    General illumos scripting style has traditionally had the do statement on the line with the while/flor.
  28. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     

    Looks like this whitespace isn't consistent with the rest.

    1. fixed, I think.

  29. usr/src/cmd/iconv/iconv_list.sh (Diff revision 1)
     
     

    General illumos scripting style has traditionally had the do statement on the line with the while/flor.

  30. usr/src/cmd/iconv/scanner.c (Diff revision 1)
     
     
    Do we need to be checking for strdup failure here or is that being done elsewhere? If the latter, can the comment indicate where it's actually checked?
    1. I guess, though this code is only used with user-supplied charmaps.
      We'll either die here (if I add a test) or die later...
      Does that matter?

    2. Presuming that it doesn't result in a core dump, than it doesn't matter where it happens per se. But causing a core dump because we're out of memory is a call generator, so we'll want to make sure we catch that it happened and properly report and error out.
  31. usr/src/cmd/iconv/scanner.c (Diff revision 1)
     
     

    It appears that warnings is only ever set? Was it supposed to be used as part of determining some part of the exit status?

    1. I forget. Might have been from localedef...
      I'll look.

    2. It was. I use the variable now. Hope that's "the right thing"...
      (I might want to check what the closed iconv does with bad charmaps.)

  32. For this to work, do we need to depend on the packages that have the iconv modules that are being used in the tests?
    1. Yes. Added one.

  33. 
      
gwr
gwr
gwr
gwr
rm
  1. The follow ups here all look pretty reasonable. Though there are still some outstanding questions in the back and forth that I'd like to make sure are straightened out before I sign off.
  2. 
      
gwr
gwr
gwr
gwr
rm
  1. In general, this looks good. A couple minor questions on this, but should be able to wrap this up and integrate it. Thanks again for helping replace the closed bins!

  2. usr/src/cmd/iconv/Makefile (Diff revisions 4 - 9)
     
     
    I think we also need the _LARGEFILE_SOURCE macro set here.
    1. Just _FILE_OFFSET_BITS=64 is sufficient unless the program needs things like fseeko or ftello,
      but I'll add it anyway. No harm.

  3. usr/src/cmd/iconv/iconv_list.c (Diff revision 9)
     
     

    Cast to void or handle for lint? Same with the other printf calls in this function.

    1. yeah, I hadn't cleaned up lint yet. done now.

  4. usr/src/cmd/iconv/iconv_list.c (Diff revision 9)
     
     
    Is a 256 byte dirent large enough? Should we instead be using a constant? dirent.h(3HEAD) seems to refer to NAME_MAX, while head/dirent.h seems to have a MAXNAMELEN of 512? I guess it's unlikely to matter immediately.
    1. Yes, should be MAXNAMELEN

  5. usr/src/cmd/iconv/iconv_list.c (Diff revision 9)
     
     
    Should we worry about truncation or just case to void?
  6. 
      
gwr
gwr
rm
  1. Ship It!
  2. 
      
yuripv
  1. closed source iconv has better error message in case the supplied conversion is invalid:

    altair:yuri:~$ echo привет | iconv -t utf64
    Not supported UTF-8 to utf64
    altair:yuri:~$ echo привет | ~/ws/il-iconv/proto/root_i386/usr/bin/iconv -t utf64
    iconv_open failed: Invalid argument
    altair:yuri:~$

    1. Will fix. Any other messages that should be better?

  2. 
      
yuripv
  1. nitpicking (really) - iconv -l has a trailing comma on the last line

    1. Yeah, I'm gonna ignore that. :)

  2. 
      
gwr
gwr
yuripv
  1. Ship It!
  2. 
      
gwr
Review request changed

Status: Closed (submitted)

Loading...