Need mbrtowc variant that indicates consumed zero bytes
To properly implement support for open_wmemstream(), we need to be able to write embedded null wide characters. Unfortunately, the interfaces that we have today have no way of indicating how many bytes were actually consumed when a null character is translated: mbrtowc() simply returns 0 in that case. This makes it very difficult to write a correct program, because there is no guarantee that a locale encodes the null character in a single byte and, in fact, looking through many of the locales, that is not always the case. To rectify this, I'd like to add a private function to libc that implements a variant of mbrtowc() which reports this count. We could consider making this public, but to do so we should work with the broader open source communities to agree on a name for it.
To summarize, this allows a variant of mbrtowc() to always indicate the number of bytes consumed, even when the result is a null wide character, for which mbrtowc() would normally return only 0.
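To illustrate the semantics being asked for, here is a minimal sketch that uses only the standard mbrtowc(). The name demo_mbrtowc_count() and its extra `consumed` parameter are hypothetical and are not the private libc function added by this change. The sketch feeds the input one byte at a time, so the number of bytes handed to mbrtowc() is known even when it returns 0. This is far less efficient than doing the work inside libc, which is part of why an internal variant is attractive.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper, for illustration only. Behaves like mbrtowc(),
 * but also stores in *consumed how many bytes of s were used, including
 * the case where the converted character is the null wide character.
 */
static size_t
demo_mbrtowc_count(wchar_t *pwc, const char *s, size_t n, mbstate_t *ps,
    size_t *consumed)
{
	size_t i;

	*consumed = 0;
	for (i = 0; i < n; i++) {
		/* Feed exactly one byte; partial state is kept in *ps. */
		size_t ret = mbrtowc(pwc, s + i, 1, ps);

		if (ret == (size_t)-1)
			return (ret);	/* invalid sequence, errno is EILSEQ */
		if (ret == (size_t)-2)
			continue;	/* incomplete, byte absorbed into *ps */

		/* A character was completed after i + 1 bytes. */
		*consumed = i + 1;
		return (ret == 0 ? 0 : i + 1);
	}

	/* All n bytes were absorbed without completing a character. */
	*consumed = n;
	return ((size_t)-2);
}
```

With this shape, a caller can advance its input pointer by `*consumed` whether or not the result was a null wide character, rather than having to guess that the null character occupied one byte.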
Updated by Robert Mustacchi about 1 year ago
To test this, I imported the OpenBSD mbrtowc regression tests into the stdio test suite as a part of (7092). open_wmemstream(3C) uses this extensively internally, which gives me additional confidence in the change. I have also been using the en_US.UTF-8 locale while doing work on a system with these changes.
Updated by Electric Monk about 1 year ago
- Status changed from New to Closed
- % Done changed from 90 to 100
commit 0ac311bae7f6f50d9ba506b52bd8860f2d68d4ce
Author: Robert Mustacchi <email@example.com>
Date: 2020-03-26T07:42:53.000Z

    12358 Need mbrtowc variant that indicates consumed zero bytes
    Reviewed by: John Levon <firstname.lastname@example.org>
    Reviewed by: Yuri Pankov <email@example.com>
    Approved by: Dan McDonald <firstname.lastname@example.org>