View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0001548 | 1003.1(2016/18)/Issue7+TC2 | Base Definitions and Headers | public | 2022-01-11 16:04 | 2022-11-03 18:26 |
Reporter | steffen | Assigned To | |||
Priority | normal | Severity | Editorial | Type | Enhancement Request |
Status | New | Resolution | Open | ||
Name | steffen | ||||
Organization | |||||
User Reference | |||||
Section | chapter 7 | ||||
Page Number | 135 ff. | ||||
Line Number | 3933 ff. | ||||
Interp Status | |||||
Final Accepted Text | |||||
Summary | 0001548: Addition of a POSIX.utf-8 locale (likely as 7.3 "POSIX.utf-8 locale") | ||||
Description | Today's POSIX systems use Unicode-aware locales, almost all of them realized via the 8-bit UTF-8 encoding. Even though the standard(s) do not offer proper interfaces to deal with this, external libraries (GNU libunicode, ICU) fill the gap in practice. (These libraries can also be used with the UTF-16 and UTF-32 encodings; ICU, I think, prefers the 16-bit encoding that is in use in a widely distributed commercial non-POSIX operating system.) As it stands, users work with language- and territory-specific UTF-8 locales in real life, like en_US.utf8 or de_DE.utf8. Right now no POSIX locale covering UTF-8, that is the full Unicode range, is available. Some operating systems (OpenBSD) and some libraries (musl libc), however, already "always work with UTF-8 8-bit" Unicode, giving it the name "C.UTF-8" (musl; as can be seen, the usual naming scheme mess continues). | ||||
Desired Action | Provide and define a POSIX.UTF-8 locale. This is a major effort (even if covering only character classes). | ||||
Tags | No tags attached. |
related to | 0000798 | Closed | | 1003.1(2013)/Issue7+TC1 | Addition of a [:symbol:] bracket expression character class expression |
related to | 0000795 | Closed | ajosey | 1003.1(2013)/Issue7+TC1 | Addition of a new «symbol» character class |
related to | 0000797 | Closed | | 1003.1(2013)/Issue7+TC1 | Addition of a isw?symbol(_l)?() function family |
|
This was discussed during the 2022-06-23 conference call. We would welcome an addition to the standard that provides a standard UTF-8 locale. Please provide a fully fleshed out proposal. |
|
We would still like to see a proposal for this, but if we do not have a fully fleshed out proposal by 2022-11-13 it will not be possible to include this in Issue 8. |
|
Given the massive amount of work that would be necessary to get there, including reading what ISO C says (or will say) on the UTF-8 topic, this cannot be achieved. (At least not by me, who spent time on the referenced "symbol" character class additions, which surely would have required more work to become acceptable for the standard. I would close it if I could.) |
|
It shouldn't be closed then but kept open for a future Issue?! |
|
Sure. If Mantis supports reparenting and it is desired. (I have doubts on the former.) |
Date Modified | Username | Field | Change |
---|---|---|---|
2022-01-11 16:04 | steffen | New Issue | |
2022-01-11 16:04 | steffen | Name | => steffen |
2022-01-11 16:04 | steffen | Section | => chapter 7 |
2022-01-11 16:04 | steffen | Page Number | => 135 ff. |
2022-01-11 16:04 | steffen | Line Number | => 3933 ff. |
2022-06-23 16:24 | nick | Note Added: 0005867 | |
2022-07-28 16:21 | eblake | Relationship added | related to 0000798 |
2022-07-28 16:22 | eblake | Relationship added | related to 0000795 |
2022-07-28 16:22 | eblake | Relationship added | related to 0000797 |
2022-11-03 15:10 | Don Cragun | Note Added: 0006021 | |
2022-11-03 18:01 | steffen | Note Added: 0006024 | |
2022-11-03 18:07 | calestyo | Note Added: 0006026 | |
2022-11-03 18:26 | steffen | Note Added: 0006027 |