View Issue Details

ID: 0001548
Project: 1003.1(2016/18)/Issue7+TC2
Category: Base Definitions and Headers
View Status: public
Last Update: 2022-11-03 18:26
Reporter: steffen
Assigned To:
Priority: normal
Severity: Editorial
Type: Enhancement Request
Status: New
Resolution: Open
Name: steffen
Organization:
User Reference:
Section: chapter 7
Page Number: 135 ff.
Line Number: 3933 ff.
Interp Status:
Final Accepted Text:
Summary: 0001548: Addition of a POSIX.utf-8 locale (likely as 7.3 "POSIX.utf-8 locale")
Description: Today's POSIX systems use Unicode-aware locales, almost all realized via the 8-bit UTF-8 encoding.
Even though the standard(s) do not offer proper interfaces to deal with this, external libraries (GNU libunicode, ICU) fill this gap in practice.
(These libraries can also be used with the UTF-16 and UTF-32 encodings; ICU, I think, prefers the 16-bit encoding that is in use in a widely distributed commercial non-POSIX operating system.)

As it stands, users rely on language- and territory-specific UTF-8 locales in real life, such as en_US.utf8 or de_DE.utf8.
Right now there is no UTF-8 (that is, full Unicode range) POSIX locale available.

However, some operating systems (OpenBSD) and some libraries (musl libc) already "always work with UTF-8 8-bit" Unicode, giving it the name "C.UTF-8" (musl; as can be seen, the usual naming scheme mess continues).
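
To illustrate the situation described above, here is a minimal sketch of how a portable program might currently have to probe for a UTF-8 locale. The function name pick_utf8_locale and the candidate list are assumptions made up for this example; none of the locale names is guaranteed by the standard, and the set of accepted codeset spellings ("UTF-8", "utf8") is likewise an assumption about common implementations.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Illustrative candidates only; POSIX requires none of these names. */
static const char *const utf8_candidates[] = {
    "C.UTF-8", "POSIX.UTF-8", "en_US.UTF-8", "en_US.utf8"
};

/*
 * Hypothetical helper: returns the first candidate that setlocale()
 * accepts for LC_CTYPE and whose codeset is reported as UTF-8, or NULL
 * if none qualifies. On success LC_CTYPE stays set to that locale; on
 * failure it is reset to "C".
 */
const char *pick_utf8_locale(void)
{
    size_t n = sizeof utf8_candidates / sizeof utf8_candidates[0];

    for (size_t i = 0; i < n; i++) {
        if (setlocale(LC_CTYPE, utf8_candidates[i]) == NULL)
            continue;   /* name unknown to this system */

        /* A name being accepted does not prove the codeset; check it. */
        const char *cs = nl_langinfo(CODESET);
        if (strcmp(cs, "UTF-8") == 0 || strcmp(cs, "utf8") == 0)
            return utf8_candidates[i];
    }

    setlocale(LC_CTYPE, "C");
    return NULL;
}
```

With a standardized POSIX.UTF-8 name, the probing loop would presumably collapse into a single setlocale() call.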
Desired Action: Provide and define a POSIX.UTF-8 locale.
This is a major effort (even if covering only character classes).
Tags: No tags attached.

Relationships

related to 0000798 Closed 1003.1(2013)/Issue7+TC1 Addition of a [:symbol:] bracket expression character class expression 
related to 0000795 Closed ajosey 1003.1(2013)/Issue7+TC1 Addition of a new «symbol» character class 
related to 0000797 Closed 1003.1(2013)/Issue7+TC1 Addition of a isw?symbol(_l)?() function family 

Activities

nick

2022-06-23 16:24

manager   bugnote:0005867

This was discussed during the 2022-06-23 conference call. We would welcome an addition to the standard that provides a standard UTF-8 locale. Please provide a fully fleshed out proposal.

Don Cragun

2022-11-03 15:10

manager   bugnote:0006021

We would still like to see a proposal for this, but if we do not have a fully fleshed out proposal by 2022-11-13 it will not be possible to include this in Issue 8.

steffen

2022-11-03 18:01

reporter   bugnote:0006024

Given the massive amount of work which would be necessary to get there, including reading of what ISO C (will) say(s) on the UTF-8 topic, this cannot be achieved.

(At least not by me, who spent time on the referenced "symbol" character class additions, which surely would have required more work to become acceptable for the standard. I would close it if I could.)

calestyo

2022-11-03 18:07

reporter   bugnote:0006026

It shouldn't be closed then but kept open for a future Issue?!

steffen

2022-11-03 18:26

reporter   bugnote:0006027

Sure. If Mantis supports reparenting and it is desired. (I have doubts on the former.)

Issue History

Date Modified Username Field Change
2022-01-11 16:04 steffen New Issue
2022-01-11 16:04 steffen Name => steffen
2022-01-11 16:04 steffen Section => chapter 7
2022-01-11 16:04 steffen Page Number => 135 ff.
2022-01-11 16:04 steffen Line Number => 3933 ff.
2022-06-23 16:24 nick Note Added: 0005867
2022-07-28 16:21 eblake Relationship added related to 0000798
2022-07-28 16:22 eblake Relationship added related to 0000795
2022-07-28 16:22 eblake Relationship added related to 0000797
2022-11-03 15:10 Don Cragun Note Added: 0006021
2022-11-03 18:01 steffen Note Added: 0006024
2022-11-03 18:07 calestyo Note Added: 0006026
2022-11-03 18:26 steffen Note Added: 0006027