Austin Group Defect Tracker

Aardvark Mark IV

Viewing Issue Simple Details
ID 0001548
Category [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers
Severity Editorial
Type Enhancement Request
Date Submitted 2022-01-11 16:04
Last Update 2022-11-03 18:26
Reporter steffen View Status public  
Assigned To
Priority normal Resolution Open  
Status New  
Name steffen
User Reference
Section chapter 7
Page Number 135 ff.
Line Number 3933 ff.
Interp Status ---
Final Accepted Text
Summary 0001548: Addition of a POSIX.utf-8 locale (likely as 7.3 "POSIX.utf-8 locale")
Description Today's POSIX systems use Unicode-aware locales, almost all realized via the 8-bit UTF-8 encoding.
Even though the standard(s) do not offer proper interfaces to deal with this, external libraries (GNU libunicode, ICU) fill this gap in practice.
(These libraries can also be used with the UTF-16 and UTF-32 encodings; the latter library, ICU, I think prefers the 16-bit encoding that is in use in a widely distributed commercial non-POSIX operating system.)

As it stands, users rely on language/territory-specific UTF-8 locales in real life, like en_US.utf8 or de_DE.utf8.
Right now there is no POSIX locale that uses UTF-8, that is, one covering the full Unicode range.

However, some operating systems (OpenBSD) and some libraries (musl libc) already "always work with UTF-8 8-bit" Unicode, giving it the name "C.UTF-8" (musl; as can be seen, the usual naming scheme mess continues).
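
To illustrate the situation described above, here is a minimal sketch of how a portable program might currently have to probe for a UTF-8 locale. The helper names is_utf8_codeset and select_utf8_locale are hypothetical, the candidate list is only an assumption, and "POSIX.UTF-8" is the name proposed in this issue, not something any system is known to provide. Only setlocale() and nl_langinfo(CODESET) are existing POSIX interfaces.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the codeset name spells "utf8",
 * ignoring ASCII case and hyphens (so "UTF-8" and "utf8" both match). */
static int is_utf8_codeset(const char *cs)
{
    static const char want[] = "utf8";
    size_t i = 0;

    for (; *cs != '\0'; ++cs) {
        char c = *cs;

        if (c == '-')
            continue;
        if (c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');
        if (want[i] == '\0' || c != want[i])
            return 0;
        ++i;
    }
    return want[i] == '\0';
}

/* Hypothetical helper: tries each candidate name for LC_CTYPE and returns
 * the first one the system accepts and reports as UTF-8, or NULL if none
 * works. On failure LC_CTYPE is reset to "C". */
static const char *select_utf8_locale(void)
{
    /* Assumed candidates; "POSIX.UTF-8" does not exist in the standard. */
    static const char *const candidates[] = {
        "C.UTF-8", "POSIX.UTF-8", "en_US.UTF-8", "en_US.utf8"
    };
    size_t i;

    for (i = 0; i < sizeof candidates / sizeof candidates[0]; ++i) {
        if (setlocale(LC_CTYPE, candidates[i]) != NULL &&
            is_utf8_codeset(nl_langinfo(CODESET)))
            return candidates[i];
    }
    setlocale(LC_CTYPE, "C");
    return NULL;
}
```

A caller would invoke select_utf8_locale() once at startup and fall back to byte-oriented processing when it returns NULL. With a standardized locale name, the candidate list and the probing loop would presumably shrink to a single setlocale() call.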
Desired Action Provide and define a POSIX.UTF-8 locale.
This is a major effort (even if covering only character classes).
Tags No tags attached.
Attached Files

- Relationships
related to 0000798 Closed 1003.1(2013)/Issue7+TC1 Addition of a [:symbol:] bracket expression character class expression
related to 0000795 Closed ajosey 1003.1(2013)/Issue7+TC1 Addition of a new «symbol» character class
related to 0000797 Closed 1003.1(2013)/Issue7+TC1 Addition of a isw?symbol(_l)?() function family

-  Notes
nick (manager)
2022-06-23 16:24

This was discussed during the 2022-06-23 conference call. We would welcome an addition to the standard that provides a standard UTF-8 locale. Please provide a fully fleshed out proposal.
Don Cragun (manager)
2022-11-03 15:10

We would still like to see a proposal for this, but if we do not have a fully fleshed out proposal by 2022-11-13 it will not be possible to include this in Issue 8.
steffen (reporter)
2022-11-03 18:01

Given the massive amount of work that would be necessary to get there, including reading what ISO C says (or will say) on the UTF-8 topic, this cannot be achieved.

(At least not by me, who spent time on the referenced "symbol" character class additions, which surely would have required more work until acceptable for the standard. I would close it if I could.)
calestyo (reporter)
2022-11-03 18:07

It shouldn't be closed then but kept open for a future Issue?!
steffen (reporter)
2022-11-03 18:26

Sure. If Mantis supports reparenting and it is desired. (I have doubts on the former.)

- Issue History
Date Modified Username Field Change
2022-01-11 16:04 steffen New Issue
2022-01-11 16:04 steffen Name => steffen
2022-01-11 16:04 steffen Section => chapter 7
2022-01-11 16:04 steffen Page Number => 135 ff.
2022-01-11 16:04 steffen Line Number => 3933 ff.
2022-06-23 16:24 nick Note Added: 0005867
2022-06-23 16:24 Don Cragun Note Added: 0005868
2022-06-23 16:26 Don Cragun Note Deleted: 0005868
2022-07-28 16:21 eblake Relationship added related to 0000798
2022-07-28 16:22 eblake Relationship added related to 0000795
2022-07-28 16:22 eblake Relationship added related to 0000797
2022-11-03 15:10 Don Cragun Note Added: 0006021
2022-11-03 15:14 calestyo Issue Monitored: calestyo
2022-11-03 18:01 steffen Note Added: 0006024
2022-11-03 18:07 calestyo Note Added: 0006026
2022-11-03 18:26 steffen Note Added: 0006027
