Viewing Issue Simple Details | ||||||
ID | Category | Severity | Type | Date Submitted | Last Update | ||
0000073 | [1003.1(2008)/Issue 7] System Interfaces | Comment | Clarification Requested | 2009-06-28 15:20 | 2019-06-10 08:55 | ||
Reporter | nick | View Status | public | ||||
Assigned To | ajosey | ||||||
Priority | normal | Resolution | Accepted As Marked | ||||
Status | Closed | ||||||
Name | Nick Stoughton | ||||||
Organization | USENIX | ||||||
User Reference | nms-C-wmemcmp | ||||||
Section | wmemcmp | ||||||
Page Number | 2254 | ||||||
Line Number | 70784-70789 | ||||||
Interp Status | Approved | ||||||
Final Accepted Text | Note: 0001089 | ||||||
Summary | 0000073: wmemcmp C conflict? | ||||||
Description |
This issue is for tracking purposes. The following question is being discussed in the C committee at present, and highlights a difference between C-1990 with AMD-1 and C99. POSIX has followed the C89+AMD1 words, and so is possibly at odds with C99.

===> from Joseph Myers

When are wide string library functions required to handle values of type wchar_t that do not represent any value in the execution character set, and when does using such values with a library function result in undefined behavior? Consider the following testcase as an example:

    #include <stdlib.h>
    #include <wchar.h>

    wchar_t w0 = WCHAR_MIN;
    wchar_t w1 = WCHAR_MAX;

    int
    main (void)
    {
      if (wmemcmp (&w0, &w1, 1) < 0)
        return 0;
      else
        abort ();
    }

Suppose that WCHAR_MIN and WCHAR_MAX do not both represent values in the execution character set. If the arguments to wmemcmp are valid, wmemcmp must return a value less than 0 because 7.24.4.4 says the comparison is done the same way as comparing integers of type wchar_t, so the program must execute successfully. With the GNU C Library, however, it aborts; wchar_t is UTF-32 but has a signed type so WCHAR_MIN is negative and does not represent a member of the execution character set.

C90 AMD1 had an explicit statement (7.16.4.6) that made clear that these inputs were valid (and so wmemcmp had to return a value less than 0 for the above example in C90 AMD1):

    These functions operate on arrays of type wchar_t whose size is specified by a separate count argument. These functions are not affected by locale and all wchar_t values are treated identically. The null wide character and wchar_t values not corresponding to valid multibyte characters are not treated specially.

I cannot however find any equivalent statement in C99. Was this a deliberate change from AMD1, or a side-effect of how the functions were rearranged when added to C99?

POSIX repeats the above requirement from C90 AMD1, but I believe this is an accident of taking the specification from there originally and is not intended to impose any requirements beyond those of C99. Much the same issue applies to wcscmp and wcsncmp, where the comparison semantics are specified but AMD1 has no mention of wide characters not corresponding to members of the execution character set, and in principle to other wcs* and wmem* functions that have no reason to need to consider the semantics of the characters they process (but are less likely than the comparison functions to have problems with the full set of wchar_t values in practice). |
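The comparison rule cited in the testcase ("the same way as comparing integers of type wchar_t") can be sketched as a small reference function. This is only an illustration: ref_wmemcmp and check_extremes are hypothetical names introduced here, and the code is not taken from any C library.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>   /* wchar_t, WCHAR_MIN, WCHAR_MAX */

/* Hypothetical reference comparison: treats every element purely as an
   integer of type wchar_t, with no regard to whether the value is a
   member of the execution character set. */
static int ref_wmemcmp(const wchar_t *a, const wchar_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            /* Compare with < rather than subtracting: a[i] - b[i] could
               overflow when wchar_t is a signed type. */
            return (a[i] < b[i]) ? -1 : 1;
        }
    }
    return 0;
}

/* Mirrors the testcase: WCHAR_MIN must order before WCHAR_MAX. */
static void check_extremes(void)
{
    wchar_t lo = WCHAR_MIN;
    wchar_t hi = WCHAR_MAX;

    assert(ref_wmemcmp(&lo, &hi, 1) < 0);
    assert(ref_wmemcmp(&hi, &lo, 1) > 0);
    assert(ref_wmemcmp(&lo, &lo, 1) == 0);
}
```

Calling check_extremes() from a program's main would exercise the same ordering the testcase expects from wmemcmp itself, whether wchar_t is signed or unsigned on the platform.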
||||||
Desired Action |
Await a decision from the C committee and, if necessary, make whatever change is needed to align with the emerging C standard. Issue an interp to describe the discrepancy. |
||||||
Tags | c99, tc2-2008 | ||||||
Attached Files | |||||||
|
Notes | |
(0000280) nick (manager) 2009-11-05 16:20 |
The C committee are considering this. |
(0000603) nick (manager) 2010-11-05 13:23 |
No action has been taken by C on this. This should become a ballot comment on the upcoming C1x CD ballot. |
(0001080) ajosey (manager) 2011-12-15 16:12 |
Joseph Myers reports in mail sequence 16937:

My comments BSI 14 on this issue were accepted at the London WG14 meeting and so C1X has the wording

    "Arguments to the functions in this subclause may point to arrays containing wchar_t values that do not correspond to members of the extended character set. Such values shall be processed according to the specified semantics, except that it is unspecified whether an encoding error occurs if such a value appears in the format string for a function in 7.29.2 or 7.29.5 and the specified semantics do not require that value to be processed by wcrtomb."

(7.29.1#5). |
(0001083) nick (manager) 2011-12-15 18:09 edited on: 2012-01-10 03:04 |
I believe that the following should be added to XBD Page 454, line 15482:

    "Arguments to functions in this list can point to arrays containing wchar_t values that do not correspond to members of the character set of the current locale. Such values shall be processed according to the specified semantics, unless otherwise stated."

Add the following sentence

    "It is unspecified whether an encoding error occurs if the format string contains wchar_t values that do not correspond to members of the character set of the current locale and the specified semantics do not require that value to be processed by wcrtomb()."

to:

    page 973 line 32587 (fwprintf, swprintf, wprintf)
    page 983 line 32960 (fwscanf, swscanf, wscanf)

Add the following sentence

    "It is unspecified whether an encoding error occurs if the format string contains wchar_t values that do not correspond to members of the character set of the current locale."

to page 2207 line 69521 (wcsftime)

----

Since these changes align with C11, I do not believe that any of them need to be CX shaded. I believe that this change should be covered by an interpretation request (defer to another standard) and resolved in TC2. |
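As a rough illustration of "processed according to the specified semantics", the sketch below passes unusual wchar_t values through wcslen, wmemcpy and wmemchr, none of which needs to interpret the characters it handles. It assumes that WCHAR_MAX and WCHAR_MAX - 1 are not members of the current locale's character set, which is plausible for a 32-bit wchar_t but not guaranteed; sample and check_passthrough are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>   /* wcslen, wmemcpy, wmemchr, WCHAR_MAX */

/* Assumption: WCHAR_MAX and WCHAR_MAX - 1 do not correspond to any
   member of the locale's character set. Both are nonzero, so neither
   terminates the string early. */
static const wchar_t sample[] = { WCHAR_MAX, L'a', WCHAR_MAX - 1, L'\0' };

/* Hypothetical check: the functions below should count, copy and find
   these values exactly as they would any other wchar_t value. */
static void check_passthrough(void)
{
    wchar_t copy[4];
    size_t len = wcslen(sample);

    /* Only the null wide character ends the string. */
    assert(len == 3);

    /* Copy the string including its terminator, then compare each
       element as an integer. */
    wmemcpy(copy, sample, len + 1);
    for (size_t i = 0; i <= len; i++)
        assert(copy[i] == sample[i]);

    /* Searching for an out-of-character-set value finds its position. */
    assert(wmemchr(copy, WCHAR_MAX - 1, len) == copy + 2);
}
```

The formatted I/O functions and wcsftime are deliberately left out of this sketch, since the proposed text leaves it unspecified whether such values in a format string cause an encoding error.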
(0001089) geoffclare (manager) 2012-01-12 16:15 |
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
A clarification has been made in the C11 standard and POSIX will adopt this wording in the next Technical Corrigendum. Since defect reports are no longer accepted against C99 this change in C11 is being taken as if it were a response to a C99 defect report.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
See Note: 0001083 |
(0001303) ajosey (manager) 2012-06-29 16:20 |
Interpretation proposed 29 June 2012 for final 45 day review |
(0001336) ajosey (manager) 2012-08-30 09:09 |
Interpretation approved 30 Aug 2012 |
Issue History | |||
Date Modified | Username | Field | Change |
2009-06-28 15:20 | nick | New Issue | |
2009-06-28 15:20 | nick | Status | New => Under Review |
2009-06-28 15:20 | nick | Assigned To | => ajosey |
2009-06-28 15:20 | nick | Name | => Nick Stoughton |
2009-06-28 15:20 | nick | Organization | => USENIX |
2009-06-28 15:20 | nick | User Reference | => nms-C-wmemcmp |
2009-06-28 15:20 | nick | Section | => wmemcmp |
2009-06-28 15:20 | nick | Page Number | => 2254 |
2009-06-28 15:20 | nick | Line Number | => 70784-70789 |
2009-06-28 15:22 | nick | Tag Attached: real bug not in aardvark | |
2009-08-06 16:19 | nick | Tag Attached: c99 | |
2009-08-13 15:37 | msbrown | Tag Detached: real bug not in aardvark | |
2009-11-05 16:20 | nick | Note Added: 0000280 | |
2010-11-05 13:23 | nick | Note Added: 0000603 | |
2010-11-05 14:03 | nick | Note Added: 0000604 | |
2010-11-05 14:03 | nick | Note Deleted: 0000604 | |
2011-12-15 16:12 | ajosey | Note Added: 0001080 | |
2011-12-15 18:09 | nick | Note Added: 0001083 | |
2012-01-08 18:44 | nick | Note Edited: 0001083 | |
2012-01-08 18:44 | nick | Note View State: public: 1083 | |
2012-01-10 03:04 | nick | Note Edited: 0001083 | |
2012-01-12 16:15 | geoffclare | Interp Status | => Pending |
2012-01-12 16:15 | geoffclare | Note Added: 0001089 | |
2012-01-12 16:15 | geoffclare | Status | Under Review => Interpretation Required |
2012-01-12 16:15 | geoffclare | Resolution | Open => Accepted As Marked |
2012-01-12 16:16 | geoffclare | Final Accepted Text | => Note: 0001089 |
2012-01-12 16:16 | geoffclare | Tag Attached: tc2-2008 | |
2012-06-29 16:20 | ajosey | Interp Status | Pending => Proposed |
2012-06-29 16:20 | ajosey | Note Added: 0001303 | |
2012-08-30 09:09 | ajosey | Interp Status | Proposed => Approved |
2012-08-30 09:09 | ajosey | Note Added: 0001336 | |
2019-06-10 08:55 | agadmin | Status | Interpretation Required => Closed |