View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000110 | 1003.1(2008)/Issue 7 | System Interfaces | public | 2009-06-30 19:39 | 2013-04-16 13:06 |
| Reporter | eblake | Assigned To | ajosey | ||
| Priority | normal | Severity | Objection | Type | Omission |
| Status | Closed | Resolution | Accepted As Marked | ||
| Name | Eric Blake | ||||
| Organization | |||||
| User Reference | |||||
| Section | memchr | ||||
| Page Number | 1284 | ||||
| Line Number | 42163 | ||||
| Interp Status | |||||
| Final Accepted Text | 0000110:0000143 | ||||
| Summary | 0000110: memchr input process order | ||||
| Description | OBJECTION Enhancement Request Number 39 ebb9:xxxxxxx Defect in XSH memchr (rdvk# 1) {ebb.memchr} Wed, 27 May 2009 13:58:36 +0100 (BST) Traditional implementations of memchr process the input in ascending order. This has the advantage that when the object size of s is not known, but c occurs within the object, the caller can pass a value of n that is larger than the actual object size without dereferencing inaccessible memory. However, while the standard (and C99) is explicit that it is permissible to pass n smaller than the object size of s, it is silent on whether passing a larger n is well-defined. In contrast, consider the wording for fprintf when dealing with the %.*s specifier, from line 29938: "If the precision is not specified or is greater than the size of the array, the application shall ensure that the array contains a null byte." Many implementations of the *printf family use memchr to implement this statement; for example, http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/vasnprintf.c?id=d4ca645#n197 However, if memchr does not have any strict requirement on evaluation order, then this invokes undefined behavior. For example, here is a bug report showing what happens when memchr does not have the traditional behavior, but dereferences memory that fits with the n argument to memchr but not within the actual array passed to printf: http://www.alphalinux.org/archives/axp-list/March2001/0337.shtml Likewise, application writers have noticed that it is possible to write faster code for finding a NUL byte, if one is present within a bounded length, by using memchr rather than strnlen, since the former has fewer conditionals (bounds check and search for NUL) than the latter (bounds check, search for NUL, and search for c). For example: http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/strnlen1.c?id=d4ca645 But again, this usage is rendered unsafe unless memchr is specified to behave like strnlen and not dereference past the match. | ||||
| Desired Action | At the end of the paragraph at line 42164, append a sentence with CX shading: If n is larger than the object pointed to by s, the application shall ensure that an instance of c occurs within the object. Change the rationale at line 42174 from: None. to: Although C99 is silent on the behavior of memchr when s points to an array smaller than n bytes, this specification requires memchr to behave as if it accesses bytes in ascending order, thus making memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when determining if the end of a null-terminated string occurs within n bytes. According to ebb9:xxxxxxx on 5/27/2009 6:58 AM: "Likewise, application writers have noticed that it is possible to write faster code for finding a NUL byte, if one is present within a bounded length, by using memchr rather than strnlen, since the former has fewer conditionals (bounds check and search for NUL) than the latter (bounds check, search for NUL, and search for c)." Correction - I meant to compare memchr(s,c,n) to strchr(s,c) where c is known to occur in s; strchr requires a search for c and for NUL, and the search for two bytes in parallel is typically more expensive than a bounds check and single search. There is no strnchr, so nothing is standardized that performs all three of bounds check, search for NUL, and search for c at once. (That behavior is also useful--for example, gnulib provides a function memchr2--but it can wait for another day to be standardized). But one point remains - many applications use memchr(s,0,n) rather than strnlen(s,n) because strnlen was not present in earlier standards. So this aardvark is still useful in standardizing this relationship. "Change the rationale at line 42174 from: None. to: Although C99 is silent on the behavior of memchr when s points to an array smaller than n bytes, this specification requires memchr to behave as if it accesses bytes in ascending order, thus making memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when determining if the end of a null-terminated string occurs within n bytes." Therefore, we may want to strike the word 'faster' in this proposed rationale. | ||||
| Tags | c99, tc1-2008 | ||||
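
The Description and Desired Action above rely on the `memchr(s,0,n)` idiom as a stand-in for `strnlen(s,n)`. A minimal sketch of that idiom follows, assuming the ascending-order behavior this issue asks the standard to guarantee. The helper name `bounded_len` is hypothetical and is not taken from gnulib or from the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of s, capped at maxlen.
 * It only inspects bytes up to and including the first NUL, provided
 * memchr behaves as if it scans in ascending order and stops at the
 * first match (the behavior requested in this issue). */
size_t bounded_len(const char *s, size_t maxlen)
{
    assert(s != NULL);
    const char *end = memchr(s, '\0', maxlen);
    /* No NUL within the first maxlen bytes: report the cap, as strnlen does. */
    return end ? (size_t)(end - s) : maxlen;
}
```

With this sketch, a caller that knows a NUL occurs within the object may pass a `maxlen` larger than the object, which is exactly the case the issue wants to be well-defined.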

In the DESCRIPTION remove "of the object" from

"The memchr( ) function shall locate the first occurrence of c (converted to an unsigned char) in the initial n bytes (each interpreted as unsigned char) of the object pointed to by s."

In the RETURN VALUE section

"The memchr( ) function shall return a pointer to the located byte, or a null pointer if the byte does not occur in the object."

to

"The memchr( ) function shall return a pointer to the located byte, or a null pointer if the byte is not found."

Also Nick will let the C committee know about the issue

Add to DESCRIPTION

"Implementations shall behave as if they read the memory byte by byte from the beginning of the bytes pointed to by s and stop at the first occurrence of c (if it is found in the initial n bytes)."
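
The "as if" requirement in the accepted text can be pictured with a plain reference model. This is an illustrative sketch only, not any implementation's actual code. The name `ref_memchr` is hypothetical, and real implementations may read wider units as long as the observable behavior matches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model of the accepted wording: read byte by
 * byte from the start of s, stop at the first occurrence of c, and
 * never touch a byte past the match. */
void *ref_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    const unsigned char target = (unsigned char)c; /* c converted to unsigned char */
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (p[i] == target)
            return (void *)(p + i);   /* pointer to the located byte */
    }
    return NULL;                      /* byte not found in the initial n bytes */
}
```

Because the loop returns at the first match, bytes beyond the match are never read, which is what makes an oversized n harmless when c is known to occur in the object.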

Note that the "Final Accepted Text" field contains two chunks of edits to the standard, but they are separated by an informative sentence ("Also Nick will let the C committee know about the issue") that should not be placed in the standard.

WG14 has added "Implementations shall behave as if they read the memory byte by byte from the beginning of the bytes pointed to by s and stop at the first occurrence of c (if it is found in the initial n bytes)." to the description of memchr in the C1x draft.

| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2009-06-30 19:39 | msbrown | New Issue | |
| 2009-06-30 19:39 | msbrown | Status | New => Under Review |
| 2009-06-30 19:39 | msbrown | Assigned To | => ajosey |
| 2009-06-30 19:39 | msbrown | Name | => Mark Brown |
| 2009-06-30 19:39 | msbrown | Organization | => IBM |
| 2009-06-30 19:39 | msbrown | Section | => memchr |
| 2009-06-30 19:39 | msbrown | Page Number | => 1284 |
| 2009-06-30 19:39 | msbrown | Line Number | => 42163 |
| 2009-06-30 19:39 | msbrown | Note Added: 0000143 | |
| 2009-06-30 19:39 | msbrown | Status | Under Review => Resolved |
| 2009-06-30 19:39 | msbrown | Resolution | Open => Accepted As Marked |
| 2009-06-30 19:40 | msbrown | Final Accepted Text | => 0000110:0000143 |
| 2009-06-30 20:03 | eblake | Note Added: 0000144 | |
| 2009-07-01 16:44 | Don Cragun | Name | Mark Brown => Eric Blake |
| 2009-07-01 16:44 | Don Cragun | Organization | IBM => |
| 2009-07-01 16:44 | Don Cragun | Reporter | msbrown => eblake |
| 2009-08-06 16:24 | nick | Tag Attached: c99 | |
| 2010-08-27 13:18 | ajosey | Tag Attached: tc1-2008 | |
| 2010-11-05 14:35 | nick | Note Added: 0000607 | |
| 2013-04-16 13:06 | ajosey | Status | Resolved => Closed |