0000110: memchr input process order - Austin Group Issue Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000110	1003.1(2008)/Issue 7	System Interfaces	public	2009-06-30 19:39	2013-04-16 13:06

Reporter	eblake	Assigned To	ajosey
Priority	normal	Severity	Objection	Type	Omission
Status	Closed	Resolution	Accepted As Marked

Name	Eric Blake
Organization
User Reference
Section	memchr
Page Number	1284
Line Number	42163
Interp Status
Final Accepted Text	0000110:0000143


Summary	0000110: memchr input process order
Description	_____________________________________________________________________________
OBJECTION Enhancement Request Number 39
ebb9:xxxxxxx Defect in XSH memchr (rdvk# 1)
{ebb.memchr} Wed, 27 May 2009 13:58:36 +0100 (BST)
_____________________________________________________________________________

Traditional implementations of memchr process the input in ascending order. This has the advantage that when the object size of s is not known, but c occurs within the object, the caller can pass a value of n that is larger than the actual object size without dereferencing inaccessible memory. However, while the standard (and C99) is explicit that it is permissible to pass n smaller than the object size of s, it is silent on whether passing a larger n is well-defined.

In contrast, consider the wording for fprintf when dealing with the %.s specifier, from line 29938:

"If the precision is not specified or is greater than the size of the array, the application shall ensure that the array contains a null byte."

Many implementations of the printf family use memchr to implement this statement; for example,
http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/vasnprintf.c?id=d4ca645#n197

However, if memchr does not have any strict requirement on evaluation order, then this invokes undefined behavior. For example, here is a bug report showing what happens when memchr does not have the traditional behavior, but dereferences memory that fits with the n argument to memchr but not within the actual array passed to printf:
http://www.alphalinux.org/archives/axp-list/March2001/0337.shtml

Likewise, application writers have noticed that it is possible to write faster code for finding a NUL byte, if one is present within a bounded length, by using memchr rather than strnlen, since the former has fewer conditionals (bounds check and search for NUL) than the latter (bounds check, search for NUL, and search for c). For example:
http://git.sv.gnu.org/cgit/gnulib.git/tree/lib/strnlen1.c?id=d4ca645

But again, this usage is rendered unsafe unless memchr is specified to behave like strnlen and not dereference past the match.
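The memchr-for-strnlen idiom described above can be sketched as follows. This is a minimal illustration and not the gnulib code linked above; the name bounded_len is hypothetical. It is only safe for n larger than the object when memchr reads in ascending order and stops at the first match, which is the guarantee this issue asks for.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (the name is illustrative, not taken from the issue):
   returns the length of s, capped at maxlen, in the style of strnlen.
   If maxlen exceeds the size of the array holding s, this relies on
   memchr not reading past the first null byte. */
static size_t bounded_len(const char *s, size_t maxlen)
{
    const char *end = memchr(s, '\0', maxlen);

    /* A null result means no null byte was found in the first maxlen bytes. */
    return end ? (size_t)(end - s) : maxlen;
}
```

With a 16-byte buffer holding "abc", bounded_len(buf, sizeof buf) yields 3; with a string longer than maxlen it yields maxlen.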
Desired Action	At the end of the paragraph at line 42164, append a sentence with CX shading:

If n is larger than the object pointed to by s, the application shall ensure that an instance of c occurs within the object.

Change the rationale at line 42174 from:

None.

to:

Although C99 is silent on the behavior of memchr when s points to an array smaller than n bytes, this specification requires memchr to behave as if it accesses bytes in ascending order, thus making memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when determining if the end of a null-terminated string occurs within n bytes.

According to ebb9:xxxxxxx on 5/27/2009 6:58 AM:
> Likewise, application writers have noticed that it is possible to
> write faster code for finding a NUL byte, if one is present within a
> bounded length, by using memchr rather than strnlen, since the former
> has fewer conditionals (bounds check and search for NUL) than the
> latter (bounds check, search for NUL, and search for c).

Correction - I meant to compare memchr(s,c,n) to strchr(s,c) where c is known to occur in s; strchr requires a search for c and for NUL, and the search for two bytes in parallel is typically more expensive than a bounds check and single search. There is no strnchr, so nothing is standardized that performs all three of bounds check, search for NUL, and search for c at once. (That behavior is also useful--for example, gnulib provides a function memchr2--but it can wait for another day to be standardized).

But one point remains - many applications use memchr(s,0,n) rather than strnlen(s,n) because strnlen was not present in earlier standards. So this aardvark is still useful in standardizing this relationship.

> Change the rationale at line 42174 from:
>
> None.
>
> to:
>
> Although C99 is silent on the behavior of memchr when s points to an
> array smaller than n bytes, this specification requires memchr to
> behave as if it accesses bytes in ascending order, thus making
> memchr(s,0,n) safe to use as a faster alternative to strnlen(s,n) when
> determining if the end of a null-terminated string occurs within n
> bytes.

Therefore, we may want to strike the word 'faster' in this proposed rationale.
Tags	c99, tc1-2008

msbrown 2009-06-30 19:39 manager bugnote:0000143	In the DESCRIPTION remove "of the object" from

The memchr( ) function shall locate the first occurrence of c (converted to an unsigned char) in the initial n bytes (each interpreted as unsigned char) of the object pointed to by s.

In the RETURN VALUE section change

The memchr( ) function shall return a pointer to the located byte, or a null pointer if the byte does not occur in the object.

to

The memchr( ) function shall return a pointer to the located byte, or a null pointer if the byte is not found.

Also Nick will let the C committee know about the issue

Add to DESCRIPTION

Implementations shall behave as if they read the memory byte by byte from the beginning of the bytes pointed to by s and stop at the first occurrence of c (if it is found in the initial n bytes).
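The "as if" requirement added to the DESCRIPTION can be read against a plain reference-style loop such as the one below. This is only a sketch under that reading: the name memchr_ascending is hypothetical, and real implementations are free to use word-at-a-time or vector reads so long as the observable behavior matches.

```c
#include <stddef.h>

/* Hypothetical reference-style sketch, not any libc's actual code.
   Reads bytes from s in ascending order and stops at the first match,
   so no byte after the match is ever accessed. */
static void *memchr_ascending(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;  /* c is converted to unsigned char */

    for (size_t i = 0; i < n; i++) {
        if (p[i] == ch)
            return (void *)(p + i);       /* pointer to the located byte */
    }
    return NULL;                          /* byte not found in the first n bytes */
}
```

Because the loop returns at the first match, a caller that guarantees c occurs within the object never causes a read beyond it, whatever value of n is passed.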

eblake 2009-06-30 20:03 manager bugnote:0000144	Note that the "Final Accepted Text" field contains two chunks of edits to the standard, but they are separated by an informative sentence ("Also Nick will let the C committee know about the issue") that should not be placed in the standard.

nick 2010-11-05 14:35 manager bugnote:0000607	WG14 has added "Implementations shall behave as if they read the memory byte by byte from the beginning of the bytes pointed to by s and stop at the first occurrence of c (if it is found in the initial n bytes)." to the description of memchr in the C1x draft.

Date Modified	Username	Field	Change
2009-06-30 19:39	msbrown	New Issue
2009-06-30 19:39	msbrown	Status	New => Under Review
2009-06-30 19:39	msbrown	Assigned To	=> ajosey
2009-06-30 19:39	msbrown	Name	=> Mark Brown
2009-06-30 19:39	msbrown	Organization	=> IBM
2009-06-30 19:39	msbrown	Section	=> memchr
2009-06-30 19:39	msbrown	Page Number	=> 1284
2009-06-30 19:39	msbrown	Line Number	=> 42163
2009-06-30 19:39	msbrown	Note Added: 0000143
2009-06-30 19:39	msbrown	Status	Under Review => Resolved
2009-06-30 19:39	msbrown	Resolution	Open => Accepted As Marked
2009-06-30 19:40	msbrown	Final Accepted Text	=> 0000110:0000143
2009-06-30 20:03	eblake	Note Added: 0000144
2009-07-01 16:44	~~Don Cragun~~	Name	Mark Brown => Eric Blake
2009-07-01 16:44	~~Don Cragun~~	Organization	IBM =>
2009-07-01 16:44	~~Don Cragun~~	Reporter	msbrown => eblake
2009-08-06 16:24	nick	Tag Attached: c99
2010-08-27 13:18	ajosey	Tag Attached: tc1-2008
2010-11-05 14:35	nick	Note Added: 0000607
2013-04-16 13:06	ajosey	Status	Resolved => Closed
