0001020: snprintf: the description of the n argument conflicts with ISO C

ID	Project	Category	View Status	Date Submitted	Last Update

0001020	1003.1(2013)/Issue7+TC1	System Interfaces	public	2016-01-05 16:34	2024-06-11 08:58

Reporter	ch3root	Assigned To
Priority	normal	Severity	Objection	Type	Error
Status	Closed	Resolution	Accepted As Marked

Name	Alexander Cherepanov
Organization
User Reference
Section	fprintf() description of snprintf()
Page Number	900
Line Number	30166-30167
Interp Status	---
Final Accepted Text	0001020:0003482


Summary	0001020: snprintf: the description of the n argument conflicts with ISO C
Description	POSIX reads: "The snprintf() function shall be equivalent to sprintf(), with the addition of the n argument which states the size of the buffer referred to by s." This describes the n argument as the size of the buffer. But ISO C (both C99 and C11) doesn't have this restriction and describes n generally as a limit to the number of output characters written. This is an important distinction when the full output is to be written into the destination buffer and the limit on the number of characters is not to be triggered. For example, both snprintf calls below are wrong according to POSIX but are fine according to ISO C: char s[10]; snprintf(s, 20, "abc"); snprintf(s, SIZE_MAX, "%s", "abc"); The distinction is important for implementers because it, e.g., dictates how pointer arithmetic can be used. If n is the size of the buffer then s+n is valid to calculate and could be used as a limit in loops. But if n is not required to be an actual size then s+n is invalid according to ISO C and could wrap in practice. (This is evident for n=SIZE_MAX but it's true even for small values.) The distinction is important for users because the more limited (POSIX) definition, e.g., makes it impossible to uniformly handle cases where the size of the buffer is known and cases where an overflow is impossible for some other reason.
Desired Action	Replace "the size of the buffer referred to by s" with "the maximum number of written output characters" or something like that.
Tags	tc3-2008

jsm28 2016-01-05 17:49 reporter bugnote:0002999	http://austingroupbugs.net/view.php?id=761 was previously rejected. While I think the rejection was clearly on spurious grounds - that these cases are clearly defined by ISO C and the POSIX specification conflicts - it might be appropriate to raise an ISO C DR to get a WG14 view and make this a liaison issue.

shware_systems 2016-01-05 20:07 reporter bugnote:0003000	Both versions are essentially equivalent. The POSIX version simply makes clearer that if you use an n larger than the size allocated to s you run the risk of overwriting other allocations, but the implementation isn't required to do bounds checking to prevent overruns. The C version, while plausible in theory as an isolated operation for an arbitrary n value, glosses this possibility over more and both are reliable in practice only when s has the highest allocated base address of the system and the system has at least n writeable bytes installed above that address, for an arbitrary n <= SIZE_MAX. This lack of reliability has been noted and is addressed in C11 by the snprintf_s() interface, in Annex K, which requires the implementation to use min(strlen(result), sizeof s[], n) as the number of bytes to copy. sprintf_s() adds (strlen(result) > sizeof s[] && n >= sizeof s[]) as being a constraint violation and only a \0 is stored. Both implicitly leave how sizeof s[] is determined when s refers to a dynamic allocation as implementation-defined, and unspecified what to do when s=malloc(0) or otherwise points to a non-writable address.

ch3root 2016-01-05 22:35 reporter bugnote:0003006	In reply to 0003000: Not sure what you are trying to say. The situation seems to be quite simple to me. Sure, an implementation is not required to do bounds checking but in POSIX version it can use s+n in the code while in ISO C version it cannot. And from user's POV: ISO C version can be called as `char s[10]; snprintf(s, SIZE_MAX, "%s", "abc");` while POSIX version cannot. Reliability is a somewhat separate question.

shware_systems 2016-01-07 05:02 reporter bugnote:0003014	I believe the language of the C standard also allows s+bytes_to_copy to be used as a loop termination test value. As neither standard requires bounds checking the examples given are valid for both standards also, afaik. The point is more checking for possible errors or other terminating circumstance can't be deferred until the loop reaches that value; both standards expect the implementation to behave as if the checks happen for each char stored. Changing the parameter description to a size does not change this for POSIX. As POSIX supports memory locking and threading it has more tests for each loop iteration than is needed for C99, actually, that all have priority over the s+n check. For both standards portable applications also can not rely on the implementation checking the next char to be stored isn't past the range of memory allocated to s[] as one of those tests; that is left as an extension, historically, to permit speedier and smaller unsafe implementations. In C11, the Annex K interfaces still expect the checks per loop, but because they are required to respect the size of s the implementation can use s+internal_sizeof(s[]) when this is less than s+bytes_to_copy as the check value instead and this is portable. On POSIX the implementations of the new interfaces may even be faster due to various optimizations plausible when *s is temporarily write locked.

ch3root 2016-01-07 10:37 reporter bugnote:0003016	C11, 6.5.6p8, prohibits calculating P+N where P+N doesn't point into the same array as P. In practice, P+N can wrap when P is near the end of address space and N is greater than the size of the array that P points into. If P+N wrapped checks like "P+I < P+N" will give bogus results. Bounds checking (and Annex K etc.) are not relevant here.

Vincent Lefevre 2016-01-07 11:43 reporter bugnote:0003018	This seems to be an issue for optimizing compilers. Consider the following code: char s[10]; if (foo) snprintf(s, 20, "abc"); If the snprintf here is wrong according to POSIX, then a POSIX implementation could deduce that foo is 0, yielding a behavior that could be different from an ISO C implementation (where foo being 1 is valid).

shware_systems 2016-01-07 13:51 reporter bugnote:0003019 Last edited: 2016-01-07 14:14	6.5.6p8 leaves it undefined, not prohibited, per: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." By the If clause there, the standard is saying any addition is possible, it is up to the application to ensure the result stays within the bounds of an array object declaration, not the compilers. The compiler's responsibility is to ensure that at least one char at the end of memory remains unallocated so an [n+1] address calculation won't wrap. When the application has done so then only the non-overflow guarantee applies because that last char is unallocated. It says nothing about whether p+n is required to abort a dereference attempt when p is inside the array and p+n isn't, as an lvalue or rvalue. It is up to the application, in this case the snprintf implementation, to provide separate code paths when P+N < P due to overflow; one to use when P+I > P, and the other P+I < P also.

~~user229~~ 2016-01-08 00:45 bugnote:0003020	I don't think the language in Annex K is actually intended to require implementations to use a "magically determine the real size of the array argument" operation, it's just using the same kind of sloppy language that POSIX does, saying "too big for the array pointed to by s" as if it were a synonym for "greater than n".

geoffclare 2016-11-03 15:14 manager bugnote:0003482	Change: with the addition of the n argument which states the size of the buffer referred to by s. to: with the addition of the n argument which limits the number of bytes written to the buffer referred to by s.

shware_systems 2016-11-07 09:21 reporter bugnote:0003483	Re: 3020 I agree the language is more vague on details than other parts of the standard, but I don't see it as particularly sloppy. The point of the '_s' interfaces is having versions that don't have common security holes like buffer overruns, so when it says 'fit within' or 'too big' to me the intent is the implementations >>shall do what's necessary, documented or not<< to prevent those overruns. It seems there it's left open 'how' to work that magic so imps, at their discretion, can use their existing code to get it done or new methods. Either way could reference the same data in the memory manager free(void *) has to, to get the value used with malloc() or realloc(), or calculate a usable size if block chaining/coalescing used. This applies whether the malloc() calls happen at runtime or when creating object descriptions for an object file at compile time, for auto or static declarations. On POSIX systems there's the complexity of supporting dl_sym() too, in some fashion, for externs exported by libraries, but code exists for that also. The vague aspect I see is whether the sizes are 'as declared' or 'as allocated to satisfy alignment practices', not that sizing involves any magic.

Date Modified	Username	Field	Change
2016-01-05 16:34	ch3root	New Issue
2016-01-05 16:34	ch3root	Name	=> Alexander Cherepanov
2016-01-05 16:34	ch3root	Section	=> snprintf
2016-01-05 16:34	ch3root	Page Number	=> (page or range of pages)
2016-01-05 16:34	ch3root	Line Number	=> (Line or range of lines)
2016-01-05 17:49	jsm28	Note Added: 0002999
2016-01-05 20:07	shware_systems	Note Added: 0003000
2016-01-05 20:18	eblake	Relationship added	related to 0000761
2016-01-05 22:35	ch3root	Note Added: 0003006
2016-01-07 05:02	shware_systems	Note Added: 0003014
2016-01-07 10:37	ch3root	Note Added: 0003016
2016-01-07 11:43	Vincent Lefevre	Note Added: 0003018
2016-01-07 13:51	shware_systems	Note Added: 0003019
2016-01-07 14:14	shware_systems	Note Edited: 0003019
2016-01-08 00:45	~~user229~~	Note Added: 0003020
2016-01-08 02:52	Don Cragun	Section	snprintf => fprintf() description of snprintf()
2016-01-08 02:52	Don Cragun	Page Number	(page or range of pages) => 900
2016-01-08 02:52	Don Cragun	Line Number	(Line or range of lines) => 30166-30167
2016-01-08 02:52	Don Cragun	Interp Status	=> ---
2016-11-03 15:14	geoffclare	Note Added: 0003482
2016-11-03 15:15	geoffclare	Final Accepted Text	=> 0001020:0003482
2016-11-03 15:15	geoffclare	Status	New => Resolved
2016-11-03 15:15	geoffclare	Resolution	Open => Accepted As Marked
2016-11-03 15:15	geoffclare	Tag Attached: tc3-2008
2016-11-07 09:21	shware_systems	Note Added: 0003483
2019-10-21 14:06	geoffclare	Status	Resolved => Applied
2024-06-11 08:58	agadmin	Status	Applied => Closed

View Issue Details

Relationships

Activities

Issue History