Austin Group Defect Tracker

Aardvark Mark IV


Viewing Issue Simple Details Jump to Notes ] Issue History ] Print ]
ID Category Severity Type Date Submitted Last Update
0000761 [1003.1(2013)/Issue7+TC1] System Interfaces Editorial Clarification Requested 2013-10-08 23:17 2022-07-18 15:43
Reporter dalias View Status public  
Assigned To
Priority normal Resolution Rejected  
Status Closed  
Name Rich Felker
Organization musl libc
User Reference
Section snprintf
Page Number 906
Line Number 30447
Interp Status ---
Final Accepted Text
Summary 0000761: Requirement of error for snprintf with n>INT_MAX may conflict with ISO C
Description ISO C (1999 or 2011 version) reads:

"The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream. If n is zero, nothing is written, and s may be a null pointer. Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array."

My reading of this text is that, assuming INT_MAX<SIZE_MAX, n=(size_t)INT_MAX+(size_t)1 is a valid argument with well-defined behavior already. Thus, the POSIX requirement that n>INT_MAX result in failure with EOVERFLOW seems to conflict with the ISO C requirements on the function.

Returning an error when the actual number of characters to be written would overflow INT_MAX (which POSIX also requires) seems reasonable, since ISO C requires an impossible action ("The snprintf function returns the number of characters that would have been written had n been sufficiently large", impossible since this number cannot be represented in an object of type int) for that case.
Desired Action It's not entirely clear to me what should be done here. While, on the surface, the POSIX text seems to conflict with ISO C in this matter, the ISO C specification of snprintf is sufficiently poor (e.g. the above example of requiring an impossible return value) that it may actually be the latter which needs revision.

Still, I feel like it is unfortunate, and potentially a source of application bugs, for snprintf to throw an error just because the provided buffer is "too big".
Tags C11
Attached Files

- Relationships
related to 0001020Applied 1003.1(2013)/Issue7+TC1 snprintf: the description of the n argument conflicts with ISO C 
related to 0001219Appliedajosey 1003.1(2008)/Issue 7 snprintf reequirement to fail when n > INT_MAX conflicts with C 

-  Notes
(0001864)
Don Cragun (manager)
2013-10-09 01:15

The standard already clearly marks both EOVERFLOW errors as extensions to the C Standard's requirements.

Given that an application calling snprintf() with a value of n > INT_MAX is asking the implementation to (possibly) store more than INT_MAX bytes into a buffer and then to return the number of bytes stored (which it cannot do with a return value of type int), it seems perfectly reasonable to me to do some up-front error checking and report that the given buffer size is invalid. If snprintf() were allowed to stop conversions early so it could produce an incomplete result that would allow a return value <= INT_MAX, it would be different. It looks to me like it is a programming error to pass a buffer to snprintf() that is larger than INT_MAX. Therefore, I believe the error should remain as a SHALL FAIL error. At the very least, n > INT_MAX must remain as a MAY FAIL error and return value > INT_MAX must remain as a SHALL FAIL error.
(0001865)
dalias (reporter)
2013-10-09 02:18

An extension to the C standard's requirements cannot conflict with them, though. To actually be an extension, it must merely define behavior in cases that are left undefined or implementation-defined by the C standard. This does not seem to be such a case.

As for the second paragraph of note 1864, I don't see calling snprintf with n>INT_MAX as "asking" the implementation to possibly store more characters. From the caller's perspective, n is simply the size of the buffer, and having to check that the buffer is not gratuitously large before passing it to snprintf places an unnecessary and inconvenient burden on the application: if you're working with a caller-provided buffer and size, then instead of:

result = snprintf(buf, size, ...);

you must do something like:

if (size > INT_MAX) size = INT_MAX;
result = snprintf(buf, size, ...);

It's unlikely that existing applications make this check (i.e. it's likely that they have bugs caused by this issue), so removing this error case from the standard would possibly improve the behavior of existing applications under rare corner cases.

(Also note that, per the report in glibc bug #14771, glibc is presently not generating this error. I have not however read the relevant glibc source or conducted tests to confirm that this bug report is correct.)
(0001866)
wlerch (reporter)
2013-10-09 02:29

Consider

  snprintf( buf, SIZE_MAX, "foo" );

Is this not asking the implementation to store precisely four bytes in buf, and return 3?
(0001867)
shware_systems (reporter)
2013-10-09 04:31
edited on: 2013-10-09 04:49

I agree it is sufficiently poor that it should be referred to the C
standard.

From the C99 standard it has:
7.19.6.5 The snprintf function
3 The snprintf function returns the number of characters that would
have been written had n been sufficiently large, not counting the
terminating null character, or a negative value if an encoding error
occurred. Thus, the null-terminated output has been completely written
if and only if the returned value is nonnegative and less than n.

The POSIX text uses the same wording for successful return and
similar for the n equals 0 case text (line 30426 and 30431). While the
intent is fairly obvious it is not specific enough given the type mismatch
of the n argument and the return value.

This should be more like:
3 If an encoding error occurs the snprintf function shall return a
negative number indicating this. What characters, if any, are written
to the destination buffer is undefined.

4 Otherwise, upon successful completion the snprintf function returns
the number of characters, less than n, that have been placed in the
destination buffer.

5 If n is 0 or more than n-1 characters are required the function shall
place the leading n characters in the buffer and return the number
greater than n of the buffer size needed for successful completion.

6 If this positive return value is larger than INT_MAX, a negative
number shall be returned indicating the return value is not
representable.

7 Thus, the null-terminated output is known to be completely written if
and only if the returned value is nonnegative and less than n. When the
return value is not representable it is not guaranteed that a
terminating NUL character has been written to the buffer.

This allows n to be anything up to SIZE_MAX. If only 1 or 2 characters
are actually written this is comfortably below INT_MAX and the function
should succeed, as note 1866 suggests, so an n greater than INT_MAX
on input I'd consider as a 'may fail' indicator than 'shall fail' automatically.
If the function attempts to write to non-existent memory that's a SIGBUS
or memory protection event; not the C standards problem. It's when n is
greater than INT_MAX and the function tries to write past INT_MAX-1
characters that is the problem and then for POSIX it's a 'shall fail' with
EOVERFLOW. Separate text indicating this is a source for EOVERFLOW
I'd still keep under Error Codes.

The C standard could change the type of n to int also.

(0001868)
shware_systems (reporter)
2013-10-09 06:28

Re: #1864
It's a reasonable check on the surface, yes, but with size_t long long because of a 64-bit address space, it is possible to write more than 2 gigabytes in a single call with int as a long, with enough RAM or page file space. Because of that the check should be done by the application, it looks, not the interface. When n is larger than INT_MAX I think it would be more reasonable to use that as a trigger to do write probes as part of the interface, even if it slows things down, and return an error because of that instead of risk a SIGBUS event, but that would be a quality of implementation extension.

The prototype implies size_t shall use no more bytes than int, and should probably have a separate size_t *count argument to hold the result because this isn't guaranteed. The int return value could be encoding error <0, 0 for EOVERFLOW, and >0 if count does fit in an int anyways. I expect the return was left as int to be compatible with the return of the other interfaces, and size_t might be a typedef of an unsigned integer. Perhaps something like, as a CX extension like dprintf,
"int scprintf(char *restrict s, size_t n, size_t *restrict count, const char *restrict format, ...);"
would be useful, even if invention, to address this.
(0001930)
Don Cragun (manager)
2013-10-17 16:36
edited on: 2013-10-17 16:50

After a lengthy discussion during the October 17, 2013 conference call, we decided that we agree with the sentiment stated in Note: 0001864.

The C standard does not specify the behavior when n > INT_MAX, so the current specified behavior does not conflict with the C standard.

Although it is possible that snprintf() could be given a huge buffer and use sizeof(buffer) to set the value of n, we believe that (other than for demonstration purposes) this is unlikely to occur. We do not believe the standard needs to be changed for this corner case.

Therefore, this bug is rejected.

(0002615)
dalias (reporter)
2015-04-09 19:32

Could this possibly be reopened? I've been shown a new example where the issue arises in practice, and it's a very ugly corner case. Consider an implementation of asprintf (or a similar function) as:

int vasprintf(char **p, const char *fmt, va_list ap)
{
    va_list vp;
    va_copy(vp, ap);
    int l = vsnprintf(0, 0, fmt, vp);
    va_end(vp);
    if (l < 0) return -1;
    char *s = malloc((size_t)l+1);
    if (!s) return -1;
    return vsnprintf(*p = s, (size_t)l+1, fmt, ap);
}

One possibility is that vsnprintf returns exactly INT_MAX, in which case the output buffer has to be INT_MAX+1 bytes in length, triggering a spurious error from the second vsnprintf call despite the fact that the result fits and the return value for success (INT_MAX) is representable in type int.

It could be argued that the above code should be checking the return value of the second vsnprintf call, but I suspect a large volume of code is assuming that the second call will succeed if the first did, and very bad things could happen (silently using uninitialized memory instead of the result) when the second call gratuitously fails.

I also disagree with the claim in note 0001930 that ISO C does not specify a behavior fr n>INT_MAX. The standard as written specifies the behavior for any value of n representable in size_t as long as the pointed-to buffer object is at least n bytes in size. I can see no justification for a claim that the specification does not apply to values of n larger than INT_MAX.
(0002616)
Don Cragun (manager)
2015-04-09 20:18
edited on: 2015-04-09 20:21

In reference to Note: 0002615: Can you please explain what ISO C specifies as the return value from the call:
snprintf(b,(size_t)INT_MAX+10,"Test: %*.*s\n",INT_MAX-1,INT_MAX-1,"");

which by my count should be INT_MAX + 6 returned in an object of type int.

(0002617)
wlerch (reporter)
2015-04-09 20:39

Don: there's no difference between what ISO C says about the return value your call and from this:

snprintf(NULL, 0, "Test: %*.*s\n",INT_MAX-1,INT_MAX-1,"");

In both cases the requirement is impossible, demonstrating a defect in ISO C. The defect is independent of whether n is greater than INT_MAX or not.

On the other hand, ISO C is perfectly clear about the call from my earlier example:

snprintf( buf, SIZE_MAX, "foo" );

Assuming buf has room for at least four characters, the call is required to return 3. But POSIX wants it to fail. That's a conflict.
(0002618)
dalias (reporter)
2015-04-10 00:59

Don, I agree that ISO C does not adequately specify the behavior when the return value would overflow int. Thus I have no objection to the requirement (added by POSIX) that snprintf report an error in this case. What I object to, and consider a conflict with ISO C, is the requirement to fail and report an error in the case where the argument n is large but the return value does not overflow. I believe the language in ISO C specifies what happens in this case, and the specified result is success (non-negative return value).
(0003010)
ch3root (reporter)
2016-01-06 15:16

An additional data point. I've looked through mand pages in several BSDs:

OpenBSD[1] has no mentions of EOVERFLOW;
FreeBSD[2] documents EOVERFLOW for the case size > INT_MAX + 1;
NetBSD[3] documents EOVERFLOW for the case size > INT_MAX.

AFAICT their implementations agree with the man pages. That is, OpenBSD and FreeBSD don't conform with the POSIX requirement discussed here. And NetBSD seems to conform due to misunderstanding of the FreeBSD code (according to the commit message in [4]) :-)

[1] http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/snprintf.3 [^]
[2] https://www.freebsd.org/cgi/man.cgi?query=snprintf&sektion=3 [^]
[3] http://netbsd.gw.com/cgi-bin/man-cgi?snprintf++NetBSD-current [^]
[4] http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/stdio/vsnprintf.c?rev=1.28&content-type=text/x-cvsweb-markup [^]
(0003013)
ch3root (reporter)
2016-01-06 20:36

There are 3 intermixed issues here.

1. The value of the limit. The details depend on the goal of the n>INT_MAX check. Is there a way to find out exactly why this check was introduced in POSIX?

A plausible reason is to catch early those cases where snprintf could potentially be required to return a number which it cannot return. Then there is an off-by-one error here: n includes the terminating null but the return value doesn't. Thus, the intended check should be n>(size_t)INT_MAX+1. This is what FreeBSD does.

Another possible reason is to catch cases where a user passes a negative int to snprintf. This indeed requires the check to be "n>INT_MAX" on platforms where int and size_t have the same size. But then you run into a problem (described in http://austingroupbugs.net/view.php?id=761#c2615) [^] that you cannot pass result from one snprintf to another snprintf.

2. The idea of early exits itself, i.e. the case when n is big but the the amount of characters actually written is small. Like this:

  char s[10];
  snprintf(s, SIZE_MAX, "abc");


It's fine in ISO C but not in POSIX. Perhaps the conflict in the description of the n argument that I recently reported in http://austingroupbugs.net/view.php?id=1020 [^] contributed here. According to the current text of POSIX the example above is invalid because SIZE_MAX is not the size of the buffer. But there is no such a requirement in ISO C. I can imagine that the idea behind the n>INT_MAX check was that, apart from catching accidentally bogus values of n, the check will only affect callers with very large buffers which are rare. (And this is stated in http://austingroupbugs.net/view.php?id=761#c1930 [^] .) Even in this case the restriction would conflict with ISO C but given that n is not required to be the actual size of the buffer the scope of the problem is not limited to large buffers.

Also, if a user is not supposed to pass values greater than INT_MAX why the argument is of type size_t? There could be various reasons for this but, at first sight, it hints that values greater than INT_MAX are welcome here.

3. Big return values. What should snprintf do when the value to be returned is greater than INT_MAX? This is not specific to snprintf, plain printf has the same problem. And POSIX has _another_ clause dealing with this case, which is applicable to all forms of fprintf. AIUI this bug report is about a separate clause, specific to snprintf. Thus, all arguments about impossible-to-return results of snprintf are not relevant to this bug report.
(0003021)
ch3root (reporter)
2016-01-10 19:46

1. Further research has shown that the behavior when the return value would overflow int was clarified by WG14 in C99 by adding it into the list of undefined behaviors in Annex J. It was updated in C11 to the following text:

"J.2 Undefined behavior
The behavior is undefined in the following circumstances:
[skip]
— The number of characters or wide characters transmitted by a formatted output function (or written to an array, or that would have been written to an array) is greater than INT_MAX (7.21.6.1, 7.29.2.1)."

Please note that this description does _not_ mention the size argument of snprintf or the size of the buffer.

2. The same problem affects the swprintf function. (With a twist -- see http://austingroupbugs.net/view.php?id=1021 [^] .)

- Issue History
Date Modified Username Field Change
2013-10-08 23:17 dalias New Issue
2013-10-08 23:17 dalias Name => Rich Felker
2013-10-08 23:17 dalias Organization => musl libc
2013-10-08 23:17 dalias Section => snprintf
2013-10-08 23:17 dalias Page Number => unknown
2013-10-08 23:17 dalias Line Number => unknown
2013-10-09 01:15 Don Cragun Page Number unknown => 906
2013-10-09 01:15 Don Cragun Line Number unknown => 30447
2013-10-09 01:15 Don Cragun Interp Status => ---
2013-10-09 01:15 Don Cragun Note Added: 0001864
2013-10-09 02:18 dalias Note Added: 0001865
2013-10-09 02:29 wlerch Note Added: 0001866
2013-10-09 04:31 shware_systems Note Added: 0001867
2013-10-09 04:49 shware_systems Note Edited: 0001867
2013-10-09 06:28 shware_systems Note Added: 0001868
2013-10-09 06:54 Florian Weimer Issue Monitored: Florian Weimer
2013-10-17 16:01 nick Tag Attached: C11
2013-10-17 16:36 Don Cragun Note Added: 0001930
2013-10-17 16:36 Don Cragun Status New => Resolved
2013-10-17 16:36 Don Cragun Resolution Open => Rejected
2013-10-17 16:38 Don Cragun Note Edited: 0001930
2013-10-17 16:50 Don Cragun Note Edited: 0001930
2015-04-09 19:32 dalias Note Added: 0002615
2015-04-09 20:18 Don Cragun Note Added: 0002616
2015-04-09 20:19 Don Cragun Note Edited: 0002616
2015-04-09 20:20 Don Cragun Note Edited: 0002616
2015-04-09 20:21 Don Cragun Note Edited: 0002616
2015-04-09 20:39 wlerch Note Added: 0002617
2015-04-10 00:59 dalias Note Added: 0002618
2015-05-07 08:23 dancol Issue Monitored: dancol
2016-01-05 18:17 ch3root Issue Monitored: ch3root
2016-01-05 20:18 eblake Relationship added related to 0001020
2016-01-06 15:16 ch3root Note Added: 0003010
2016-01-06 20:36 ch3root Note Added: 0003013
2016-01-10 19:46 ch3root Note Added: 0003021
2018-12-12 21:06 eblake Relationship added related to 0001219
2022-07-18 15:43 Don Cragun Status Resolved => Closed


Mantis 1.1.6[^]
Copyright © 2000 - 2008 Mantis Group
Powered by Mantis Bugtracker