Austin Group Defect Tracker



ID 0001019
Category [1003.1(2013)/Issue7+TC1] System Interfaces
Severity Editorial
Type Clarification Requested
Date Submitted 2016-01-05 14:46
Last Update 2019-10-21 14:04
Reporter ch3root View Status public  
Assigned To
Priority normal Resolution Accepted As Marked  
Status Applied  
Name Alexander Cherepanov
Organization
User Reference
Section strndup
Page Number 2012
Line Number 64214-64220
Interp Status ---
Final Accepted Text Note: 0003471
Summary 0001019: strndup shouldn't require source array to be null-terminated
Description The description of the strndup function in POSIX talks about the array argument s as a C string. E.g., it's assumed to have length: "If the length of s is larger than size, only size bytes shall be duplicated."
This poses two problems:

1) it's impossible to use strndup to duplicate a non-null-terminated array of a given size;

2) it gives implementations the freedom to examine more than size bytes of the array pointed to by s, which could complicate matters in a multithreaded program or when restrict-qualified pointers are involved.

A quite natural way to implement strndup is by using strnlen. Then those problems don't arise. This is the case for at least glibc[1], musl[2] and OpenBSD[3]/FreeBSD[4]. NetBSD[5] is similar.
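As an illustration of the strnlen-based approach, here is a minimal sketch. It is not copied from any of the libraries cited; the name sketch_strndup is hypothetical and chosen only to avoid clashing with the libc strndup().

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with the libc strndup(). */
char *sketch_strndup(const char *s, size_t size)
{
    /* strnlen() stops at the first null byte or after size bytes,
       whichever comes first, so s need not be null-terminated. */
    size_t len = strnlen(s, size);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';   /* the copy is always a proper C string */
    return copy;
}
```

Because the length is computed with strnlen(), this sketch never reads beyond the first size bytes of s.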

Such a treatment of its argument causes strndup to differ from other strn* functions, which don't have this problem. Take for example the strncat function (which also copies one of its arguments): "The strncat() function shall append not more than n bytes (a NUL character and bytes that follow it are not appended) from the array pointed to by s2 to the end of the string pointed to by s1." While the destination s1 is described as a string, the source s2 is only described as an array. And there is no mention of the length of s2. Rightly so.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strndup.c;h=51a1969146df7b911e761706f7dece6dc149c7dd;hb=HEAD
[2] http://git.musl-libc.org/cgit/musl/tree/src/string/strndup.c
[3] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strndup.c?rev=1.2&content-type=text/x-cvsweb-markup
[4] https://svnweb.freebsd.org/base/head/lib/libc/string/strndup.c?revision=287181&view=markup
[5] http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/string/strndup.c?rev=1.4&content-type=text/x-cvsweb-markup&only_with_tag=MAIN
Desired Action Formulate the description of strndup in terms of arrays instead of strings to make it similar to the other strn* functions. Make it clear that the strndup function shall never examine more than size bytes of the array pointed to by s (like strnlen).
Tags tc3-2008
Attached Files

- Relationships
has duplicate 0001397 Closed ajosey 1003.1(2008)/Issue 7 strndup incorrectly implies argument must be a string

-  Notes
(0003002)
Don Cragun (manager)
2016-01-05 20:14

Am I missing something? The description of the problem seems to apply to strdup(), not strndup().
(0003003)
ch3root (reporter)
2016-01-05 20:28

Not sure what you mean. strdup() doesn't have an additional argument for the size, so it has to find the end of the string in the traditional way, by searching for NUL. It cannot work with a string which is not null-terminated.

OTOH strndup() is additionally provided with the size. So it should only examine the first 'size' bytes, and if NUL is not found there, strndup() should stop right away and not go on searching for a NUL which it doesn't need.

Hope this helps.
(0003004)
Don Cragun (manager)
2016-01-05 21:05

The description of strndup() is:
The strndup( ) function shall be equivalent to the strdup( ) function, duplicating the provided s in
a new block of memory allocated as if by using malloc( ), with the exception being that strndup( )
copies at most size plus one bytes into the newly allocated memory, terminating the new string
with a NUL character. If the length of s is larger than size, only size bytes shall be duplicated. If
size is larger than the length of s, all bytes in s shall be copied into the new memory buffer,
including the terminating NUL character. The newly created string shall always be properly
terminated.

The text marked in bold in that quote makes it clear to me that strndup() can't use strcpy(), if that is what you are implying. To me, the description says it copies to a NUL byte or size bytes, whichever comes first.
(0003005)
ch3root (reporter)
2016-01-05 22:10

Right. The question is how it can be implemented in strndup(). The straightforward way is to compute the number of characters to copy:

  min(strlen(s), size)

(where min has the evident definition).

Does POSIX permit such an implementation? I argue that yes, POSIX permits it. It's almost a literal translation from the description (the text marked in bold in your quote and the next sentence).

But this is bad because such an implementation cannot be used with the following code:

  char s[10];
  memset(s, 'A', sizeof s);
  char *p = strndup(s, sizeof s);

Applying strlen to s is undefined in this case because it's not null-terminated.
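To make the concern concrete, here is a sketch of the strlen-based reading of the old wording. The name naive_strndup is hypothetical and this is not taken from any real library; it only spells out min(strlen(s), size) as C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; illustrates the strlen-based reading only. */
char *naive_strndup(const char *s, size_t size)
{
    /* strlen() scans until it finds a null byte, so this line has
       undefined behavior when s is not null-terminated, however
       small size is. */
    size_t len = strlen(s);
    if (len > size)
        len = size;             /* min(strlen(s), size) */

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

For null-terminated input this returns the expected result, e.g. naive_strndup("hello", 3) yields "hel". Passing the unterminated array s from the example above would be undefined, because strlen() is applied before size is ever consulted.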

Real-life implementations are more elaborate and work fine with this example. It would be nice for POSIX to mandate such implementations. This would align strndup with the other strn* functions.
(0003007)
shware_systems (reporter)
2016-01-05 23:31

Because strlen() lacks the early-exit provision of strnlen(), implementations are precluded from using it in strndup(). Whether an implementation calls strnlen() directly or inlines the equivalent logic for speed is up to the implementation. Whether it attempts to allocate size+1 bytes or, when that is smaller, strnlen()+1 bytes is likewise left to the implementation. IIRC, the corner case in which an allocation of size bytes would succeed but one of size+1 bytes (to hold the added \0) fails with ENOMEM was discounted as too unlikely to warrant complicating the interface further. The net effect is that, if the duplication succeeds, both strnlen() and strlen() are guaranteed to be usable on the copy.
(0003008)
ch3root (reporter)
2016-01-06 09:47

To make it clear: I talked about applying strlen to s inside a strndup implementation, not in user code.

In reply to ~3007: why do you say that the use of strlen() by implementations in strndup() is precluded? Is it against the restrictions posed by POSIX? No. Is it inefficient in this case? Probably, but that is something for an implementer to decide.

Or put it this way: users should have a solid base to rely upon. They shouldn't have to think "Well, this is not guaranteed by POSIX, but implementers would be crazy not to provide it, so yes, I will use it in my program (and pray)".

Roughly speaking, either POSIX guarantees that strndup will not call strlen or it doesn't.
(0003009)
kre (reporter)
2016-01-06 12:37

Don, shware_systems, you are reading the text like an implementor,
asking whether the text says anything that prevents you from implementing
strndup() in the way you know it should be implemented. And you find it
does not, so it all looks OK.

ch3root is however reading the text like a lawyer, looking to see if there
is anything there which would allow some weasel to do an obviously bogus
implementation, and yet claim to be conformant.

In the text emphasised by Don in note 3004, the standard says "If the length of s"
which implies that the length of s can be computed, or at least a reasonable
argument can be made that it implies that, and so allows an implementation to
go ahead and compute that (or attempt to) - perhaps crashing with a mem fault
inside the library routine. We all know that would be insane, and that no-one
with even 1/10th of a brain would do that, but we're talking about what some
weasel might do, not a rational sane implementor.

All ch3root is asking for is to alter the way the text is written, to make
it clear that attempting to calculate the length of s is not permitted, by
having it say something like

     ... copies up to a nul byte, but at most size bytes, to new memory
     allocated as if by malloc() ...

This is a simple change, and is one of the few truly editorial changes I
have seen here among those classified this way. I see no reason anyone would
object to this.

kre
(0003012)
shware_systems (reporter)
2016-01-06 15:35

So did I. For most platforms, the requirements implicitly guarantee that strndup() will not call strlen(). This is a case where it goes without saying, and the standard tries to avoid stating that particular interfaces shall or shall not be used directly in implementing other interfaces. Instead, normative sections use wording such as 'the effects shall be the same as xxx()', 'the result may be used like the result of xxx()', or 'the implementation shall behave as if no other calls to xxx() are used'.

In this case an explicit n is specified, and that is not just the size of the allocation to attempt; it is also an assertion that reading past that length in the source string may fail and so cannot be attempted safely. An implementation that occasionally fails where the standard says it should always succeed is not considered conforming. By XBD 2 that is extension behavior at best, as I understand it. As a corollary, any use of an interface known to be a possible source of occasional failures in circumstances like this is precluded, so it is against the restrictions. Whether the conformance test suite properly checks for this particular possibility, I couldn't say.

That being said, reexpressing the sentence Don highlighted as:
"If a null byte is not encountered in examining the first size bytes of the source data the interface shall then copy the bytes examined plus a terminating null byte to the allocated memory block."

makes it more explicit that it is the implementation's responsibility to stop examining bytes when a terminating condition exists. The null byte gets added so the copy properly represents a C string in subsequent uses, whether the source was also a string or discovered to be a char[size] array.
(0003015)
ch3root (reporter)
2016-01-07 10:25

POSIX clearly talks about "the length of s" (see the sentence Don highlighted). Hence implementations are free to compute "the length of s", and users are prohibited from calling strndup with an s for which "the length of s" is undefined. Everything is very explicit right now.

It would be possible to argue about something implicit if there were some specific language to that effect, like "but s is not required to be null-terminated". If there is no such language I don't see any grounds for implicit conclusions.
(0003471)
geoffclare (manager)
2016-10-27 16:15
edited on: 2016-10-27 16:16

Change:
The strndup() function shall be equivalent to the strdup() function, duplicating the provided s in a new block of memory allocated as if by using malloc(), with the exception being that strndup() copies at most size plus one bytes into the newly allocated memory, terminating the new string with a NUL character. If the length of s is larger than size, only size bytes shall be duplicated. If size is larger than the length of s, all bytes in s shall be copied into the new memory buffer, including the terminating NUL character.
to:
The strndup() function shall be equivalent to the strdup() function, duplicating the provided s in a new block of memory allocated as if by using malloc(), with the exception being that strndup() copies at most size bytes from the array s into the newly allocated memory, terminating the new string with a null byte. If s contains a null terminator within the first size bytes, all bytes in s up to and including the null terminator shall be copied into the new memory buffer. The strndup() function shall not examine more than size bytes of the array pointed to by s.
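The observable part of the revised wording can be exercised from user code. The sketch below is illustrative only: dup_fn, bounded_dup and matches_revised_text are hypothetical names, and bounded_dup is a stand-in built on ISO C memchr() (which, as of C11, behaves as if it stops at the first match), not the text of any real implementation. Such a check cannot detect whether a function reads past size bytes; it only compares results.

```c
#include <stdlib.h>
#include <string.h>

/* Type of a strndup()-like function under test (hypothetical helper type). */
typedef char *(*dup_fn)(const char *, size_t);

/* Stand-in for the revised wording: memchr() searches at most size bytes. */
char *bounded_dup(const char *s, size_t size)
{
    const char *nul = memchr(s, '\0', size);
    size_t len = nul ? (size_t)(nul - s) : size;

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}

/* Returns 1 if f gives the results the revised text calls for on two
   inputs, 0 otherwise. It cannot detect reads past size bytes. */
int matches_revised_text(dup_fn f)
{
    char unterminated[10];
    memset(unterminated, 'A', sizeof unterminated);

    /* Case 1: no null byte within the first size bytes. */
    char *p = f(unterminated, sizeof unterminated);
    int ok = p != NULL
             && strlen(p) == sizeof unterminated
             && memcmp(p, unterminated, sizeof unterminated) == 0;
    free(p);

    /* Case 2: a null terminator within the first size bytes. */
    char *q = f("abc", 8);
    ok = ok && q != NULL && strcmp(q, "abc") == 0;
    free(q);

    return ok;
}
```

A caller could pass the platform's own strndup in place of bounded_dup, assuming it is declared with a compatible prototype.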



- Issue History
Date Modified Username Field Change
2016-01-05 14:46 ch3root New Issue
2016-01-05 14:46 ch3root Name => Alexander Cherepanov
2016-01-05 14:46 ch3root Section => strndup
2016-01-05 14:46 ch3root Page Number => unknown
2016-01-05 14:46 ch3root Line Number => unknown
2016-01-05 20:06 Don Cragun Page Number unknown => 2012
2016-01-05 20:06 Don Cragun Line Number unknown => 64214-64220
2016-01-05 20:06 Don Cragun Interp Status => ---
2016-01-05 20:14 Don Cragun Note Added: 0003002
2016-01-05 20:28 ch3root Note Added: 0003003
2016-01-05 21:05 Don Cragun Note Added: 0003004
2016-01-05 22:10 ch3root Note Added: 0003005
2016-01-05 23:31 shware_systems Note Added: 0003007
2016-01-06 09:47 ch3root Note Added: 0003008
2016-01-06 12:37 kre Note Added: 0003009
2016-01-06 15:35 shware_systems Note Added: 0003012
2016-01-07 10:25 ch3root Note Added: 0003015
2016-01-07 20:40 Florian Weimer Issue Monitored: Florian Weimer
2016-10-27 16:15 geoffclare Note Added: 0003471
2016-10-27 16:16 geoffclare Note Edited: 0003471
2016-10-27 16:17 geoffclare Final Accepted Text => Note: 0003471
2016-10-27 16:17 geoffclare Status New => Resolved
2016-10-27 16:17 geoffclare Resolution Open => Accepted As Marked
2016-10-27 16:17 geoffclare Tag Attached: tc3-2008
2019-10-21 14:04 geoffclare Status Resolved => Applied
2020-09-01 08:31 geoffclare Relationship added has duplicate 0001397

