Austin Group Defect Tracker

Aardvark Mark IV

ID: 0000986
Category: [1003.1(2008)/Issue 7] Base Definitions and Headers
Severity: Editorial
Type: Clarification Requested
Date Submitted: 2015-09-23 08:35
Last Update: 2016-07-07 23:18
Reporter: EdSchouten
View Status: public
Assigned To: ajosey
Priority: normal
Resolution: Open
Status: Under Review
Name: Ed Schouten
Organization: Nuxi
User Reference:
Section: <string.h> and <wchar.h>
Page Number: n/a
Line Number: n/a
Interp Status: ---
Final Accepted Text:
Summary: 0000986: Would it be worth investigating adding strlcpy(), strlcat(), wcslcpy() and wcslcat()?
Description: Back in 1998, OpenBSD 2.4 added the functions strlcpy() and strlcat(). These functions eventually made it into a whole bunch of other operating systems. By now, at least OpenBSD, FreeBSD, NetBSD, Solaris, Mac OS X, and QNX provide these functions. Implementations are also present in popular Open Source projects like SDL, GLib, ffmpeg and rsync. The Linux kernel also uses them internally.

These functions have already been in use for the last 17 years and people seem to like them, which is why I'd like to propose that we add them in the next version of POSIX.

It is important to mention that these functions are not without criticism. One important concern is that they may silently truncate the string if the output buffer is too small. My response to this would be the following:

1. Checking the return value allows you to detect truncation.

2. This is no different from other existing functions such as snprintf(), which seems to be preferred over sprintf() nowadays.

3. Even though string truncation is bad and could potentially lead to security vulnerabilities, it is important to take into account what the impact would have been if strcpy() and strcat() were used instead. At least strlcpy() and strlcat() would prevent the buffer overflow from happening, making it less likely that the integrity of the control flow of the program is affected.
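Point 1 above can be sketched as follows. This is only an illustration: sketch_strlcpy() is a hypothetical stand-in written to follow the strlcpy() semantics described here (always terminate, return the length of the source), and copy_checked() is an assumed helper name, not part of any existing implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy() semantics: copies at most size - 1
 * bytes, always terminates dst when size > 0, and returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || size == 0);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Assumed helper: returns 0 on a complete copy, -1 if src was truncated. */
static int copy_checked(char *dst, size_t size, const char *src)
{
    return sketch_strlcpy(dst, src, size) >= size ? -1 : 0;
}
```

With an 8-byte buffer, copy_checked() reports success for "short" and failure for "much too long". In the failing case the buffer still holds a terminated prefix rather than overflowing.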
Desired Action: Please add them. :-)
Tags: No tags attached.
Attached Files:

- Relationships

-  Notes
eblake (manager)
2015-09-23 14:00

Ulrich Drepper is no longer as active in developing either POSIX or glibc, but his scathing analysis of these interfaces is still true: the only correct way to use them requires MORE boilerplate than what you could do by using other existing interfaces.

Now, strlcat does effectively do this check, if the programmer remembers to check the result - so you can use it safely:

if (strlcat(dest, source, dest_bufferlen) >= dest_bufferlen)
    /* Bug out */

Ulrich's point is that since you have to have destlen and sourcelen around (or recalculate them, which is what strlcat effectively does), you might as well just use the more efficient memcpy anyway:

if (destlen + sourcelen > dest_maxlen)
    goto error_out;
memcpy(dest + destlen, source, sourcelen + 1);
destlen += sourcelen;

As such, I'm not personally in favor of standardizing them, but am happy to let implementations continue to provide them as extensions.
EdSchouten (updater)
2015-09-23 16:19

Hi Eric,

Thanks for the quick response!

I agree. strlcat() does suffer from the issue that it allows people to write less efficient code more easily. Once you start to call strlcat() in a loop, you could create a piece of code that runs in quadratic time where linear time is possible.
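To illustrate the linear-time alternative: calling strlcat() in a loop rescans the destination on every call, while keeping a running length avoids that. The sketch below is an assumption of how such a loop could look; join_linear() and its parameters are hypothetical names, not an existing interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenates count strings into dst, whose capacity
 * is size bytes including the terminator. Returns the resulting length, or
 * (size_t)-1 if the result would not fit; dst then holds the parts that
 * did fit. Each source byte is visited a constant number of times. */
static size_t join_linear(char *dst, size_t size,
                          const char *const *parts, size_t count)
{
    size_t used = 0;    /* current length of dst, always < size */

    assert(dst != NULL && size > 0);
    dst[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);

        if (len >= size - used)     /* no room for len bytes plus '\0' */
            return (size_t)-1;
        memcpy(dst + used, parts[i], len + 1);
        used += len;
    }
    return used;
}
```

The running length plays the role of destlen in the memcpy() snippet from the previous note, so the destination is never rescanned.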

That said, I think it's important that we look at this problem from a different point of view. Let's put the ways you can copy/concatenate strings on an axis from 'bad' to 'good', based on safety, efficiency, etc.:

strcpy()/strcat() <-----> strlcpy()/strlcat() <-----> strlen()+memcpy()

Now the question becomes: if we don't think it's a good idea to add strlcpy()/strlcat() because strlen()+memcpy() are more efficient, why should we still standardize strcpy()/strcat()? As far as I know, POSIX is not a superset of the C standard[1], so there is nothing that prevents us from deprecating these functions. But my guess is that people would get quite upset if there's no standard function to copy strings.

What I'm trying to say is that we should think of strlcpy()/strlcat() as an incremental improvement over strcpy()/strcat(). They have never even tried to address the running time problem of C string concatenation. They have been designed so that existing code can be refactored to use them pretty easily, and practice has shown that they are good at this. Even if they end up truncating strings, that is still better than losing integrity of control flow due to a buffer overflow.

If we don't think that strlcpy()/strlcat() are the right solution to the problem, would it make sense to rephrase this bug: can we come up with a set of functions that do tackle this problem the right way?

[1] ctime() is deprecated in issue 7, making me assume it's going to be removed from issue 8 (?). It's still part of C11.
shware_systems (reporter)
2015-10-01 05:01

Re: 2844, Note [1]
If Issue 8 defers to C11, ctime() will probably get the new LEGACY mark, not removed, or stay marked OBSOLETE. It becomes a candidate for removal in the issue following the C standard removing it, afaik. Couldn't say whether it's on the agenda now for SC22.
joerg (reporter)
2015-10-09 14:00

I see no verification for the claim that strlen() + memcpy() is faster than strlcpy().

strlcpy() is the most useful method to avoid buffer overruns and to detect truncation (in case you check the return value).

strlcpy() is much faster than strncpy().

strlcpy() allows you to use the result to compute the amount of space needed for the target in an efficient way.
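A minimal sketch of that last point, assuming strlcpy() semantics: the return value tells the caller exactly how much space a non-truncating copy needs. sketch_strlcpy() is a hypothetical stand-in so the example compiles everywhere, and copy_or_alloc() is an assumed helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy() semantics (returns strlen(src)). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || size == 0);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Assumed helper: tries the caller's fixed buffer first and falls back to
 * an exactly sized heap allocation. Returns fixed, a malloc()ed copy that
 * the caller must free(), or NULL if the allocation failed. */
static char *copy_or_alloc(char *fixed, size_t fixed_size, const char *src)
{
    size_t need = sketch_strlcpy(fixed, src, fixed_size);
    char *heap;

    if (need < fixed_size)
        return fixed;               /* it fit, no allocation needed */
    heap = malloc(need + 1);        /* return value gives the exact size */
    if (heap != NULL)
        memcpy(heap, src, need + 1);
    return heap;
}
```

The caller distinguishes the two cases by comparing the returned pointer with the fixed buffer.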

The strl*() type functions are available on *BSD, OSX, Solaris since late 1998 (*BSD) and early 1999 (Solaris). I believe that this is a verification that many people believe they are a useful addition.

Mr. Drepper is a single person that does not seem to discuss his claims with others. I cannot see why the claims for strl*() from Mr. Drepper should include a valid point.
steffen (reporter)
2015-11-09 20:57

Since this is a new interface for the standard i think it would be worth having a look at strscpy() that the Linux community now seems to advocate for future development; it effectively is

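   ssize_t
   strscpy(char *dst, char const *src, size_t dstsize){
      ssize_t rv;

      if(LIKELY(dstsize > 0)){
         rv = 0;
         do{
            /* Copy byte by byte; stop as soon as the terminator is copied */
            if((dst[rv] = src[rv]) == '\0')
               goto jleave;
            ++rv;
         }while(--dstsize > 0);
         /* Out of space: terminate what fits and report truncation */
         dst[--rv] = '\0';
      }
      rv = -E2BIG;
   jleave:
      return rv;
   }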

Nice properties: no zero padding, dst always terminated if dstsize greater 0 - what strncpy() should have been from the beginning in my not so humble opinion.
-E2BIG is possibly a bit strange in user space; -1 is a common error value in POSIX, or -dstsize for the very conservative among us (with the potential to save a temporary and avoid wasting a possible return register).

A strscat() in equal spirit does not yet exist, but is thinkable.
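For experimenting in user space, the behaviour described above could be sketched as below. This is an assumption-laden illustration, not the Linux kernel code: sketch_strscpy() is a hypothetical name, and it returns -1 on truncation (one of the alternatives mentioned above) instead of -E2BIG.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical user-space variant: returns the number of bytes copied
 * (excluding the terminator), or -1 if src did not fit or dstsize is 0.
 * Reads at most dstsize bytes of src, so a long source is not traversed
 * to its end. */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t dstsize)
{
    size_t i;

    assert(dst != NULL || dstsize == 0);
    for (i = 0; i < dstsize; i++) {
        if ((dst[i] = src[i]) == '\0')
            return (ssize_t)i;
    }
    if (dstsize > 0)
        dst[dstsize - 1] = '\0';    /* truncated: terminate what fits */
    return -1;
}
```

With a 4-byte buffer, copying "abc" returns 3, while copying "abcdef" returns -1 and leaves "abc" in the buffer.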
joerg (reporter)
2015-11-10 11:33

This strscpy() proposal looks less usable than strlcpy() as it does not return the needed size for a copy operation that does not truncate.
EdSchouten (updater)
2015-11-10 11:58

I agree with Joerg.
steffen (reporter)
2015-11-10 13:08

Well, i don't - since if buffer resizing due to E2BIG is a regular case in a particular code flow then i would definitely do a (usually highly optimized) strlen() on the source buffer first, in order to cut down on needless reallocations and second runs. Definitely.
No no. You'd use this function at places where you are pretty sure that the buffer is sufficiently sized, or where it is a regular error condition if the buffer is too small -- in which case you surely don't want the complete input buffer to be traversed needlessly.
Speedier and more sensible than snprintf(BUF, sizeof BUF, "%s", INPUT), which is often used instead (at a cost in time and CPU cycles, and thus also energy).
And maybe even with a debug log or an assert that catches possible buffer overflows during development in the former case.
Put bluntly: strlcpy() is a kind of "go, take the money and run", and rob another bank if it's out.
EdSchouten (updater)
2015-11-10 16:20


- strlcpy(), strlcat(), etc. have already been around for a very long time. They are used by quite a lot of existing Open Source software. Not just software that works on the BSDs, but also software that's designed to run on many other UNIX/non-UNIX operating systems.

- strscpy() has only been around for 6-7 months and is specific to Linux. More specifically, it's only available inside of the Linux kernel, and used exactly three times, all in one source file (arch/tile/gxio/mpipe.c). There is no other precedent at all. There is no strscat() either.

- strlcpy() fits within the existing set of functions like a glove. strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b). The return value always corresponds to the number of non-null bytes that would have been written. If we truly think that this is bad design, should we come up with a new version of snprintf() that also doesn't do this? I don't think so.

- In its current form, strscpy() cannot be standardized for a couple of reasons:

1. As far as I know, there are exactly zero functions in POSIX that return negative error numbers. There are some functions that return error numbers directly, but never negated. It doesn't fit in style-wise.

2. The strscpy() function's return type is ssize_t. This is what the current <sys/types.h> article has to say on ssize_t:

"The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}]."

In other words, it may return a value that is outside of bounds for the type that's used. Should we then introduce an SSIZE_MIN? I don't think so. That's not what ssize_t was created for.

Alternatively, this could be repaired by changing the function to return -1 as you mentioned, but this leads to the following problem: it would mean that we'd be standardizing something that conflicts with existing implementations. If someone were to write a piece of code in userspace that does this:

if (strscpy(....) == -1) {

and then moved it into the kernel, truncation would suddenly go undetected. The same thing holds when moving code out of the kernel. And this is really not an uncommon scenario. Kernel developers do it all the time. For example, I designed FreeBSD's VT100/xterm terminal emulator entirely in userspace and only moved it into the kernel afterwards.

- The advantages of strscpy() over strlcpy() are *small*. If I read the email thread regarding its introduction, the (only?) advantage of this function over strlcpy() is that this piece of code:

if (strlcpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) >= some->really->long->expression->that->attempts->to->get->the->field->length) {

can be rewritten to the following statement that is shorter:

if (strscpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) < 0) {

I think this only provides very minimal gain. I just did a grep on the FreeBSD source tree and in almost all cases the expression that the function is compared against is compact. It's either a *_MAX constant from some header file or a sizeof() expression referring to an array on the stack or a struct member of low depth.
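The claim above that strlcpy(a, b, n) behaves like snprintf(a, n, "%s", b) can be checked with a small sketch. sketch_strlcpy() is again a hypothetical stand-in following the strlcpy() semantics described in this thread, and same_as_snprintf() is an assumed helper name. The comparison is limited to n > 0, and the return types differ (size_t versus int), so the check casts before comparing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy() semantics (returns strlen(src)). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    assert(dst != NULL || size == 0);
    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Assumed helper: returns 1 if both calls leave the same string in the
 * buffer and report the same untruncated length, 0 otherwise. */
static int same_as_snprintf(const char *src, size_t size)
{
    char a[64], b[64];
    size_t ra;
    int rb;

    assert(size > 0 && size <= sizeof a);
    ra = sketch_strlcpy(a, src, size);
    rb = snprintf(b, size, "%s", src);
    return rb >= 0 && (size_t)rb == ra && strcmp(a, b) == 0;
}
```

Both the fitting case and the truncating case agree, which is the sense in which the return value "corresponds to the number of non-null bytes that would have been written".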
steffen (reporter)
2015-11-10 17:07

Dear Ed, first i liked cons25, fwiw.
Yes this is a mysterious post but i like the last sentence, do i. A remarkable amount of commits that simply changed existing code to the strl* family has happened in the FreeBSD tree since you've opened this issue, too, and i think that the Linux community is better served to use this new interface for new development only, instead of converting old code to this strl interface in a three line diff context. I don't know whether Linus Torvalds referred to this FreeBSD commit series when he explicitly pointed out that this is not desired for Linux and strscpy(). (Note i personally had a possibly even odd feeling once those commit messages flew by, weeks before i read that article.)

I could only repost the first two sentences of message 0002896 again to answer you. I see you have read the Linux commit message.. hm. Quite polemical, hm, hmm. Well.

  strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b)

Fantastic! Let's just avoid redundancy.
Ah, what is a kernel anyway, the sun will blow up at times and then, after some more time, there is nothing but a black hole, at least maybe, and if we have enough mass. But i doubt the latter.
steffen (reporter)
2015-11-10 17:27

P.S.: Ach! on ISO C defining size_t without a signed counterpart.
Unfortunately i miss that sense of humour that is expressed with the -1..SSIZE_MAX definition. At least. No i'd throw it overboard and define it as the signed counterpart of size_t, regularly and officially, and then nice things would be possible, like returning "-dstsize - 1", not only for the above. That'd be a much more dense way of doing things, what do you mean. But at least the latter will remain a dream.
shware_systems (reporter)
2016-07-07 23:18

Adding these functions would require a sponsor and some proposed text. At the 20160707 call it was decided to ask the Open Group if they would be willing to sponsor the interfaces in the Desired Action.

- Issue History
Date Modified Username Field Change
2015-09-23 08:35 EdSchouten New Issue
2015-09-23 08:35 EdSchouten Status New => Under Review
2015-09-23 08:35 EdSchouten Assigned To => ajosey
2015-09-23 08:35 EdSchouten Name => Ed Schouten
2015-09-23 08:35 EdSchouten Organization => Nuxi
2015-09-23 08:35 EdSchouten Section => <string.h>
2015-09-23 08:35 EdSchouten Page Number => n/a
2015-09-23 08:35 EdSchouten Line Number => n/a
2015-09-23 08:46 EdSchouten Section <string.h> => <string.h> and <wchar.h>
2015-09-23 14:00 eblake Note Added: 0002843
2015-09-23 16:19 EdSchouten Note Added: 0002844
2015-10-01 05:01 shware_systems Note Added: 0002851
2015-10-09 14:00 joerg Note Added: 0002869
2015-11-09 20:57 steffen Note Added: 0002892
2015-11-10 11:33 joerg Note Added: 0002894
2015-11-10 11:58 EdSchouten Note Added: 0002895
2015-11-10 13:08 steffen Note Added: 0002896
2015-11-10 16:20 EdSchouten Note Added: 0002897
2015-11-10 17:07 steffen Note Added: 0002900
2015-11-10 17:27 steffen Note Added: 0002901
2016-07-07 23:18 shware_systems Note Added: 0003293
2016-09-14 14:25 emaste Issue Monitored: emaste
