ID | Category | Severity | Type | Date Submitted | Last Update
0000986 | [1003.1(2008)/Issue 7] Base Definitions and Headers | Editorial | Clarification Requested | 2015-09-23 08:35 | 2024-06-11 08:52
Reporter | EdSchouten
View Status | public
Assigned To | ajosey
Priority | normal
Resolution | Accepted As Marked
Status | Closed
Name | Ed Schouten
Organization | Nuxi
User Reference |
Section | <string.h> and <wchar.h>
Page Number | n/a
Line Number | n/a
Interp Status | ---
Final Accepted Text | Note: 0005334
Summary | 0000986: Would it be worth investigating adding strlcpy(), strlcat(), wcslcpy() and wcslcat()?
Description |
Back in 1998, OpenBSD 2.4 added the functions strlcpy() and strlcat(). These functions eventually made it into a whole bunch of other operating systems. By now, at least OpenBSD, FreeBSD, NetBSD, Solaris, Mac OS X, and QNX provide these functions. Implementations are also present in popular Open Source projects like SDL, GLib, ffmpeg and rsync. The Linux kernel also uses them internally.

These functions have already been in use for the last 17 years and people seem to like them, which is why I'd like to propose that we add them in the next version of POSIX.

It is important to mention that these functions are not without criticism. One important concern is that they may silently truncate the string if the output buffer is too small. My response to this would be the following:

1. Checking the return value allows you to detect truncation.
2. This is no different from other existing functions such as snprintf(), which seems to be preferred over sprintf() nowadays.
3. Even though string truncation is bad and could potentially lead to security vulnerabilities, it is important to take into account what the impact would have been if strcpy() and strcat() were used instead. At least strlcpy() and strlcat() would prevent the buffer overflow from happening, making it less likely that the integrity of the control flow of the program is affected.
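As a rough illustration of point 1, the sketch below shows how a caller might turn the return value into an explicit truncation check. The names sketch_strlcpy() and copy_checked() are hypothetical and not part of any libc; sketch_strlcpy() is only a stand-in that follows the behaviour described in this issue, for systems that do not ship strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-style semantics: copies at most
 * dstsize - 1 bytes, NUL-terminates whenever dstsize > 0, and returns
 * strlen(src) regardless of how much was copied. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical wrapper: returns 0 if src fit into dst, -1 if it was
 * truncated (return value >= dstsize means the source did not fit). */
static int copy_checked(char *dst, const char *src, size_t dstsize)
{
    assert(dst != NULL && src != NULL);
    return sketch_strlcpy(dst, src, dstsize) >= dstsize ? -1 : 0;
}
```

Even in the truncated case the destination stays NUL-terminated and inside its bounds, which is the property point 3 relies on.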
Desired Action | Please add them. :-)
Tags | issue8
Attached Files |
Relationships |
Notes
(0002843) eblake (manager) 2015-09-23 14:00 |
Ulrich Drepper is no longer as active on developing either POSIX or glibc, but his scathing analysis of these interfaces is still true: the only correct way to use them requires MORE boilerplate than what you could do by using other existing interfaces: https://stackoverflow.com/questions/2114896/why-is-strlcpy-and-strlcat-considered-to-be-insecure
As such, I'm not personally in favor of standardizing them, but am happy to let implementations continue to provide them as extensions. |
(0002844) EdSchouten (updater) 2015-09-23 16:19 |
Hi Eric,

Thanks for the quick response! I agree. strlcat() does suffer from the issue that it allows people to write less efficient code more easily. Once you start to call strlcat() in a loop, you could create a piece of code that runs in quadratic time where linear time is possible.

That said, I think it's important that we look at this problem from a different point of view. Let's put the ways you can copy/concatenate strings on an axis from 'bad' to 'good', based on safety, efficiency, etc.:

strcpy()/strcat() <-----> strlcpy()/strlcat() <-----> strlen()+memcpy()

Now the question becomes: if we don't think it's a good idea to add strlcpy()/strlcat() because strlen()+memcpy() are more efficient, why should we still standardize strcpy()/strcat()? As far as I know, POSIX is not a superset of the C standard[1], so there is nothing that prevents us from deprecating these functions. But my guess is that people would get quite upset if there's no standard function to copy strings.

What I'm trying to say is that we should think of strlcpy()/strlcat() as an incremental improvement over strcpy()/strcat(). They have never even tried to address the running time problem of C string concatenation. They have been designed so that existing code can be refactored to use these functions pretty easily, and practice has shown that they are good at this. Even if they end up truncating strings, it's still better than losing integrity of control flow due to a buffer overflow.

If we don't think that strlcpy()/strlcat() are the right solution to the problem, would it make sense to rephrase this bug: can we come up with a set of functions that do tackle this problem the right way?

[1] ctime() is deprecated in issue 7, making me assume it's going to be removed from issue 8 (?). It's still part of C11.
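To make the quadratic-versus-linear point concrete: calling strlcat() in a loop rescans the destination from the start on every call, whereas the strlen()+memcpy() end of the axis can remember where the output ends. The following is only a sketch of that second approach; join_linear() and its parameters are hypothetical names, and the return convention (length of the untruncated result) is borrowed from strlcat() as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenates count strings into dst while tracking
 * the current end of the output, so each piece costs time proportional to
 * its own length. Returns the length the untruncated result would have;
 * a return value >= dstsize means the output was truncated. */
static size_t join_linear(char *dst, size_t dstsize,
                          const char *const *parts, size_t count)
{
    size_t used = 0;   /* bytes stored in dst so far, excluding the NUL */
    size_t total = 0;  /* length of the full, untruncated result */

    assert(dst != NULL || dstsize == 0);

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(parts[i]);

        total += len;
        if (dstsize > 0 && used < dstsize - 1) {
            size_t room = dstsize - 1 - used;
            size_t n = len < room ? len : room;

            memcpy(dst + used, parts[i], n);
            used += n;
        }
    }
    if (dstsize > 0)
        dst[used] = '\0';
    return total;
}
```

The extra bookkeeping is the "more boilerplate" side of the trade-off: the caller-side code is longer than a strlcat() loop, but no byte of the destination is scanned more than once.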
(0002851) shware_systems (reporter) 2015-10-01 05:01 |
Re: 2844, Note [1] If Issue 8 defers to C11, ctime() will probably get the new LEGACY mark, not removed, or stay marked OBSOLETE. It becomes a candidate for removal in the issue following the C standard removing it, afaik. Couldn't say whether it's on the agenda now for SC22. |
(0002869) joerg (reporter) 2015-10-09 14:00 |
I see no verification for the claim that strlen() + memcpy() is faster than strlcpy(). strlcpy() is the most useful method to avoid buffer overruns and to detect truncation (in case you check the return value). strlcpy() is much faster than strncpy(). strlcpy() allows you to use the result to compute the amount of space needed for the target in an efficient way. The strl*() type functions are available on *BSD, OSX, Solaris since late 1998 (*BSD) and early 1999 (Solaris). I believe that this is a verification that many people believe they are a useful addition. Mr. Drepper is a single person that does not seem to discuss his claims with others. I cannot see why the claims for strl*() from Mr. Drepper should include a valid point. |
(0002892) steffen (reporter) 2015-11-09 20:57 |
Since this is a new interface for the standard i think it would be worth having a look at strscpy() that the Linux community now seems to advocate for future development; it effectively is

    ssize_t strscpy(char *dst, char const *src, size_t dstsize){
        ssize_t rv;

        if(LIKELY(dstsize > 0)){
            rv = 0;
            do{
                if((dst[rv] = src[rv]) == '\0')
                    goto jleave;
                ++rv;
            }while(--dstsize > 0);
            dst[--rv] = '\0';
        }
        rv = -E2BIG;
    jleave:
        return rv;
    }

Nice properties: no zero padding, dst always terminated if dstsize is greater than 0 - what strncpy() should have been from the beginning in my not so humble opinion. -E2BIG is possibly a bit strange in user space, -1 is a common error value in POSIX, or -dstsize for the very conservative among us (with the potential to save a temporary / not wasting a possible return register). A strscat() in equal spirit does not yet exist, but is thinkable.
(0002894) joerg (reporter) 2015-11-10 11:33 |
This strscpy() proposal looks less usable than strlcpy() as it does not return the needed size for a copy operation that does not truncate. |
(0002895) EdSchouten (updater) 2015-11-10 11:58 |
I agree with Joerg. |
(0002896) steffen (reporter) 2015-11-10 13:08 |
Well, i don't - since if buffer resizing due to E2BIG is a regular case in a particular code flow then i would definitely do a (usually highly optimized) strlen() on the source buffer at first in order to cut down needless reallocations and second runs. Definitely. No no. You'd use this function at places where you are pretty sure that the buffer is sufficiently sized, or where it is a regular error condition if the buffer is too small -- in which case you surely don't want the complete input buffer to be traversed needlessly. Speedier and more sensitive than snprintf(BUF, sizeof BUF, "%s", INPUT), which is often used instead (at a cost in time, and CPU cycles and thus also energy). And maybe even with a debug log or even assert that triggers possible buffer overflows during development in the former case. Quite unmasked that is: strlcpy() is a kind of "go, take the money and run"; and rob another bank if it's out.
(0002897) EdSchouten (updater) 2015-11-10 16:20 |
Steffen,

- strlcpy(), strlcat(), etc. have already been around for a very long time. They are used by quite a lot of existing Open Source software. Not just software that works on the BSDs, but also software that's designed to run on many other UNIX/non-UNIX operating systems.

- strscpy() has only been around for 6-7 months and is specific to Linux. More specifically, it's only available inside of the Linux kernel, and used exactly three times, all in one source file (arch/tile/gxio/mpipe.c). There is no other precedent at all. There is no strscat() either.

- strlcpy() fits within the existing set of functions like a glove. strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b). The return value always corresponds to the number of non-null bytes that would have been written. If we truly think that this is bad design, should we come up with a new version of snprintf() that also doesn't do this? I don't think so.

- In its current form, strscpy() cannot be standardized for a couple of reasons:

  1. As far as I know, there are exactly zero functions in POSIX that return negative error numbers. There are some functions that return error numbers directly, but never negated. It doesn't fit in style-wise.

  2. The strscpy() function's return type is ssize_t. This is what the current <sys/types.h> article has to say on ssize_t: "The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}]." In other words, it may return a value that is outside of bounds for the type that's used. Should we then introduce an SSIZE_MIN? I don't think so. That's not what ssize_t was created for.

  Alternatively this could be repaired by changing the function to return -1 as you mentioned, but this leads to the following problem: it would mean that we'd be standardizing something that conflicts with existing implementations. If someone were to write a piece of code in userspace that does this:

      if (strscpy(....) == -1) {
          ...
      }

  and move that into the kernel, then you suddenly wouldn't detect truncation anymore. The same thing holds when moving code out of the kernel. And this is really not an uncommon scenario. Kernel developers do it all the time. For example, I designed FreeBSD's VT100/xterm terminal emulator entirely in userspace and only moved it into the kernel afterwards.

- The advantages of strscpy() over strlcpy() are *small*. If I read the email thread regarding its introduction, the (only?) advantage of this function over strlcpy() is that this piece of code:

      if (strlcpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) >=
          some->really->long->expression->that->attempts->to->get->the->field->length) {
          ...
      }

  can be rewritten to the following statement that is shorter:

      if (strscpy(a, b, some->really->long->expression->that->attempts->to->get->the->field->length) < 0) {
          ...
      }

  I think this only provides very minimal gain. I just did a grep on the FreeBSD source tree and in almost all cases the expression that the function is compared against is compact. It's either a *_MAX constant from some header file or a sizeof() expression referring to an array on the stack or a struct member of low depth.
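The claimed equivalence between strlcpy(a, b, n) and snprintf(a, n, "%s", b) can be sketched as follows. The helper name strlcpy_via_snprintf() is hypothetical, and the sketch assumes the source string is shorter than INT_MAX bytes, since snprintf() reports its result as an int and may fail beyond that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emulates the strlcpy() return convention using
 * snprintf(). Returns the length of src, or (size_t)-1 if snprintf()
 * reports an error (for example, a result that does not fit in an int). */
static size_t strlcpy_via_snprintf(char *dst, const char *src, size_t dstsize)
{
    int n = snprintf(dst, dstsize, "%s", src);

    /* On success, snprintf() returns the number of bytes that would have
     * been written without truncation, which is exactly strlen(src). */
    assert(n < 0 || (size_t)n == strlen(src));
    return n < 0 ? (size_t)-1 : (size_t)n;
}
```

A caller would test the result the same way as with strlcpy(): a return value greater than or equal to dstsize indicates truncation.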
(0002900) steffen (reporter) 2015-11-10 17:07 |
Dear Ed, first i liked cons25, fwiw. Yes this is a mysterious post but i like the last sentence, do i. A remarkable amount of commits that simply changed existing code to the strl* family has happened in the FreeBSD tree since you've opened this issue, too, and i think that the Linux community is better served to use this new interface for new development only, instead of converting old code to this strl interface in a three line diff context. I don't know whether Linus Torvalds referred to this FreeBSD commit series when he explicitly pointed out that this is not desired for Linux and strscpy(). (Note i personally had a, possibly even odd feeling once those commit messages flew by, weeks before i have read that article.) I could only repost the first two sentences of message 0002896 again to answer you. I see you have read the Linux commit message.. hm. Quite polemical, hm, hmm. Well.

"strlcpy(a, b, n) behaves identically to snprintf(a, n, "%s", b)"

Fantastic! Let's just avoid redundancy. Ah, what is a kernel anyway, the sun will blow up at times and then, after some more time, there is nothing but a black hole, at least maybe, and if we have enough mass. But i doubt the latter.
(0002901) steffen (reporter) 2015-11-10 17:27 |
P.S.: Ach! on ISO C defining size_t without a signed counterpart. Unfortunately i miss that sense of humour that is expressed with the -1..SSIZE_MAX definition. At least. No i'd throw it overboard and define it as the signed counterpart of size_t, regularly and officially, and then nice things would be possible, like returning "-dstsize - 1", not only for the above. That'd be a much more dense way of doing things, what do you mean. But at least the latter will remain a dream.
(0003293) shware_systems (reporter) 2016-07-07 23:18 |
Adding these functions would require a sponsor and some proposed text. At the 20160707 call it was decided to ask the Open Group if they would be willing to sponsor the interfaces in the Desired Action. |
(0004967) geoffclare (manager) 2020-09-03 10:45 edited on: 2020-09-03 13:59 |
Suggested changes to go into The Open Group company review...

Page and line numbers are for the 2016/2018 edition.

On page 363 line 12410 section <string.h>, add:

    [CX]size_t strlcat(char *restrict, const char *restrict, size_t);
    size_t strlcpy(char *restrict, const char *restrict, size_t);[/CX]

On page 364 line 12437 section <string.h>, add strlcat() to SEE ALSO.

On page 460 line 15893 section <wchar.h>, add:

    [CX]size_t wcslcat(wchar_t *restrict, const wchar_t *restrict, size_t);
    size_t wcslcpy(wchar_t *restrict, const wchar_t *restrict, size_t);[/CX]

On page 461 line 15955 section <wchar.h>, add wcslcat() to SEE ALSO.

On page 494 line 17097-17146 section 2.4.3, add strlcat(), strlcpy(), wcslcat(), and wcslcpy() to the table of async-signal-safe functions.

On page 2053 insert a new strlcat page:

NAME
strlcat, strlcpy -- size-bounded string concatenation and copying

SYNOPSIS
    #include <string.h>

    [CX]size_t strlcat(char *restrict dst, const char *restrict src, size_t dstsize);
    size_t strlcpy(char *restrict dst, const char *restrict src, size_t dstsize);[/CX]

DESCRIPTION
The strlcpy() and strlcat() functions copy and concatenate strings, stopping when either a NUL terminator in the source string is encountered or the specified full size of the destination buffer is reached. They NUL terminate the result if there is room. The application should ensure that room for the NUL terminator is included in dstsize.

RETURN VALUE
Upon successful completion, the strlcpy() function shall return the length of the string pointed to by src; that is, the number of bytes in the string, not including the terminating NUL byte.

ERRORS
No errors are defined.

EXAMPLES
The following example detects truncation while combining a path prefix (including trailing <slash>) and a filename to produce a portable pathname:

    char *prefix, *filenam, pathnam[_POSIX_PATH_MAX];

    if (strlcpy(pathnam, prefix, sizeof pathnam) >= sizeof pathnam ||
        strlcat(pathnam, filenam, sizeof pathnam) >= sizeof pathnam) {
        // truncation occurred
        ...
    }

APPLICATION USAGE
The return value of the strlcpy() and strlcat() functions follows the same convention as snprintf(); that is, they return the total length of the string they tried to create. If the return value is greater than or equal to dstsize, the output string has been truncated.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
snprintf(), strlen(), strncat(), strncpy(), wcslcat()

CHANGE HISTORY
First released in Issue 8.

Add strlcat() to the SEE ALSO section for each existing function page listed in the strlcat() SEE ALSO above.

On page 2256 insert a new wcslcat page:

NAME
wcslcat, wcslcpy -- size-bounded wide string concatenation and copying

SYNOPSIS
    #include <wchar.h>

    [CX]size_t wcslcat(wchar_t *restrict dst, const wchar_t *restrict src, size_t dstsize);
    size_t wcslcpy(wchar_t *restrict dst, const wchar_t *restrict src, size_t dstsize);[/CX]

DESCRIPTION
The wcslcpy() and wcslcat() functions copy and concatenate wide strings, stopping when either a terminating null wide-character code in the source wide string is encountered or the specified full size (in wide-character codes) of the destination buffer is reached. They null terminate the result if there is room. The application should ensure that room for the terminating null wide-character code is included in dstsize.

RETURN VALUE
Upon successful completion, the wcslcpy() function shall return the length of the wide string pointed to by src; that is, the number of wide-character codes in the wide string, not including the terminating null wide-character code.

ERRORS
No errors are defined.

EXAMPLES
None.

APPLICATION USAGE
The return value of the wcslcpy() and wcslcat() functions follows the same convention as snprintf(); that is, they return the total length (in wide-character codes) of the wide string they tried to create. If the return value is greater than or equal to dstsize, the output wide string has been truncated.

RATIONALE
None.

FUTURE DIRECTIONS
None.

SEE ALSO
snprintf(), strlcat(), wcslen(), wcsncat(), wcsncpy()

CHANGE HISTORY
First released in Issue 8.

Add wcslcat() to the SEE ALSO section for each existing function page listed in the wcslcat() SEE ALSO above.

On page 3790 line 130046 section E.1, add wcslcat() and wcslcpy() to the POSIX_C_LANG_WIDE_CHAR_EXT subprofile group.

On page 3790 line 130049 section E.1, add strlcat() and strlcpy() to the POSIX_C_LIB_EXT subprofile group.
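Since the proposed wcslcat page has no EXAMPLES, here is a rough sketch of how the wide-character pair might behave under the wording above. sketch_wcslcpy() and sketch_wcslcat() are hypothetical names, not the proposed interfaces themselves. The handling of a destination with no terminator within dstsize follows the BSD strlcat() behaviour and is an assumption, as the proposed text does not spell it out.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the proposed wcslcpy(): dstsize counts wide
 * characters, not bytes. Returns wcslen(src). */
static size_t sketch_wcslcpy(wchar_t *dst, const wchar_t *src, size_t dstsize)
{
    size_t srclen = wcslen(src);

    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        wmemcpy(dst, src, n);
        dst[n] = L'\0';
    }
    return srclen;
}

/* Hypothetical stand-in for the proposed wcslcat(). Returns the length of
 * the wide string it tried to create. */
static size_t sketch_wcslcat(wchar_t *dst, const wchar_t *src, size_t dstsize)
{
    size_t dstlen = 0;

    assert(dst != NULL && src != NULL);

    /* Find the end of dst without reading past dstsize elements. */
    while (dstlen < dstsize && dst[dstlen] != L'\0')
        dstlen++;

    /* Assumption (BSD behaviour): if dst holds no terminator within
     * dstsize, leave it untouched and report dstsize + wcslen(src). */
    if (dstlen == dstsize)
        return dstsize + wcslen(src);

    return dstlen + sketch_wcslcpy(dst + dstlen, src, dstsize - dstlen);
}
```

As with the narrow functions, a caller would pass the element count (for example sizeof buf / sizeof buf[0]) rather than sizeof buf, and treat a return value greater than or equal to that count as truncation.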
(0005050) geoffclare (manager) 2020-10-16 09:29 |
The additions for these four functions have been made in the Issue8NewAPIs branch in gitlab, based on Note: 0004967. |
(0005334) geoffclare (manager) 2021-04-29 15:17 |
Make the changes from "Additional APIs for Issue 8, Part 1" (Austin/1110). |
Issue History | |||
Date Modified | Username | Field | Change |
2015-09-23 08:35 | EdSchouten | New Issue | |
2015-09-23 08:35 | EdSchouten | Status | New => Under Review |
2015-09-23 08:35 | EdSchouten | Assigned To | => ajosey |
2015-09-23 08:35 | EdSchouten | Name | => Ed Schouten |
2015-09-23 08:35 | EdSchouten | Organization | => Nuxi |
2015-09-23 08:35 | EdSchouten | Section | => <string.h> |
2015-09-23 08:35 | EdSchouten | Page Number | => n/a |
2015-09-23 08:35 | EdSchouten | Line Number | => n/a |
2015-09-23 08:46 | EdSchouten | Section | <string.h> => <string.h> and <wchar.h> |
2015-09-23 14:00 | eblake | Note Added: 0002843 | |
2015-09-23 16:19 | EdSchouten | Note Added: 0002844 | |
2015-10-01 05:01 | shware_systems | Note Added: 0002851 | |
2015-10-09 14:00 | joerg | Note Added: 0002869 | |
2015-11-09 20:57 | steffen | Note Added: 0002892 | |
2015-11-10 11:33 | joerg | Note Added: 0002894 | |
2015-11-10 11:58 | EdSchouten | Note Added: 0002895 | |
2015-11-10 13:08 | steffen | Note Added: 0002896 | |
2015-11-10 16:20 | EdSchouten | Note Added: 0002897 | |
2015-11-10 17:07 | steffen | Note Added: 0002900 | |
2015-11-10 17:27 | steffen | Note Added: 0002901 | |
2016-07-07 23:18 | shware_systems | Note Added: 0003293 | |
2016-09-14 14:25 | emaste | Issue Monitored: emaste | |
2020-09-03 10:45 | geoffclare | Note Added: 0004967 | |
2020-09-03 10:47 | geoffclare | Note Edited: 0004967 | |
2020-09-03 10:49 | geoffclare | Note Edited: 0004967 | |
2020-09-03 13:52 | geoffclare | Note Edited: 0004967 | |
2020-09-03 13:59 | geoffclare | Note Edited: 0004967 | |
2020-10-16 09:29 | geoffclare | Note Added: 0005050 | |
2021-04-29 15:17 | geoffclare | Note Added: 0005334 | |
2021-04-29 15:18 | geoffclare | Interp Status | => --- |
2021-04-29 15:18 | geoffclare | Final Accepted Text | => Note: 0005334 |
2021-04-29 15:18 | geoffclare | Status | Under Review => Resolved |
2021-04-29 15:18 | geoffclare | Resolution | Open => Accepted As Marked |
2021-04-29 15:18 | geoffclare | Tag Attached: issue8 | |
2021-05-07 15:18 | geoffclare | Status | Resolved => Applied |
2022-06-29 16:05 | Florian Weimer | Issue Monitored: Florian Weimer | |
2022-06-30 08:39 | geoffclare | Relationship added | related to 0001591 |
2024-06-11 08:52 | agadmin | Status | Applied => Closed |