0001061: Please add memmem() (and maybe wmemmem())

ID	Project	Category	View Status	Date Submitted	Last Update

0001061	1003.1(2008)/Issue 7	Base Definitions and Headers	public	2016-07-07 21:09	2024-06-11 08:52

Reporter	EdSchouten	Assigned To	ajosey
Priority	normal	Severity	Editorial	Type	Enhancement Request
Status	Closed	Resolution	Accepted As Marked

Name	Ed Schouten
Organization	Nuxi, the Netherlands
User Reference
Section	string.h / wchar.h
Page Number	-
Line Number	-
Interp Status	---
Final Accepted Text	0001061:0005336


Summary	0001061: Please add memmem() (and maybe wmemmem())
Description	A decent implementation of the strstr()/wcsstr() functions is rather complex. In the old days, implementations typically had to make a trade-off between in-place quadratic-time algorithms (simple scanning) and linear-time, linear-space algorithms (e.g. Knuth-Morris-Pratt). If I'm correct, an in-place algorithm with linear worst-case time has only been known since the 90s (Two-way string-matching). Its pseudo-code alone is already 50 lines. The problem with strstr()/wcsstr() is that they assume that the input strings are null terminated, which isn't always the case. This is why many operating systems (Linux, all the BSDs, Mac OS X) provide a memmem() function as well, which can reuse the same algorithm. There shouldn't be a need to handroll such a function yourself.
Desired Action	Please standardize the existing memmem() function. While there, maybe we should also add wmemmem() for consistency.
Tags	issue8
Attached Files	memmem.txt (1,492 bytes)

---------- Add the following new section for memmem(): ---------

NAME
    memmem - find a byte substring in a byte string

SYNOPSIS
    [CX] #include <string.h>

    void *memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen); [/CX]

DESCRIPTION
    The memmem() function shall locate the first occurrence of byte string 'needle' of length 'needlelen' in byte string 'haystack' of length 'haystacklen'.

RETURN VALUE
    Upon successful completion, memmem() shall return a pointer to the first byte of the located byte string in 'haystack', or a null pointer if the byte string is not found.

    If 'needlelen' is zero, the function shall return 'haystack'.

    If 'haystacklen' is less than 'needlelen', the function shall return a null pointer.

ERRORS
    No errors are defined.

--- The following sections are informative. ---

EXAMPLES
    None.

APPLICATION USAGE
    None.

RATIONALE
    This function is identical to strstr(), except that it doesn't require that strings have a terminating NUL character.

FUTURE DIRECTIONS
    None.

SEE ALSO
    memchr, strstr
    XBD <string.h>

CHANGE HISTORY
    First released in Issue 8.

--- End of informative text. ---

---------- Add the following function prototype to <string.h>: ---------

[CX] void *memmem(const void *, size_t, const void *, size_t); [/CX]

---------- Add the following to strstr(): ---------

SEE ALSO
    memmem()
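To illustrate the Description's point about data that is not NUL terminated, here is a minimal sketch of a caller using memmem() on a buffer with an embedded NUL byte. The helper name find_offset() and the sample buffer are hypothetical, and the _GNU_SOURCE macro is an assumption about how glibc exposes the declaration; other systems may need a different feature-test macro or none at all.

```c
#define _GNU_SOURCE /* assumption: glibc declares memmem() under this macro */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Explicit declaration, redundant but harmless if <string.h> provides it. */
void *memmem(const void *, size_t, const void *, size_t);

/* Hypothetical helper: byte offset of needle within buf, or -1 if absent. */
ptrdiff_t find_offset(const void *buf, size_t buflen,
                      const void *needle, size_t needlelen)
{
    assert(buf != NULL && needle != NULL);
    const unsigned char *hit = memmem(buf, buflen, needle, needlelen);
    return hit ? hit - (const unsigned char *)buf : -1;
}

/* Hypothetical sample data: six bytes with a NUL in the middle. */
const unsigned char sample[] = { 'a', 'b', '\0', 'c', 'd', 'e' };
```

With this buffer, strstr() stops scanning at the embedded NUL and cannot see the bytes after it, whereas memmem() searches all six bytes because the length is passed explicitly.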

EdSchouten 2017-08-15 07:50 reporter bugnote:0003817	[ Sorry if you received a partial response; I accidentally clicked the submit button ] Hi Andrew, As you requested, attached to this ticket you may find formatted text for inclusion into the standard. It is loosely based on the existing documentation of strstr(), except that I've decided to rename 's1' and 's2' to 'haystack' and 'needle' to prevent confusion. Be sure to get in touch in case there are things you want me to clarify/improve. Ed

rhansen 2017-10-26 16:19 manager bugnote:0003871 Last edited: 2017-10-26 16:27	This was discussed during the 2017-10-26 telecon. We have no problem adding memmem() to the standard, but since this is a new interface, we need a sponsor. There is no consensus to add wmemmem(). The following wording was agreed upon:

In XSH chapter 3, insert a new entry for memmem():

NAME
    memmem - find a byte subsequence in a byte sequence

SYNOPSIS
    [CX] #include <string.h>

    void *memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen); [/CX]

DESCRIPTION
    The memmem() function shall locate the first occurrence of byte sequence needle of length needlelen in byte sequence haystack of length haystacklen.

RETURN VALUE
    Upon successful completion, memmem() shall return a pointer to the first byte of the located byte sequence in haystack, or a null pointer if the byte sequence is not found.

    If needlelen is zero, the function shall return haystack.

    If haystacklen is less than needlelen, the function shall return a null pointer.

ERRORS
    No errors are defined.

--- The following sections are informative. ---

EXAMPLES
    None.

APPLICATION USAGE
    None.

RATIONALE
    This function is similar to strstr(), except that NUL bytes may be included in either needle or haystack.

FUTURE DIRECTIONS
    None.

SEE ALSO
    memchr(), strstr()
    XBD <string.h>

CHANGE HISTORY
    First released in Issue 8.

--- End of informative text. ---

On page 363 after line 12395 (XBD <string.h> DESCRIPTION), insert:

[CX] void *memmem(const void *, size_t, const void *, size_t); [/CX]

On page 2071 line 66440 (XSH strstr() SEE ALSO), add memmem().
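The agreed RETURN VALUE rules can be sketched as a simple reference function. This is only an illustration of the wording above, not how any particular libc implements memmem(): the name sketch_memmem() is hypothetical, and the scan is the naive quadratic one rather than the Two-way algorithm mentioned in the Description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with a libc memmem(). */
void *sketch_memmem(const void *haystack, size_t haystacklen,
                    const void *needle, size_t needlelen)
{
    const unsigned char *h = haystack;

    /* "If needlelen is zero, the function shall return haystack." */
    if (needlelen == 0)
        return (void *)haystack;

    /* "If haystacklen is less than needlelen, ... a null pointer." */
    if (haystacklen < needlelen)
        return NULL;

    /* Naive scan: compare the needle at every candidate offset. */
    for (size_t i = 0; i <= haystacklen - needlelen; i++) {
        if (memcmp(h + i, needle, needlelen) == 0)
            return (void *)(h + i); /* first occurrence */
    }
    return NULL; /* byte sequence not found */
}
```

Checking the length relation before the loop keeps the expression haystacklen - needlelen from wrapping around, since size_t is unsigned.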

geoffclare 2020-10-16 09:39 manager bugnote:0005052	The memmem() additions have been made in the Issue8NewAPIs branch in gitlab, based on 0001061:0003871. It was also added to the POSIX_C_LIB_EXT subprofile group in XRAT E.1.

geoffclare 2021-04-29 15:21 manager bugnote:0005336	Make the changes from "Additional APIs for Issue 8, Part 1" (Austin/1110).

Date Modified	Username	Field	Change
2016-07-07 21:09	EdSchouten	New Issue
2016-07-07 21:09	EdSchouten	Status	New => Under Review
2016-07-07 21:09	EdSchouten	Assigned To	=> ajosey
2016-07-07 21:09	EdSchouten	Name	=> Ed Schouten
2016-07-07 21:09	EdSchouten	Organization	=> Nuxi, the Netherlands
2016-07-07 21:09	EdSchouten	Section	=> string.h / wchar.h
2016-07-07 21:09	EdSchouten	Page Number	=> -
2016-07-07 21:09	EdSchouten	Line Number	=> -
2017-08-15 07:47	EdSchouten	File Added: memmem.txt
2017-08-15 07:50	EdSchouten	Note Added: 0003817
2017-10-26 16:19	rhansen	Note Added: 0003871
2017-10-26 16:20	rhansen	Note Edited: 0003871
2017-10-26 16:21	rhansen	Note Edited: 0003871
2017-10-26 16:21	rhansen	Note Edited: 0003871
2017-10-26 16:23	rhansen	Note Edited: 0003871
2017-10-26 16:27	rhansen	Note Edited: 0003871
2020-10-16 09:39	geoffclare	Note Added: 0005052
2021-04-29 15:21	geoffclare	Note Added: 0005336
2021-04-29 15:22	geoffclare	Interp Status	=> ---
2021-04-29 15:22	geoffclare	Final Accepted Text	=> 0001061:0005336
2021-04-29 15:22	geoffclare	Status	Under Review => Resolved
2021-04-29 15:22	geoffclare	Resolution	Open => Accepted As Marked
2021-04-29 15:22	geoffclare	Tag Attached: issue8
2021-05-07 15:24	geoffclare	Status	Resolved => Applied
2024-06-11 08:52	agadmin	Status	Applied => Closed
