View Issue Details

ID: 0001061
Project: 1003.1(2008)/Issue 7
Category: Base Definitions and Headers
View Status: public
Last Update: 2024-06-11 08:52
Reporter: EdSchouten
Assigned To: ajosey
Priority: normal
Severity: Editorial
Type: Enhancement Request
Status: Closed
Resolution: Accepted As Marked
Name: Ed Schouten
Organization: Nuxi, the Netherlands
User Reference:
Section: string.h / wchar.h
Page Number: -
Line Number: -
Interp Status: ---
Final Accepted Text: 0001061:0005336
Summary: 0001061: Please add memmem() (and maybe wmemmem())
Description: A decent implementation of the strstr()/wcsstr() functions is rather complex. In the old days, implementations typically had to make a trade-off between in-place quadratic-time algorithms (simple scanning) and linear-time algorithms that need extra space (e.g. Knuth-Morris-Pratt). If I'm correct, an in-place algorithm with linear worst-case time has only been known since the 1990s (Two-way string-matching). Its pseudo-code alone is already 50 lines.

The problem with strstr()/wcsstr() is that they assume the input strings are null terminated, which isn't always the case. This is why many operating systems (Linux, all the BSDs, Mac OS X) provide a memmem() function as well, which can reuse the same algorithm. There shouldn't be a need to hand-roll such a function yourself.
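
To make the trade-off described above concrete, here is a minimal sketch of the linear-time, extra-space option: a Knuth-Morris-Pratt search over length-delimited buffers. The name kmp_memmem is hypothetical and chosen only for illustration; this is not the code of any particular libc, and it is not the in-place Two-way algorithm.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical illustration: Knuth-Morris-Pratt over byte buffers.
 * Runs in O(haystacklen + needlelen) time but needs O(needlelen)
 * extra space for the failure table.
 */
void *kmp_memmem(const void *haystack, size_t haystacklen,
                 const void *needle, size_t needlelen)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    if (needlelen == 0)
        return (void *)haystack;
    if (haystacklen < needlelen)
        return NULL;
    if (needlelen > SIZE_MAX / sizeof(size_t))
        return NULL;

    /* fail[i] = length of the longest proper prefix of n[0..i]
     * that is also a suffix of n[0..i]. */
    size_t *fail = malloc(needlelen * sizeof *fail);
    if (fail == NULL)
        return NULL; /* sketch only: indistinguishable from "not found" */

    fail[0] = 0;
    for (size_t i = 1, k = 0; i < needlelen; i++) {
        while (k > 0 && n[i] != n[k])
            k = fail[k - 1];
        if (n[i] == n[k])
            k++;
        fail[i] = k;
    }

    void *result = NULL;
    for (size_t i = 0, k = 0; i < haystacklen; i++) {
        while (k > 0 && h[i] != n[k])
            k = fail[k - 1];
        if (h[i] == n[k])
            k++;
        if (k == needlelen) {
            /* The match ends at index i, so it starts needlelen - 1 earlier. */
            result = (void *)(h + i + 1 - needlelen);
            break;
        }
    }

    free(fail);
    return result;
}
```

Note that the allocation can fail, and this sketch has no way to report that separately from "not found". That weakness is one reason an in-place algorithm is attractive for a library function.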
Desired Action: Please standardize the existing memmem() function. While we are at it, maybe we should also add wmemmem() for consistency.
Tags: issue8
Attached Files
memmem.txt (1,492 bytes)   
---------- Add the following new section for memmem(): ---------

NAME

    memmem - find a byte substring in a byte string

SYNOPSIS

    [CX]
    #include <string.h>

    void *memmem(const void *haystack, size_t haystacklen,
                 const void *needle, size_t needlelen);
    [/CX]

DESCRIPTION

    The memmem() function shall locate the first occurrence of byte
    string 'needle' of length 'needlelen' in byte string 'haystack' of
    length 'haystacklen'.

RETURN VALUE

    Upon successful completion, memmem() shall return a pointer to the
    first byte of the located byte string in 'haystack', or a null
    pointer if the byte string is not found.

    If 'needlelen' is zero, the function shall return 'haystack'.

    If 'haystacklen' is less than 'needlelen', the function shall return
    a null pointer.

ERRORS

    No errors are defined.

--- The following sections are informative. ---

EXAMPLES

    None.

APPLICATION USAGE

    None.

RATIONALE

    This function is identical to strstr(), except that it doesn't
    require that strings have a terminating NUL character.

FUTURE DIRECTIONS

    None.

SEE ALSO

    memchr, strstr

    XBD <string.h>

CHANGE HISTORY

    First released in Issue 8.

--- End of informative text. ---

---------- Add the following function prototype to <string.h>: ---------

[CX]
void *memmem(const void *, size_t, const void *, size_t);
[/CX]

---------- Add the following to strstr(): ---------

SEE ALSO

    memmem()

Activities

EdSchouten

2017-08-15 07:50

reporter   bugnote:0003817

[ Sorry if you received a partial response; I accidentally clicked the submit button ]

Hi Andrew,

As you requested, attached to this ticket you may find formatted text for inclusion into the standard. It is loosely based on the existing documentation of strstr(), except that I've decided to rename 's1' and 's2' to 'haystack' and 'needle' to prevent confusion.

Be sure to get in touch in case there are things you want me to clarify/improve.

Ed

rhansen

2017-10-26 16:19

manager   bugnote:0003871

Last edited: 2017-10-26 16:27

This was discussed during the 2017-10-26 telecon. We have no problem adding memmem() to the standard, but since this is a new interface, we need a sponsor. There is no consensus to add wmemmem().

The following wording was agreed upon:

In XSH chapter 3, insert a new entry for memmem():
NAME
memmem — find a byte subsequence in a byte sequence


SYNOPSIS
[CX]
#include <string.h>

void *memmem(const void *haystack, size_t haystacklen,
             const void *needle, size_t needlelen);
[/CX]


DESCRIPTION
The memmem() function shall locate the first occurrence of byte sequence needle of length needlelen in byte sequence haystack of length haystacklen.


RETURN VALUE
Upon successful completion, memmem() shall return a pointer to the first byte of the located byte sequence in haystack, or a null pointer if the byte sequence is not found.

If needlelen is zero, the function shall return haystack.

If haystacklen is less than needlelen, the function shall return a null pointer.


ERRORS
No errors are defined.


--- The following sections are informative. ---

EXAMPLES
None.


APPLICATION USAGE
None.


RATIONALE
This function is similar to strstr(), except that NUL bytes may be included in either needle or haystack.


FUTURE DIRECTIONS
None.


SEE ALSO
memchr(), strstr()
XBD <string.h>


CHANGE HISTORY
First released in Issue 8.
--- End of informative text. ---


On page 363 after line 12395 (XBD <string.h> DESCRIPTION), insert:
[CX]
void *memmem(const void *, size_t, const void *, size_t);
[/CX]


On page 2071 line 66440 (XSH strstr() SEE ALSO), add memmem().
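
As an illustration of the return-value rules in the wording above, the following is a minimal sketch using simple quadratic-time scanning. The name sketch_memmem is hypothetical, picked to avoid clashing with a memmem() that the C library may already provide; real implementations typically use a faster algorithm.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration of the agreed semantics, not a
 * production implementation: worst case O(haystacklen * needlelen).
 */
void *sketch_memmem(const void *haystack, size_t haystacklen,
                    const void *needle, size_t needlelen)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    /* "If needlelen is zero, the function shall return haystack." */
    if (needlelen == 0)
        return (void *)haystack;

    /* "If haystacklen is less than needlelen, the function shall
     * return a null pointer." */
    if (haystacklen < needlelen)
        return NULL;

    /* Try every start offset at which the needle still fits. */
    size_t last = haystacklen - needlelen;
    for (size_t i = 0; i <= last; i++) {
        if (h[i] == n[0] && memcmp(h + i, n, needlelen) == 0)
            return (void *)(h + i); /* first occurrence */
    }
    return NULL; /* byte sequence not found */
}
```

Because both lengths are passed explicitly, a needle such as the four bytes 'c', 'd', NUL, 'e' can be located inside a buffer that itself contains NUL bytes, which strstr() cannot do.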

geoffclare

2020-10-16 09:39

manager   bugnote:0005052

The memmem() additions have been made in the Issue8NewAPIs branch in gitlab, based on 0001061:0003871.

It was also added to the POSIX_C_LIB_EXT subprofile group in XRAT E.1.

geoffclare

2021-04-29 15:21

manager   bugnote:0005336

Make the changes from "Additional APIs for Issue 8, Part 1" (Austin/1110).

Issue History

Date Modified Username Field Change
2016-07-07 21:09 EdSchouten New Issue
2016-07-07 21:09 EdSchouten Status New => Under Review
2016-07-07 21:09 EdSchouten Assigned To => ajosey
2016-07-07 21:09 EdSchouten Name => Ed Schouten
2016-07-07 21:09 EdSchouten Organization => Nuxi, the Netherlands
2016-07-07 21:09 EdSchouten Section => string.h / wchar.h
2016-07-07 21:09 EdSchouten Page Number => -
2016-07-07 21:09 EdSchouten Line Number => -
2017-08-15 07:47 EdSchouten File Added: memmem.txt
2017-08-15 07:50 EdSchouten Note Added: 0003817
2017-10-26 16:19 rhansen Note Added: 0003871
2017-10-26 16:20 rhansen Note Edited: 0003871
2017-10-26 16:21 rhansen Note Edited: 0003871
2017-10-26 16:21 rhansen Note Edited: 0003871
2017-10-26 16:23 rhansen Note Edited: 0003871
2017-10-26 16:27 rhansen Note Edited: 0003871
2020-10-16 09:39 geoffclare Note Added: 0005052
2021-04-29 15:21 geoffclare Note Added: 0005336
2021-04-29 15:22 geoffclare Interp Status => ---
2021-04-29 15:22 geoffclare Final Accepted Text => 0001061:0005336
2021-04-29 15:22 geoffclare Status Under Review => Resolved
2021-04-29 15:22 geoffclare Resolution Open => Accepted As Marked
2021-04-29 15:22 geoffclare Tag Attached: issue8
2021-05-07 15:24 geoffclare Status Resolved => Applied
2024-06-11 08:52 agadmin Status Applied => Closed