View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000793 | 1003.1(2013)/Issue7+TC1 | Base Definitions and Headers | public | 2013-11-15 14:37 | 2025-03-04 14:57 |
Reporter | steffen | Assigned To | ajosey | ||
Priority | normal | Severity | Editorial | Type | Enhancement Request |
Status | Closed | Resolution | Accepted | ||
Name | steffen | ||||
Organization | |||||
User Reference | |||||
Section | Vol 1. 9.4.6, Vol. 1. 13. regex.h, Vol 2. regcomp(), Vol. 4. A.9.2 | ||||
Page Number | 190, 322, 1783, 3501 | ||||
Line Number | 6195, 10781, 57428, 118310 | ||||
Interp Status | --- | ||||
Final Accepted Text | |||||
Summary | 0000793: Regular Expressions: add REG_MINIMAL and a minimum repetition modifier | ||||
Description | The current POSIX regular expressions offer only a very restricted set of functionality, which forces many, if not most, real-life programs to use external regular expression libraries that add features like non-greediness, positive and negative lookaround assertions and Unicode compatibility. Some Open Group members already ship regular expression facilities that support at least some of these extensions, and there exist long-proven, stable and free (also for commercial use, BSD-licensed), almost drop-in alternatives which can be used by the others (see the mailing list for some references). | ||||
Desired Action | - Vol. 1: Base Definitions, Chapter 9, «Regular Expressions». 9.4.6 EREs Matching Multiple Characters, p. 190, line 6195:
insert after
6. Each of the duplication symbols (’+’, ’*’, ’?’, and intervals) may be suffixed by the minimal repetition modifier ’?’ <question-mark>, in which case matching behaviour is changed from the «leftmost longest possible match» to the «leftmost shortest possible match», including the null match (see [reference to A.9, p. 3500 ff.]). For example, the ERE ".*c" matches the last character (’c’) in the string "abc abc", whereas the ERE ".*?c" matches the first character ’c’, the third character in the string. If the REG_MINIMAL flag, as defined in the <regex.h>[REF] header, is used when compiling an ERE via regcomp(3)[REF], the «leftmost shortest possible match» is the default, and the minimal repetition modifier ’?’ can be used to select the «leftmost longest possible match».
change, on (current) line 6195 ff.,
The behavior of multiple adjacent duplication symbols (’+’, ’*’, ’?’, and intervals) produces undefined results.
to
The behavior of multiple adjacent duplication symbols (’+’, ’*’, ’?’, and intervals, possibly suffixed by the minimal repetition modifier) produces undefined results.
- Vol. 1: Base Definitions, Chapter 13, «Headers». On p. 322, line 10781
insert after
REG_MINIMAL Change default matching behaviour to «leftmost shortest possible match». Only applicable to REG_EXTENDED regular expressions.
- Vol. 2: System Interfaces. On p. 1783, line 57428
insert after
REG_MINIMAL Change default matching behaviour to «leftmost shortest possible match». Only applicable to REG_EXTENDED regular expressions.
- Vol. 4: Rationale (Informative), A.9.2 «Regular Expression General Requirements». On p. 3501, line 118310
insert after
EREs can optionally use a «leftmost-shortest» rule (enabled via the REG_MINIMAL flag and/or the ’?’ minimal repetition modifier), in which case the «shortest possible matching prefix» is instead identified as the matching sequence. | ||||
Tags | issue8 |
|
It is not possible to make changes to an approved TC and the page and line numbers don't match TC1 either. This bug has been moved from project 2008-TC1 with category Rationale to project 1003.1(2013)/Issue7+TC1 with category Base Definitions and Headers. |
|
When applying this bug I spotted that "up to" was missing from the item 6 addition to 9.4.6 and inserted it. The source currently has: For example, the ERE .sG ".*c" matches up to the last character (\c .cH c ) in the string .sG "abc abc" , whereas the ERE .sG ".*?c" matches up to the first character .cH c , the third character in the string. (I also fixed the spelling of "repetition"). |
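The "abc abc" example quoted in the note above can be checked with a small sketch. It uses Python's `re` module purely as a stand-in, on the assumption that its `*?` operator behaves like the proposed minimal repetition modifier for this input. Python's engine is a backtracking matcher, not a POSIX leftmost-longest one, and it does not model the proposed REG_MINIMAL flag, so this illustrates only the two match lengths and is not the `regcomp()` interface itself. The helper name `describe` is made up for the illustration.

```python
import re

text = "abc abc"

# Greedy duplication: ".*" consumes as much as possible before a final "c",
# so the match runs up to the last "c" in the string.
greedy = re.match(r".*c", text)

# Minimal duplication: ".*?" consumes as little as possible before a "c",
# so the match stops at the first "c".
minimal = re.match(r".*?c", text)


def describe(m):
    """Return the matched text and the 1-based position of its last character."""
    return m.group(0), m.end()


print(describe(greedy))   # ('abc abc', 7)
print(describe(minimal))  # ('abc', 3)
```

The second result ends at position 3, which corresponds to "the first character c, the third character in the string" in the wording applied to 9.4.6.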
Date Modified | Username | Field | Change |
---|---|---|---|
2013-11-15 14:37 | steffen | New Issue | |
2013-11-15 14:37 | steffen | Status | New => Under Review |
2013-11-15 14:37 | steffen | Assigned To | => ajosey |
2013-11-15 14:37 | steffen | Name | => steffen |
2013-11-15 14:37 | steffen | Section | => Vol 1. 9.4.6, Vol. 1. 13. regex.h, Vol 2. regcomp(), Vol. 4. A.9.2 |
2013-11-15 14:37 | steffen | Page Number | => 190, 322, 1783, 3501 |
2013-11-15 14:37 | steffen | Line Number | => 6195, 10781, 57428, 118310 |
2013-12-21 20:48 | Don Cragun | Project | 2008-TC1 => 1003.1(2013)/Issue7+TC1 |
2013-12-21 20:56 | Don Cragun | Interp Status | => --- |
2013-12-21 20:56 | Don Cragun | Note Added: 0002092 | |
2013-12-21 20:56 | Don Cragun | Category | Rationale => Base Definitions and Headers |
2014-01-09 17:15 | Don Cragun | Status | Under Review => Resolved |
2014-01-09 17:15 | Don Cragun | Resolution | Open => Accepted |
2014-01-09 17:15 | Don Cragun | Tag Attached: issue8 | |
2020-03-25 15:58 | geoffclare | Status | Resolved => Applied |
2020-03-29 23:03 | Don Cragun | Relationship added | parent of 0001329 |
2020-03-30 08:40 | geoffclare | Note Added: 0004807 | |
2024-06-11 09:02 | agadmin | Status | Applied => Closed |