View Issue Details

ID: 0001575
Project: Issue 8 drafts
Category: Base Definitions and Headers
View Status: public
Last Update: 2022-04-08 08:51
Reporter: calestyo
Assigned To:
Priority: normal
Severity: Editorial
Type: Enhancement Request
Status: Closed
Resolution: Withdrawn
Product Version: Draft 2.1
Name: Christoph Anton Mitterer
Organization:
User Reference:
Section: 9.3.5 RE Bracket Expression
Page Number: 169
Line Number: 5872
Final Accepted Text:
Summary: 0001575: improve indication that [^] as a bracket expression is not valid
Description: Hey.

While working on issue #1551 I noticed that it's a bit vague whether regular expressions allow a bracket expression that consists of a single <circumflex>, i.e. '[^]', which would supposedly match the literal <circumflex>.

I'd say line 5872:
"A matching list expression specifies a list that shall match any single character that is matched by one of the expressions represented in the list. The first character in the list cannot be the <circumflex>."

kind of implies that it's *not* possible, because that item ("2.") describes the matching list and says "The first character in the list cannot be the <circumflex>.".
Desired Action: Replace page 169, line 5873:

"The first character in the list cannot be the <circumflex>."

with:

"The first or only character in the list cannot be the <circumflex>."



I wonder whether this is enough, because it would only be in the section that describes "matching lists".
What if one considers, e.g., '[^]' a non-matching list?

That, however, I think is already ruled out earlier, by page 168, line 5833, which says:
"A bracket expression is either a matching list expression or a non-matching list expression. It consists of one or more expressions: ordinary characters, collating elements, collating symbols, equivalence classes, character classes, or range expressions."

Thus a non-matching list would need to consist of "one or more <something>", whereas '[^]' (as a supposed non-matching list) would lack that <something>.
Tags: No tags attached.

Activities

geoffclare

2022-04-05 14:32

manager   bugnote:0005782

[^] is not a valid bracket expression because it is missing the terminating ']'.

[^]] is a valid bracket expression that matches any character except ']'.

See XBD 9.3.5 item 1:
The <right-square-bracket> (']') shall lose its special meaning and represent itself in a bracket expression if it occurs first in the list (after an initial <circumflex> ('^'), if any).
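
The rule quoted above can be tried out with a minimal sketch. It assumes Python's `re` module, which is not a POSIX regex implementation but treats a ']' that comes first in the list (after an optional '^') the same way; the helper name `compiles` is hypothetical and only serves the illustration.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# '[^]': the ']' right after '^' is taken as a literal list member,
# so the bracket expression is never terminated.
print(compiles("[^]"))              # False

# '[^]]': non-matching list whose only member is ']'.
print(re.findall("[^]]", "a]b^"))   # ['a', 'b', '^']

# '[]]': matching list whose only member is ']'.
print(re.findall("[]]", "a]b^"))    # [']']
```

A POSIX `regcomp()` implementation would be expected to reject '[^]' in the same way, though the exact error code reported is not something this sketch demonstrates.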

calestyo

2022-04-07 22:55

reporter   bugnote:0005787

Hmm. I actually had things like []] in mind when asking for the above, because a non-expert reader (not reading everything, or not understanding its subtle implications) could get the idea that [^] might be a valid special case of a bracket expression, just like []] is.

Of course it is not (which I was trying to make even clearer with my proposed change).

But you're right: since [^] is not only not equivalent to '\^' but not even a valid bracket expression at all, my proposal above wouldn't make sense.


We could add a hint like "([^] alone isn't a valid bracket expression)" to the paragraph at line 5872.
But I'd agree that this is not necessary.


Unless you feel differently, please close this issue, and sorry for the noise.



By the way, and unrelated:
- In 9.3.2 BRE Ordinary Characters, "When not inside a bracket expression, the interpretation of an ordinary character preceded by an unescaped <backslash> is undefined, except for:"
doesn't list '^' and '$', which, as per 9.3.8 BRE Expression Anchoring, may follow a '\'.
(And perhaps the same for EREs.)

- 9.3.8 BRE Expression Anchoring, point 1 mentions that a portable BRE shall escape a subexpression's leading '^' if it is meant to be literal. But point 2 doesn't mention the same for a trailing '$', though that also seems implementation-dependent.

Do you think anything should be done about that?

geoffclare

2022-04-08 08:51

manager   bugnote:0005789

Closing as withdrawn, as per the submitter's request in 0001575:0005787

Issue History

Date Modified Username Field Change
2022-04-05 01:15 calestyo New Issue
2022-04-05 01:15 calestyo Name => Christoph Anton Mitterer
2022-04-05 01:15 calestyo Section => 9.3.5 RE Bracket Expression
2022-04-05 01:15 calestyo Page Number => 169
2022-04-05 01:15 calestyo Line Number => 5872
2022-04-05 14:32 geoffclare Note Added: 0005782
2022-04-07 22:55 calestyo Note Added: 0005787
2022-04-08 08:51 geoffclare Note Added: 0005789
2022-04-08 08:51 geoffclare Status New => Closed
2022-04-08 08:51 geoffclare Resolution Open => Withdrawn