Austin Group Defect Tracker



ID 0001662
Category [Issue 8 drafts] Shell and Utilities
Severity Objection
Type Error
Date Submitted 2023-04-11 14:28
Last Update 2023-06-27 15:08
Reporter geoffclare View Status public  
Assigned To
Priority normal Resolution Accepted  
Status Applied   Product Version Draft 3
Name Geoff Clare
Organization The Open Group
User Reference
Section ed, ex
Page Number 2798, 2802, 2805, 2837, 2846, 2854, 2855
Line Number 92748, 92937, 92954, 93049, 93054, 94315, 94669, 94995, 95009, 95022
Final Accepted Text
Summary 0001662: Delimiter issues in ed and ex
Description The sed delimiter issues from bugs 0001550 and 0001551 also affect ed and ex.

With: g.a\.b.p

the "\." is treated as a literal "." in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "." is treated as special.

With: g.a[.]b.p

the "." in the bracket expression is not treated as a delimiter in any version of ed I tried, nor in any version of ex/vi I tried except nvi, where it is a delimiter (producing "brackets ([ ]) not balanced").

With: s.a\.b.x.

the "\." is treated as a literal "." in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "." is treated as special.

With: s.a[.]b.x.

the "." in the bracket expression is not treated as a delimiter in any version of ed I tried, nor in any version of ex/vi I tried except nvi, where it is a delimiter.

With: s&a&x\&y&

the "\&" is treated as a literal "&" in all versions of ed I tried and in all versions of ex/vi I tried except nvi where the "&" is treated as special.

The nvi behaviour in all these cases is likely a bug in nvi, as it is intended to behave exactly the same as the original vi (except for added new features).

The proposed changes are adapted from the resolution of bug 0001550, but they require a single behaviour where, for sed, it is unspecified which of two behaviours occurs. This rests on the assumption that the behaviour of nvi should be considered a bug; if we want to allow it, something closer to the new sed text will be needed.
Desired Action After page 2798 line 92748 section ed (Regular Expressions in ed), add a new paragraph:
The start and end of a regular expression (RE) are marked by a delimiter character (although in some circumstances the end delimiter can be omitted). In addresses, the delimiter is either <slash> or <question-mark>. In commands, other characters can be used as the delimiter, as specified in the description of the command. Within the RE (as an ed extension to the BRE syntax), the delimiter shall not terminate the RE if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the RE (losing any special meaning it would have had if it was not used as the delimiter and was not escaped). In addition, the delimiter character shall not terminate the RE when it appears within a bracket expression, and shall have its normal meaning in the bracket expression. For example, the command "g%[%]%p" is equivalent to "g/[%]/p", and the command "s-[0-9]--g" is equivalent to "s/[0-9]//g".
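
(Illustration only, not part of the proposed text.) The delimiter rules in the paragraph above can be sketched as a small scanner. This is a minimal Python sketch, not taken from any ed implementation; the names find_re_end and skip_bracket are hypothetical, and treating an unbalanced bracket expression as running to the end of the input is an assumption made for brevity.

```python
def skip_bracket(text, i):
    """text[i] is '['. Return the index just past the closing ']'.

    Assumption: an unbalanced bracket expression swallows the rest of the
    input (real implementations may report an error instead).
    """
    n = len(text)
    j = i + 1
    if j < n and text[j] == "^":
        j += 1                      # non-matching list marker
    if j < n and text[j] == "]":
        j += 1                      # ']' first in the list is an ordinary character
    while j < n:
        # [. .], [: :] and [= =] may themselves contain ']'
        if text[j] == "[" and j + 1 < n and text[j + 1] in ".:=":
            close = text.find(text[j + 1] + "]", j + 2)
            if close == -1:
                return n
            j = close + 2
            continue
        if text[j] == "]":
            return j + 1
        j += 1
    return n


def find_re_end(text, start, delim):
    """Return the index of the delimiter that terminates the RE starting at
    text[start], or len(text) if the end delimiter is omitted."""
    i = start
    n = len(text)
    while i < n:
        c = text[i]
        if c == "\\" and i + 1 < n:
            i += 2                  # second character of an escape never terminates
            continue
        if c == "[":
            i = skip_bracket(text, i)   # delimiter inside [...] never terminates
            continue
        if c == delim:
            return i
        i += 1
    return n
```

With this sketch, scanning "g%[%]%p" from index 2 with delimiter "%" stops at index 5, so the RE is "[%]"; scanning "s-[0-9]--g" from index 2 with delimiter "-" stops at index 7, so the RE is "[0-9]". The sketch only locates the end of the RE; it does not rewrite an escaped delimiter into its literal form.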

On page 2802 line 92937 section ed (g command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed].

On page 2802 line 92954 section ed (G command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed].

On page 2805 line 93049 section ed (s command), change:
Any character other than <space> or <newline> can be used instead of a <slash> to delimit the RE and the replacement. Within the RE, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.
to:
Any character other than <backslash>, <space>, or <newline> can be used instead of a <slash> to delimit the RE and the replacement. Within the RE, in certain circumstances the RE delimiter can be used as a literal character; see [xref to Regular Expressions in ed]. Within the replacement, the delimiter shall not terminate the replacement if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the replacement (losing any special meaning it would have had if it was not used as the delimiter and was not escaped).
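
(Illustration only, not part of the proposed text.) The replacement rule above is simpler than the RE rule, because bracket expressions play no part in it: only an escape sequence can hide the delimiter. A minimal Python sketch, with the hypothetical name find_replacement_end:

```python
def find_replacement_end(text, start, delim):
    """Return the index of the delimiter that terminates the replacement
    starting at text[start], or len(text) if the end delimiter is omitted."""
    i = start
    n = len(text)
    while i < n:
        if text[i] == "\\" and i + 1 < n:
            i += 2      # the escaped character, delimiter or not, is skipped
            continue
        if text[i] == delim:
            return i
        i += 1
    return n
```

For the command "s&a&x\&y&" from the description, the replacement starts at index 4 and the scan stops at index 8, giving the raw replacement "x\&y". Under the proposed text, the escaped "&" is then a literal "&" rather than a reference to the matched string.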

On page 2805 line 93054 section ed (s command), change:
An <ampersand> ('&') appearing in the replacement shall be replaced by the string matching the RE on the current line. The special meaning of '&' in this context can be suppressed by preceding it by <backslash>. As a more general feature, the characters '\n', where n is a digit, shall be replaced by the text matched by the corresponding back-reference expression. If the corresponding back-reference expression does not match, then the characters '\n' shall be replaced by the empty string. When the character '%' is the only character in the replacement, the replacement used in the most recent substitute command shall be used as the replacement in the current substitute command; if there was no previous substitute command, the use of '%' in this manner shall be an error. The '%' shall lose its special meaning when it is in a replacement string of more than one character or is preceded by a <backslash>. For each <backslash> encountered in scanning replacement from beginning to end, the following character shall lose its special meaning (if any). It is unspecified what special meaning is given to any character other than <backslash>, '&', '%', or digits.
to:
An unescaped <ampersand> ('&') appearing in the replacement shall be replaced by the string matching the RE on the current line. As a more general feature, the characters '\n', where the <backslash> is unescaped and n is a digit, shall be replaced by the text matched by the corresponding back-reference expression. If the corresponding back-reference expression does not match, then the characters '\n' shall be replaced by the empty string. When the character '%' is the only character in replacement, the replacement used in the most recent substitute command shall be used as replacement in the current substitute command; if there was no previous substitute command, the use of '%' in this manner shall be an error. The '%' shall lose its special meaning when it is in a replacement string of more than one character or is escaped. It is unspecified what special meaning is given to any character other than <backslash>, '&', '%', or digits.
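
(Illustration only, not part of the proposed text.) The handling of unescaped '&' and '\n' described above can be sketched as follows. This is a minimal Python sketch with the hypothetical name expand_replacement. It uses Python's re module as a stand-in for BRE matching, treats only digits 1 to 9 as back-references, treats a reference to a subexpression that does not exist like one that did not match, and ignores the lone '%' case and anything else the text leaves unspecified; all of these are assumptions.

```python
import re


def expand_replacement(repl, match):
    """Expand a raw replacement string against a successful match object."""
    out = []
    i = 0
    n = len(repl)
    while i < n:
        c = repl[i]
        if c == "\\" and i + 1 < n:
            nxt = repl[i + 1]
            if nxt in "123456789":
                k = int(nxt)
                # A subexpression that did not match yields the empty string.
                if k <= match.re.groups:
                    out.append(match.group(k) or "")
            else:
                out.append(nxt)         # escaped character loses any special meaning
            i += 2
            continue
        if c == "&":
            out.append(match.group(0))  # unescaped '&': the whole matched string
        else:
            out.append(c)
        i += 1
    return "".join(out)
```

For example, with m = re.search(r"(b+)(x)?", "abbbc"), the replacement "[&]" expands to "[bbb]", "x\&y" expands to "x&y", and "\1-\2-" expands to "bbb--" because the second subexpression did not take part in the match.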

After page 2854 line 94995 section ex (Regular Expressions in ex), add a new paragraph:
The start and end of a regular expression (RE) are marked by a delimiter character (although in some circumstances the end delimiter can be omitted). In addresses, the delimiter is either <slash> or <question-mark>. In commands, other characters can be used as the delimiter, as specified in the description of the command. Within the RE (as an ex extension to the BRE syntax), the delimiter shall not terminate the RE if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the RE (losing any special meaning it would have had if it was not used as the delimiter and was not escaped). In addition, the delimiter character shall not terminate the RE when it appears within a bracket expression, and shall have its normal meaning in the bracket expression. For example, the command "g%[%]%p" is equivalent to "g/[%]/p", and the command "s-[0-9]--g" is equivalent to "s/[0-9]//g".

After page 2855 line 95009 section ex (Replacement Strings in ex), add a new paragraph:
Certain characters and strings have special meaning in replacement strings when the character, or the first character of the string, is unescaped.

On page 2855 line 95022 section ex (Replacement Strings in ex), change:
Otherwise, any character following a <backslash> shall be treated ...
to:
Otherwise, any character following an unescaped <backslash> shall be treated ...

On page 2837 line 94315 section ex (g command), after:
The pattern can be delimited by <slash> characters (shown in the Synopsis), as well as any non-alphanumeric or non-<blank> other than <backslash>, <vertical-line>, <newline>, or double-quote.
add:
Within the pattern, in certain circumstances the delimiter can be used as a literal character; see [xref to Regular Expressions in ex].

On page 2846 line 94669 section ex (s command), change:
Any non-alphabetic, non-<blank> delimiter other than <backslash>, '|', <newline>, or double-quote can be used instead of '/'. <backslash> characters can be used to escape delimiters, <backslash> characters, and other special characters.
to:
Any non-alphabetic, non-<blank> delimiter other than <backslash>, '|', <newline>, or double-quote can be used instead of '/'. Within the pattern, in certain circumstances the delimiter can be used as a literal character; see [xref to Regular Expressions in ex]. Within the replacement, the delimiter shall not terminate the replacement if it is the second character of an escape sequence (see [xref to XBD 9.1]) and the escaped delimiter shall be treated as that literal character in the replacement (losing any special meaning it would have had if it was not used as the delimiter and was not escaped).

Tags applied_after_i8d3, issue8
Attached Files

- Relationships
related to 0001550 (Applied) clarifications/ambiguities in the description of context addresses and their delimiters for sed
related to 0001551 (Closed) sed: ambiguities in the how BREs/EREs are parsed/interpreted between delimiters (especially when these are special characters)

There are no notes attached to this issue.

- Issue History
Date Modified Username Field Change
2023-04-11 14:28 geoffclare New Issue
2023-04-11 14:28 geoffclare Name => Geoff Clare
2023-04-11 14:28 geoffclare Organization => The Open Group
2023-04-11 14:28 geoffclare Section => ed, ex
2023-04-11 14:28 geoffclare Page Number => 2798, 2802, 2805, 2837, 2846, 2854, 2855
2023-04-11 14:28 geoffclare Line Number => 92748, 92937, 92954, 93049, 93054, 94315, 94669, 94995, 95009, 95022
2023-04-11 14:30 geoffclare Relationship added related to 0001550
2023-04-11 14:30 geoffclare Relationship added related to 0001551
2023-06-08 15:17 Don Cragun Status New => Applied
2023-06-08 15:17 Don Cragun Resolution Open => Accepted
2023-06-08 15:18 Don Cragun Tag Attached: issue8
2023-06-08 15:39 nick Status Applied => Resolved
2023-06-27 15:08 geoffclare Status Resolved => Applied
2023-06-27 15:09 geoffclare Tag Attached: applied_after_i8d3

