Austin Group Defect Tracker

Aardvark Mark IV


ID 0001072
Category [1003.1(2013)/Issue7+TC1] Shell and Utilities
Severity Comment
Type Clarification Requested
Date Submitted 2016-08-27 01:18
Last Update 2019-10-28 10:31
Reporter quinngrier
View Status public
Assigned To
Priority normal
Resolution Accepted As Marked
Status Applied
Name Quinn Grier
Organization
User Reference
Section m4 (utility)
Page Number 2899, 2901
Line Number 95625, 95733
Interp Status Approved
Final Accepted Text see Note: 0003919
Summary 0001072: m4 ifelse argument expansion clarification
Description

The description of the m4 ifelse macro includes the following sentence:

If the first two arguments compare as equal strings (after macro expansion of both arguments), the defining text shall be the third argument.

The parenthetical remark is presumably intended to be a reminder that all arguments are expanded as they are collected, but it can also be interpreted to mean that these two particular arguments are expanded a second time. It is difficult to refute this interpretation because this section provides little explanation of how arguments are expanded and generally uses the word "argument" to refer to an argument that has already been expanded as it was collected.

To illustrate, consider the following example:

define(`foo', `baz')dnl
define(`bar', `baz')dnl
ifelse(foo, bar, `hello')`'dnl
ifelse(`foo', `bar', ` world')`'dnl
ifelse(``foo'', ``bar'', ` never')

It is clear that the first comparison is true and the third is false, but what about the second? Should the output of this example be "hello", or should it be "hello world"? The expectation is that it should be "hello", but the parenthetical remark raises the concern that it may be "hello world". The output is indeed "hello" on several implementations: GNU M4 1.4.17, FreeBSD 10.3, and OpenIndiana Hipster 2016.04.
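The expected reading can be sketched with a toy model in Python. This is not how any m4 implementation is written; `collect` and `ifelse` are hypothetical helpers, and the model assumes that an argument is either a bare macro name or a single quoted string, and that both macros are defined to the same text. It only illustrates that the comparison uses the arguments as collected, with no second expansion.

```python
def collect(arg, macros):
    """Toy model of collecting one argument.

    A quoted argument loses its outermost quotes and is not expanded;
    a bare macro name is replaced by its defining text.
    """
    if arg.startswith("`") and arg.endswith("'"):
        return arg[1:-1]
    return macros.get(arg, arg)


def ifelse(a, b, then, macros):
    # Compare the arguments exactly as collected; nothing is expanded again.
    return then if collect(a, macros) == collect(b, macros) else ""


macros = {"foo": "baz", "bar": "baz"}
out = (
    ifelse("foo", "bar", "hello", macros)
    + ifelse("`foo'", "`bar'", " world", macros)
    + ifelse("``foo''", "``bar''", " never", macros)
)
print(out)  # hello
```

Under the alternative reading, where the first two arguments are expanded a second time, the second call would compare "baz" with "baz" and the result would be "hello world".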

Desired Action

On line 95625, add the following sentence to the end of the paragraph to provide some explanation of how arguments are expanded:

Macro expansion is performed on the arguments as they are collected.

On line 95733, remove the following parenthetical remark to maintain consistency with the general use of the word "argument" to refer to an argument that has already been expanded as it was collected:

(after macro expansion of both arguments)
Tags tc3-2008

Notes
(0003366)
quinngrier (reporter)
2016-08-29 03:01

The formatting in my original example didn't work properly. I used br elements instead of newline characters inside the pre element to try to ensure that double spacing would not occur from newline characters possibly being decorated with br elements, but the br elements ended up being removed, joining the lines together. Here's the example again, this time with newline characters:

define(`foo', `baz')dnl
define(`bar', `baz')dnl
ifelse(foo, bar, `hello')`'dnl
ifelse(`foo', `bar', ` world')`'dnl
ifelse(``foo'', ``bar'', ` never')
(0003367)
Don Cragun (manager)
2016-08-29 03:28

"br" tags in Description replaced by literal <newline> characters (as suggested in Note: 0003366).
(0003919)
eblake (manager)
2018-02-09 14:47
edited on: 2018-02-09 15:02

Interpretation response
------------------------

The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
All known m4 implementations perform macro expansion and argument collection in the same manner, but the wording used in the standard did not give enough detail on how this occurs.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
All page and line numbers are for the 2016 edition.

On page 2934 lines 97081-97090 change:
The m4 utility shall compare each token from the input against the set of built-in and user-defined macros. If the token matches the name of a macro, then the token shall be replaced by the macro’s defining text, if any, and rescanned for matching macro names. Once no portion of the token matches the name of a macro, it shall be written to standard output. Macros may have arguments, in which case the arguments shall be substituted into the defining text before it is rescanned.

Macro calls have the form:

name(arg1, arg2, ..., argn)

Macro names shall consist of letters, digits, and underscores, where the first character is not a digit. Tokens not of this form shall not be treated as macros.
to:
The m4 utility shall compare each token from the input against the set of built-in and user-defined macros. If the token matches the name of a macro, then the token shall be replaced by the macro’s defining text, if any, then scanning for tokens shall resume at the start of the macro's defining text concatenated with the subsequent input. If a token does not match the name of a macro, it shall be written to standard output. Macros may have arguments, in which case the arguments shall be substituted into the defining text before it is rescanned.

No special meaning shall be given to characters enclosed between matching left and right quoting strings, other than identifying nested quoting while finding the matching right quoting string, but the outermost quoting strings shall themselves be discarded. By default, the left quoting string consists of a grave accent (backquote) and the right quoting string consists of an acute accent (single-quote); see also the changequote macro.
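As an illustration only (not part of the proposed wording), the nested-quoting rule in the paragraph above can be sketched in Python. `read_quoted` is a hypothetical helper; it assumes the left and right quoting strings differ and that the text starts with the left quoting string.

```python
def read_quoted(text, lq="`", rq="'"):
    """Return (contents, rest) for the quoted string that starts `text`.

    Nested quoting strings are tracked only to find the matching right
    quoting string; the outermost pair is discarded and inner pairs are kept.
    """
    if not text.startswith(lq):
        raise ValueError("text does not start with the left quoting string")
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith(lq, i):
            depth += 1
            i += len(lq)
        elif text.startswith(rq, i):
            depth -= 1
            i += len(rq)
            if depth == 0:
                return text[len(lq):i - len(rq)], text[i:]
        else:
            i += 1
    raise ValueError("unterminated quoted string")


print(read_quoted("`mac2(,`2',)', `3'"))
```

The inner pair around 2 survives in the returned contents, and only the outermost pair is dropped.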

Comments are written but not scanned for matching macro names; by default, the begin-comment string consists of the <number-sign> character and the end-comment string consists of a <newline>. See also the changecom and dnl macros.

Name tokens shall consist of the longest possible sequence of letters, digits, and underscores, where the first character is not a digit. Tokens not of this form shall not be treated as name tokens. A macro call is a name token that matches the name of a built-in or user-defined macro. Macro calls can have either of the following forms, which shall be distinguished by whether or not the macro name is immediately followed by a <left-parenthesis>:

    name

    name(arg1, arg2, ..., argn)
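As an illustration only (not part of the proposed wording), the "longest possible sequence" rule for name tokens can be sketched with a regular expression in Python. The sketch assumes ASCII letters, and `first_name_token` is a hypothetical helper that only inspects the start of the text.

```python
import re

# A name token: a letter or underscore, then any run of letters, digits, underscores.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def first_name_token(text):
    """Return the name token at the start of `text`, or None if there is none."""
    m = NAME.match(text)
    return m.group(0) if m else None


print(first_name_token("argumentsa"))  # argumentsa
print(first_name_token("mac2(,2,)"))   # mac2
print(first_name_token("9lives"))      # None
```

Because the match is greedy, "argumentsa" is one token rather than "arguments" followed by "a".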


On page 2934 lines 97094-97106 change:
If a macro name is followed by a <left-parenthesis>, its arguments are the <comma>-separated tokens between the <left-parenthesis> and the matching <right-parenthesis>. Unquoted white-space characters preceding each argument shall be ignored. All other characters, including trailing white-space characters, are retained. <comma> characters enclosed between <left-parenthesis> and <right-parenthesis> characters do not delimit arguments.

Arguments are positionally defined and referenced. The string "<tt>$1</tt>" in the defining text shall be replaced by the first argument. Systems shall support at least nine arguments; only the first nine can be referenced, using the strings "<tt>$1</tt>" to "<tt>$9</tt>", inclusive. The string "<tt>$0</tt>" is replaced with the name of the macro. The string "<tt>$#</tt>" is replaced by the number of arguments as a string. The string "<tt>$*</tt>" is replaced by a list of all of the arguments, separated by <comma> characters. The string "<tt>$@</tt>" is replaced by a list of all of the arguments separated by <comma> characters, and each argument is quoted using the current left and right quoting strings. The string "<tt>${</tt>" produces unspecified behavior.
to:
If a macro name is followed by a <left-parenthesis>, the subsequent text shall be tokenized and expanded until a token is encountered that is not a quoted string and whose expansion includes a matching unquoted <right-parenthesis>. The expanded text between the <left-parenthesis> and the matching unquoted <right-parenthesis> is the macro's argument text. An unquoted <comma> character within the macro's argument text shall mark the end of one argument and the beginning of the next argument unless the unquoted <comma> is enclosed within a nested unquoted <left-parenthesis>, <right-parenthesis> pair. The unquoted <comma> characters that separate the arguments, and any unquoted whitespace characters at the beginning of each argument, shall be discarded. All other characters in the macro's argument text, including any whitespace characters at the end of an argument and any nested parenthesized text, shall be retained. The input text containing the macro name, the following <left-parenthesis>, and all tokens up to and including the token whose expansion contained the matching unquoted <right-parenthesis> shall be replaced, and tokenization shall resume on the result of performing argument substitution on the macro's defining text followed by any expanded text that followed the matching unquoted <right-parenthesis>. Otherwise, the macro name was not followed by a <left-parenthesis>, and tokenization shall resume on the result of performing argument substitution with zero arguments on the macro's defining text.
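As an illustration only (not part of the proposed wording), the splitting of argument text can be modeled in Python. The sketch makes several simplifying assumptions: the text has already been macro-expanded, the default quoting strings are in effect, and Python's `str.isspace` stands in for m4's notion of whitespace. `split_arguments` is a hypothetical name.

```python
def split_arguments(text):
    """Split already-expanded argument text at unquoted, un-nested commas."""
    args, cur = [], []
    quotes = 0        # current quote nesting depth
    parens = 0        # current unquoted parenthesis nesting depth
    started = False   # True once the current argument has any content
    for ch in text:
        if quotes:
            if ch == "`":
                quotes += 1
            elif ch == "'":
                quotes -= 1
                if quotes == 0:
                    continue          # discard the outermost right quote
            cur.append(ch)
        elif ch == "`":
            quotes = 1                # discard the outermost left quote
            started = True
        elif ch == "," and parens == 0:
            args.append("".join(cur))
            cur, started = [], False
        elif ch.isspace() and not started:
            continue                  # discard leading unquoted whitespace
        else:
            if ch == "(":
                parens += 1
            elif ch == ")":
                parens -= 1
            cur.append(ch)
            started = True
    args.append("".join(cur))
    return args


print(split_arguments("  1, ( ,2,) ,  `3'"))    # ['1', '( ,2,) ', '3']
print(split_arguments("  1, hi `',`2',`', 3"))  # ['1', 'hi ', '2', '', '3']
```

The two inputs correspond to examples 3 and 6 of the EXAMPLES additions in this note. The second input is the argument text after mac2 has been expanded. Empty argument text yields one empty argument, which matches a call written as macro().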

During argument substitution, arguments shall be positionally defined and referenced. The string "<tt>$1</tt>" in the defining text shall be replaced by the first argument. Systems shall support at least nine arguments; only the first nine can be referenced, using the strings "<tt>$1</tt>" to "<tt>$9</tt>", inclusive. The string "<tt>$0</tt>" shall be replaced with the name of the macro. The string "<tt>$#</tt>" shall be replaced by the number of arguments as a minimal string of decimal digits ('0' if the macro was invoked without being followed by a <left-parenthesis>, otherwise 1 more than the number of unquoted <comma> characters that divided arguments in the macro's argument text). The string "<tt>$*</tt>" shall be replaced by a list of all of the arguments, separated by <comma> characters. The string "<tt>$@</tt>" shall be replaced by a list of all of the arguments separated by <comma> characters, and each argument shall be quoted using the current left and right quoting strings. The string "<tt>${</tt>" produces unspecified behavior.
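As an illustration only (not part of the proposed wording), argument substitution can be modeled in Python. `substitute` is a hypothetical helper. It assumes that a reference to an argument that was not supplied yields the empty string, and it leaves "${" untouched because that behavior is unspecified. The result is the text before rescanning, so quoting strings are still present.

```python
import re


def substitute(defn, name, args, lq="`", rq="'"):
    """Replace $0-$9, $#, $* and $@ in the defining text `defn`."""
    def repl(m):
        key = m.group(1)
        if key == "#":
            return str(len(args))                        # argument count
        if key == "*":
            return ",".join(args)                        # arguments, unquoted
        if key == "@":
            return ",".join(lq + a + rq for a in args)   # arguments, each quoted
        n = int(key)
        if n == 0:
            return name                                  # the macro's own name
        return args[n - 1] if n <= len(args) else ""     # assumed empty if missing
    return re.sub(r"\$([0-9#*@])", repl, defn)


defn = "argument 2 is :`$2':, called with $# arguments"
print(substitute(defn, "macro", []))                # macro
print(substitute(defn, "macro", [""]))              # macro()
print(substitute("hi $@", "mac2", ["", "2", ""]))   # mac2( ,2,)
```

The first call yields a count of 0 and the second a count of 1, which is the distinction between a macro name alone and a name followed by an empty parenthesized list.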


On page 2934, delete lines 97110-97116:
No special meaning is given to any characters enclosed between matching left and right quoting strings, but the quoting strings are themselves discarded. By default, the left quoting string consists of a grave accent (backquote) and the right quoting string consists of an acute accent (single-quote); see also the changequote macro.

Comments are written but not scanned for matching macro names; by default, the begin-comment string consists of the <number-sign> character and the end-comment string consists of a <newline>. See also the changecom and dnl macros.
(as these lines were moved and reworded in the earlier changeset at line 97081)

On page 2935 line 97129 (changecom), change:
The behavior is unspecified if either argument is provided but null.
to:
The behavior is unspecified if either argument is provided but null, or if either argument includes letters, digits, underscore, or <left-parenthesis>.


On page 2935 line 97133 (changequote), change:
The behavior is unspecified if there is a single argument or either argument is null.
to:
The behavior is unspecified if there is a single argument, or if either argument is null or includes letters, digits, underscore, or <left-parenthesis>.


On page 2936 line 97206 (ifelse), remove the following parenthetical remark:
(after macro expansion of both arguments)


On page 2940, in the EXAMPLES section, after line 97369, add the following (note to editor: feel free to use bullets or other list identifiers instead of numbers, if that is better):
In the following six examples, an additional line is evaluated after this prologue of three definitions:
define(`macro', `argument 2 is :`$2':, called with $# arguments')dnl
define(`argumentsa', `Arguments')dnl
define(`a', `.')dnl


1. The additional line:
 
macro`'a

produces:
 
argument 2 is ::, called with 0 arguments.

Explanation: macro is called with 0 arguments (as shown by the $# substitution), the substitution of $2 is the empty string, and the empty quoted string after the expansion text prevents concatenation with the subsequent "a", which in turn lets macro a expand to the final <tt>.</tt>.

2. The additional line:
 
macro()a

produces:
 
argument 2 is ::, called with 1 Arguments

Explanation: macro is called with one (empty string) argument; then the defining text ending in "arguments" is concatenated with the subsequent "a" to form the next macro name argumentsa which is expanded into <tt>Arguments</tt> before the final output.

3. The additional line:
 
macro(  1, ( ,2,) ,  `3')

produces:
 
argument 2 is :( ,2,) :, called with 3 arguments

Explanation: Leading (but not internal or trailing) space is removed before the argument substituted for $2, and the unquoted commas embedded in parentheses do not delineate arguments.

4. The additional line:
 
macro(  `1', `mac2(,`2',)', `3')

produces:
 
argument 2 is :mac2(,`2',):, called with 3 arguments

Explanation: Regardless of whether mac2 is a defined macro, quoting in the macro call prevents interpretation of "mac2" during argument collection, and the quoting in the defining text of macro prevents interpretation of "mac2" in the substitution of $2 during rescan of the output of macro.

5. The additional line:
 
undefine(`mac2')macro(  1, mac2(,2,), 3)

produces:
 
argument 2 is :mac2(,2,):, called with 3 arguments

Explanation: mac2 is not a macro name when scanned during argument collection, so it and the subsequent parenthesized text are used literally.

6. The additional line:
 
define(`mac2', `hi $@')macro(  1, mac2( ,2,), 3)

produces:
 
argument 2 is :hi :, called with 5 arguments

Explanation: mac2 is a macro name, so collecting the arguments to macro requires scanning the output of mac2( ,2,) (the text <tt>hi `',`2',`'</tt> after substitution of $@); this output contains unquoted commas, causing additional arguments to be visible to macro.


(0004142)
ajosey (manager)
2018-09-30 18:41

Interpretation Proposed: 30 September 2018
(0004164)
ajosey (manager)
2018-11-12 19:47

Interpretation approved: 12 November 2018

Issue History
Date Modified Username Field Change
2016-08-27 01:18 quinngrier New Issue
2016-08-27 01:18 quinngrier Name => Quinn Grier
2016-08-27 01:18 quinngrier Section => m4 (utility)
2016-08-27 01:18 quinngrier Page Number => 2899, 2901
2016-08-27 01:18 quinngrier Line Number => 95625, 95733
2016-08-29 03:01 quinngrier Note Added: 0003366
2016-08-29 03:28 Don Cragun Interp Status => ---
2016-08-29 03:28 Don Cragun Note Added: 0003367
2016-08-29 03:28 Don Cragun Description Updated
2016-08-29 03:29 Don Cragun Description Updated
2018-02-09 14:47 eblake Note Added: 0003919
2018-02-09 14:47 eblake Note Edited: 0003919
2018-02-09 14:50 eblake Note Edited: 0003919
2018-02-09 14:51 eblake Note Edited: 0003919
2018-02-09 14:55 eblake Note Edited: 0003919
2018-02-09 15:00 eblake Note Edited: 0003919
2018-02-09 15:02 eblake Note Edited: 0003919
2018-02-09 15:03 eblake Interp Status --- => Pending
2018-02-09 15:03 eblake Final Accepted Text => see Note: 0003919
2018-02-09 15:03 eblake Status New => Interpretation Required
2018-02-09 15:03 eblake Resolution Open => Accepted As Marked
2018-02-09 15:31 geoffclare Tag Attached: tc3-2008
2018-09-30 18:41 ajosey Interp Status Pending => Proposed
2018-09-30 18:41 ajosey Note Added: 0004142
2018-11-12 19:47 ajosey Interp Status Proposed => Approved
2018-11-12 19:47 ajosey Note Added: 0004164
2019-10-28 10:31 geoffclare Status Interpretation Required => Applied

