Austin Group Defect Tracker

Aardvark Mark IV


ID 0001098
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Editorial
Type Error
Date Submitted 2016-10-20 16:56
Last Update 2018-04-12 15:41
Reporter Mark_Galeck
View Status public
Assigned To
Priority normal
Resolution Withdrawn
Status Closed
Name Mark Galeck
Organization
User Reference
Section 2.10.2 Shell Grammar Rules
Page Number 2379
Line Number 76091
Interp Status ---
Final Accepted Text
Summary 0001098: do_group symbol cannot be accepted as written, because rule 6 cannot yield Done token
Description Rule 6 can only yield Do, In, or Word, not Done, so the production

do_group: Do compound_list Done /* Apply rule 6 */

cannot succeed. Rule 1 would work (but is needlessly complicated for the task at hand), but since rule 6 overrules rule 1, rule 1 does not apply.
Desired Action Change the production to read:

do_group: Do compound_list Done /* For each reserved word, apply Rule 10 for that word */

Add rule 10:

10. This rule is different for each reserved word. When TOKEN is the reserved word, the corresponding token identifier shall result. Otherwise, WORD shall result.
Tags No tags attached.
Attached Files

- Relationships
duplicate of 0001100 (Closed): Rewrite of Section 2.10 Shell Grammar, of the Shell Standard, to fix previous reports, fix new issues, and improve presentation.

-  Notes
(0003448)
geoffclare (manager)
2016-10-21 09:23
edited on: 2016-10-21 09:41

Rule 6 only applies to the third word of a for (or case) command. In a properly constructed do_group, "done" is never the third word of the command, so rule 6 is not applied to that token.
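
As a small illustration of what "third word" means here (an editorial sketch, not text from the standard, assuming sh is a POSIX-style shell such as dash or bash): in the first loop below the third word of the for command is "do", in the second it is "in", and in neither is "done" the third word. The variable names no_in and with_in are only illustrative.

```sh
# Positional parameters stand in for the implicit 'in "$@"' list.
set -- a b

# No "in" clause: the third word of the command is "do".
no_in=$(for x do printf '%s ' "$x"; done)

# With an "in" clause: the third word of the command is "in".
with_in=$(for x in c d; do printf '%s ' "$x"; done)

printf '%s\n' "$no_in" "$with_in"
```

Both forms loop normally; the first iterates over the positional parameters and the second over the listed words.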

(0003449)
Mark_Galeck (reporter)
2016-10-21 21:56

Correct. I already said that in my report. You are just restating what I already said.

What then IS applied to the done token? Rule 1? It can't be, because only ONE rule can be applied to a given production, according to the text before the rules, and Rule 6 is being applied (it has precedence over Rule 1 since it is the higher-numbered rule).
(0003450)
geoffclare (manager)
2016-10-22 08:00

I repeat: rule 6 is not applied to that token.
(0003451)
Mark_Galeck (reporter)
2016-10-22 08:22

Geoff, with all due respect you are not reading what I am writing. I am saying "nothing applies to x, is that not true?", and your answer is "y does not apply to x".
(0003452)
geoffclare (manager)
2016-10-22 10:44

Your argument is that because rule 6 applies, rule 1 cannot be applied. I am pointing out that your assumption is false. Rule 6 does not apply, therefore rule 1 applies.
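
A one-line illustration of reserved-word recognition for "done" (an editorial sketch, assuming a POSIX-style sh such as dash or bash; it is not part of the standard's text): the first "done" below is an argument to echo and stays an ordinary word, while the second appears where a command could start and is recognized as the reserved word that closes the loop.

```sh
# The first "done" is an argument; the second closes the do_group.
out=$(for x in a; do echo done; done)
printf '%s\n' "$out"
```

The loop runs once and echoes the argument, which shows that the same text is classified differently depending on where it appears.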
(0003453)
Mark_Galeck (reporter)
2016-10-22 12:01
edited on: 2016-10-22 12:24

It says:

"Some of the productions in the grammar below are annotated with a rule number from the following list. When a TOKEN is seen where one of those annotated productions could be used to reduce the symbol, the applicable rule shall be applied to convert the token identifier type of the TOKEN to a token identifier acceptable at that point in the grammar".

The production is:

do_group : Do compound_list Done /* Apply rule 6 */

When the token "do" is seen, the production cannot (yet) be used to reduce the symbol "do_group". Only when the token "done" is seen "could" the production be used. At that time "the" token is "done", and we apply rule 6 to it. Rule 6 yields "WORD" and the token stack reads

WORD
compound_list
do

and the production does not apply.


The comment "third word of for and case" does not make sense to begin with, because "do" need not be the third word. Besides, it is not an imperative sentence, such as "this rule shall apply only to the third word...". Besides, it is in brackets, and in standards language no actual rule to be followed can be in brackets or parentheses, only commentary. So I ignored that comment; there was nothing much else to do with it.

(0003454)
Mark_Galeck (reporter)
2016-10-22 12:18
edited on: 2016-10-22 12:20

Maybe it is the meaning of the phrase "applicable rule" in the first paragraph I quoted.

From that first paragraph, when one reads it without preconceptions, it is clear that the phrase "applicable rule" means "the one mentioned in the preceding sentence, the one that the production is annotated with".

I invite you to find a coworker without any knowledge of this standard, have them read that paragraph, and ask them what they think the meaning of "applicable rule" is.

(0003455)
Mark_Galeck (reporter)
2016-10-22 12:29

Actually, I don't want to waste any more of your valuable time Geoff. If you think this report does not make sense, reject it. I don't mind. I did the right thing, by reporting this, that is all I care about.

Thank you and I am sorry to take your time.
(0003457)
shware_systems (reporter)
2016-10-25 23:16

Maybe this will be clearer:

The do_group production has 3 elements: the 2 keyword tokens and the list production. Rule 6 possibly applies to just the first token, the Do, and Rule 1 applies to the Done token, as the 3rd and 5th elements of the full for_clause respectively when the in clause is not present (so the default "in $@" applies). When an in clause is present, Rule 1 applies to both keyword tokens: the Do token is no longer the 3rd WORD (In is), so the unmet "third word" condition of Rule 6 leaves Rule 1 as the default that does apply. For the 4 choices of the for_clause, the linebreak and sequential_sep references are significant whitespace for counting purposes, not operator tokens, so Rule 6b applies to do_group in the first 2 and to the in production in the second 2. Rule 6 is necessary to force 'for' as unquoted text being treated as a keyword and not a simple command with arguments, in the absence of a sequential_sep as 'best/longest matching production' to satisfy the command production. A similar ambiguity exists for 'in' in the case_clause production because of the empty portion of the linebreak production. This isn't obvious if one is thinking the grammar matches yacc or another standard's production style.
 
In the rules the square brackets are decorative, just grouping a portion of the text to indicate narratively where in the productions the rule applies, as forward references. No one has made them hyperlinks in the pdf or html, that's all, but they're not commentary.

The grammar is more 'best match' than 'first match', for h(yste)istorical reasons, as the for construct and others were originally implemented as external commands that could be loaded as an extension overlay into the resident image of shells that only handled simple command parsing and redirections; a special non-built-in, effectively, for low RAM machines (see https://en.wikipedia.org/wiki/Thompson_shell). When shells like the Bourne shell could keep them resident, because more processors had the luxury of 16-bit (or more) address buses to shoehorn code into, and added features that required grammatical treatment other than as arguments, the necessary productions were added to existing text, I believe, not reworked to match a particular meta-language's translation requirements/restrictions. In other words, Rule 1 applies in the command production so that the productions involving keywords or operators are best matches with priority over simple_command.

Something like the above may be suitable for adding to XRAT C.2.10, as a historical note, but in XCU 2.10 I don't see that changes are required. What's there is more a meta-meta-language, partly top down, partly bottom up, but with the rules is internally consistent.

Example:
for varname do ( ) done<NL>
parses as a for loop, with '( )' representing a valid subshell compound-list that does nothing, and fully matches for_clause choice 1;

for varname do \() done<NL>
parses as a simple command, 'for'; with 'varname', 'do', '()', and 'done' as textual arguments,
as '\()' is an invalid compound-list that forces backtracking to simple_command as best match.

\for varname do ( ) done<NL>
also parses as a simple command, but with '(' and ')' as separate arguments, due to the escaping '\' disabling keyword recognition.
(0003458)
Mark_Galeck (reporter)
2016-10-26 16:32
edited on: 2016-10-26 16:51

>Rule 6 is necessary to force 'for' as unquoted text being treated as a keyword and not a simple command with arguments, in the absence of a sequential_sep as 'best/longest matching production' to satisfy the command production.


It could be that I already know what you are talking about, but can't tell because I don't understand this sentence.

QUESTION. Since you write "necessary", can you give me an example involving "for", that if we dropped rule 6 and just applied default rule 1, would behave differently compared to the current standard?

The examples you give in your note do not work at all, see below.



>A similar ambiguity exists for 'in'

QUESTION. Likewise, please, an example involving "in".


I am not trying to argue with you, I just want to see if there is anything in what you wrote, that I don't already know.



>This isn't obvious if one is thinking the grammar matches yacc or another standard's production style.


As Geoff pointed out, the Introduction to Shell & Utilities, says

"The grammar is based on the syntax used by the yacc utility."

QUESTION. Are you contradicting that? Please explain.


I am working under the assumption that the Shell Standard describes how parsing is done in terms of the Grammar given in section 2.10.2, as if by yacc, provided that the lexer is "somehow" able to implement the rules from section 2.10.1 and return the correct token ID (not TOKEN).

You seem to be talking about some "best/longest matching production" and "backtracking"; none of these things are mentioned anywhere in the Standard (as far as I can tell), and they are also not how yacc/bison works.

QUESTION. Is my working assumption above wrong and if so, what do you mean by the terms "best/longest matching production", "backtracking".



> for varname do ( ) done<NL>
> parses as a for loop, with '( )' representing a valid subshell compound-list that does nothing, and fully matches for_clause choice 1


No it doesn't. '( )', with or without the space inside, is an invalid subshell.
Neither dash nor bash accepts it.


> for varname do \() done<NL>
> parses as a simple command, 'for'; with 'varname', 'do', '()', and 'done' as textual arguments,
> as '\()' is an invalid compound-list that forces backtracking to simple_command as best match


No it doesn't, for multiple reasons:

* 'for' as the first token will be treated as a reserved word, regardless of whether the remaining tokens do or do not form a valid 'for' loop sequence. That is because, in order for it to parse as a simple command, 'for' would have to be WORD in the cmd_name production, and rule 7a applies, which says 'for' is For, not WORD.

* Even if 'for' were a command name (which it is not), \() is not one WORD token but two tokens, first \( and then ), and the second token is the ')' operator, so the whole thing would still not be one simple command; it would be an invalid use of ')'.

As before, check either dash or bash; they show what I wrote above.


> \for varname do ( ) done<NL>
> also parses as a simple command, but with '(' and ')' as separate arguments, due to the escaping '\' disabling keyword recognition.


No it doesn't. Here '\for' is a command, OK, but both ( and ) are operators, regardless of the space between them, and they are unexpected.

Again, bash and dash do not accept this.
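
The three disputed examples can be checked mechanically. The following is an editorial sketch, assuming sh is dash or bash (other shells may differ); the helper name check is hypothetical. It runs each script in a child shell and classifies the exit status, keeping "command not found" (status 127) separate so that a script taken as a simple command is not confused with one the parser refuses.

```sh
# check SCRIPT: run SCRIPT in a child sh and classify its exit status.
check() {
    status=0
    sh -c "$1" >/dev/null 2>&1 || status=$?
    case $status in
        0)   echo accepted ;;
        127) echo 'command not found' ;;
        *)   echo rejected ;;
    esac
}

r1=$(check 'for varname do ( ) done')   # empty subshell inside the loop
r2=$(check 'for varname do \() done')   # \( is a word, ) is an operator
r3=$(check '\for varname do ( ) done')  # \for is a command name, ( is an operator
ok=$(check 'for varname do :; done')    # control case: a well-formed loop

printf '%s\n' "$r1" "$r2" "$r3" "$ok"
```

With dash or bash as sh, the first three results should be "rejected" and the control case "accepted". A result of "command not found" for the third script would have meant that '(' and ')' had been passed as ordinary arguments to a command named for.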

(0003466)
Mark_Galeck (reporter)
2016-10-27 12:45

I do want shware_systems to answer my Questions in my last note. After that, this report can be cancelled.

- Issue History
Date Modified Username Field Change
2016-10-20 16:56 Mark_Galeck New Issue
2016-10-20 16:56 Mark_Galeck Name => Mark Galeck
2016-10-20 16:56 Mark_Galeck Section => 2.10.2 Shell Grammar Rules
2016-10-20 16:56 Mark_Galeck Page Number => 2379
2016-10-20 16:56 Mark_Galeck Line Number => 76091
2016-10-21 09:23 geoffclare Note Added: 0003448
2016-10-21 09:24 geoffclare Note Edited: 0003448
2016-10-21 09:41 geoffclare Note Edited: 0003448
2016-10-21 21:56 Mark_Galeck Note Added: 0003449
2016-10-22 08:00 geoffclare Note Added: 0003450
2016-10-22 08:22 Mark_Galeck Note Added: 0003451
2016-10-22 10:44 geoffclare Note Added: 0003452
2016-10-22 12:01 Mark_Galeck Note Added: 0003453
2016-10-22 12:18 Mark_Galeck Note Added: 0003454
2016-10-22 12:19 Mark_Galeck Note Edited: 0003454
2016-10-22 12:19 Mark_Galeck Note Edited: 0003454
2016-10-22 12:20 Mark_Galeck Note Edited: 0003454
2016-10-22 12:24 Mark_Galeck Note Edited: 0003453
2016-10-22 12:29 Mark_Galeck Note Added: 0003455
2016-10-25 23:16 shware_systems Note Added: 0003457
2016-10-26 16:32 Mark_Galeck Note Added: 0003458
2016-10-26 16:39 Mark_Galeck Note Edited: 0003458
2016-10-26 16:44 Mark_Galeck Note Edited: 0003458
2016-10-26 16:46 Mark_Galeck Note Edited: 0003458
2016-10-26 16:51 Mark_Galeck Note Edited: 0003458
2016-10-27 12:45 Mark_Galeck Note Added: 0003466
2016-10-28 08:22 geoffclare Relationship added related to 0001100
2018-04-12 15:40 eblake Relationship replaced duplicate of 0001100
2018-04-12 15:41 Don Cragun Interp Status => ---
2018-04-12 15:41 Don Cragun Status New => Closed
2018-04-12 15:41 Don Cragun Resolution Open => Withdrawn

