Austin Group Defect Tracker

Aardvark Mark IV


ID 0001454
Category [1003.1(2016/18)/Issue7+TC2] Shell and Utilities
Severity Comment
Type Error
Date Submitted 2021-02-16 16:29
Last Update 2021-02-25 09:41
Reporter geoffclare
View Status public
Assigned To
Priority normal
Resolution Open
Status New
Name Geoff Clare
Organization The Open Group
User Reference
Section 2.9.4.3
Page Number 2372
Line Number 75780
Interp Status ---
Final Accepted Text
Summary 0001454: Conflict between "case" description and grammar
Description A discussion in comp.unix.shell identified that the format summary in the description of "case" conflicts with the syntax defined in the formal shell grammar: the format summary does not allow a case statement with no patterns, such as:

case foo in
esac

whereas the grammar has a line whose sole purpose is to allow it:

Case WORD linebreak in linebreak Esac

The grammar has precedence, so the description should be updated to match it.

All shells I tried accepted a case statement with no patterns, although ksh93 reports a syntax error if the newline is omitted (case foo in esac). However, this appears to be a bug introduced into ksh93, as ksh88 accepts it. It is clear from the grammar that having a newline there should make no difference, since "linebreak" is zero or more newlines.
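
The observation above can be checked against whichever shell is at hand. The following is only a sketch: check_parse is a hypothetical helper (it is not part of the report or of any standard) that evaluates a fragment in a subshell and reports whether the shell accepted it. The result depends on the shell used; per the report, ksh93 is expected to reject the one-line form.

```shell
# check_parse is a hypothetical helper for illustration only.
# It evaluates a script fragment in a subshell, so that a syntax error cannot
# abort the calling shell, and prints "accepted" if the fragment parsed and
# exited with status zero, or "rejected" otherwise.
check_parse() {
    if (eval "$1") >/dev/null 2>&1; then
        echo accepted
    else
        echo rejected
    fi
}

# Empty case statement with a newline between "in" and "esac".
with_newline=$(check_parse 'case foo in
esac')

# The same statement written on a single line.
without_newline=$(check_parse 'case foo in esac')

echo "with newline:    $with_newline"
echo "without newline: $without_newline"
```

An empty case statement matches nothing and exits with status zero, so "accepted" here means the fragment got past the parser.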
Desired Action On page 2372 line 74408 section 2.9.4.3, change:
the first one of several patterns
to:
the first one of zero or more patterns

On page 2372 line 75780 section 2.9.4.3, delete the line:
[(] pattern1 ) compound-list ;;

Note to the editor: in Issue 8 draft 1.1 the line to delete has "terminator" instead of ";;" because of the change to allow ";&".
Tags No tags attached.
Attached Files

- Relationships

-  Notes
(0005241)
kre (reporter)
2021-02-16 20:44

I agree that shells (and my testing also says all but ksh93) work this
way, and have always done so, and consequently as the grammar appears to
allow, should continue working this way .. but I suspect the ksh93 variation
might just be deliberate, rather than simply a bug, in that they might be
attempting to follow the POSIX standard's rules (words), rather than the
actual standard (what everyone (else) implements).

The issue is that I see nothing in the standard currently which allows that
"esac" to be parsed as the Esac (reserved word) token, at least not when it
does not follow a \n or ; (etc), or a pattern. That is, reserved words
(with the specific exceptions called out by references to the
XCU 2.10.2 rules in the grammar) are generally recognised only when in
the command word position (if the thing parsed were to be a simple command).

In "case foo in x) stuff ..." "stuff" is in the command word position, so
a reserved word lookup happens.

On the other hand, in "case foo in stuff..." "stuff" is not in the command
word position (it will be taken as the pattern, and should be followed by
')' or '|') and so is not subject to reserved word lookup. This is what
makes "case word in for) command;; esac" legal, "for" there is not the
reserved word, just a string. All shells accept that (ksh93 included).

But if "for" there is not a reserved word, how could it be if spelled "esac"
instead? Shells do reject "case word in esac) command;; esac", that is,
except for ksh93, which permits it.
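
The distinction drawn above can be tried out directly. This is a minimal sketch, not a conformance test: check_parse is a hypothetical helper, and the outcome of the second check varies by shell (the note above reports that most shells reject it and ksh93 accepts it).

```shell
# "for" in the pattern position is an ordinary string, not the reserved word.
word=for
case $word in
    for) result="matched for" ;;
    *)   result="no match" ;;
esac
echo "$result"

# check_parse is a hypothetical helper for illustration only: it evaluates a
# fragment in a subshell and prints "accepted" if it parsed and exited with
# status zero, "rejected" otherwise.
check_parse() {
    if (eval "$1") >/dev/null 2>&1; then
        echo accepted
    else
        echo rejected
    fi
}

# An unparenthesised "esac" in the first pattern position.
bare_esac=$(check_parse 'case word in esac) : ;; esac')
echo "bare esac as first pattern: $bare_esac"
```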

So, while I believe that we should ask ksh93 to follow the real standard,
rather than what is currently in the POSIX text, I also believe that we need
to add some more magic to the grammar, and tokeniser rules, in order to
make this all legitimate.

It isn't as simple as just updating the description in 2.9.4.3.
(0005242)
kre (reporter)
2021-02-16 20:55
edited on: 2021-02-16 21:02

Ignore most of that (Note: 0005241) - I missed rule 4.

However, rule 4 only applies when looking at a pattern. That is
in a case_list (or case_list_ns) as a case_item (or case_item_ns).

The grammar rule in question:
      Case WORD linebreak in linebreak Esac
contains no case_list, hence no patterns, hence rule 4 would seem
not to apply.

The other rules always require a case_list[_ns] which always requires
at least one case_item[_ns] and those things always require a ')'
(every single possibility).

So, I still believe that the grammar needs work, even if slightly different
work than I expected in Note: 0005241 . It may be as simple as adding
/* Apply rule 4 */ to the grammar rule line quoted earlier in this note,
but I haven't considered all the ramifications of that yet.

(0005243)
geoffclare (manager)
2021-02-17 10:07

I agree that we should add /* Apply rule 4 */ to that line in the grammar.
Or something technically equivalent - there would be an editorial problem with simply adding it (and removing redundant spaces) because:

Case WORD linebreak in linebreak Esac /* Apply rule 4 */

is two characters too long to fit on the line.

Perhaps we could change all occurrences of "Apply rule X" to just "Rule X"?
The comments only need to make clear which rule it is that applies; it is the wording in 2.10.1 about those comments that specifies how the indicated rules are applied. (The comment that says "Do not apply rule 4" would stay as-is.)
(0005244)
kre (reporter)
2021-02-17 13:18

Unfortunately, upon reflection, it is not quite that simple (ignoring
temporarily the editorial issue, for which just using "Rule N" instead
of "Apply..." would be fine) as rule 4 says that if the word is "esac"
then the Esac token is returned. If that applies to the grammar production
in question, then
     case esac in esac
isn't going to parse correctly, as the first "esac" which should be WORD
would instead become Esac and so not match.

Perhaps instead, change all occurrences of "Esac" (in all the productions)
to "case_end", and add a new production:

     case_end: Esac ; /* Apply Rule 4 */

(formatted however is appropriate) which also conveniently side-steps the
editorial issue.

But please consider this carefully, it is a spur of the moment suggestion,
I'm not sure if it might cause other issues.
(0005245)
geoffclare (manager)
2021-02-17 14:26

Re Note: 0005244 I wondered about "case esac in ..." as well when I was writing my previous note, but decided it's not a problem because of this text in 2.10.1:
Some of the productions in the grammar below are annotated with a rule number from the following list. When a TOKEN is seen where one of those annotated productions could be used to reduce the symbol, the applicable rule shall be applied to convert the token identifier type of the TOKEN to a token identifier acceptable at that point in the grammar.

So rule 4 only causes "esac" to be recognised as the token Esac when it appears in the position of an Esac in the grammar, and it therefore doesn't apply to "case esac in ..." because the "esac" there is in the position of WORD in the grammar, not Esac.
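
A short sketch of the point being made: an "esac" that directly follows "case" sits in the WORD position, so it is simply the string being matched. The variable name kind is made up for the example.

```shell
# The first "esac" is the subject word of the case statement, not the
# terminator; only the final "esac" closes the statement.
case esac in
    es*) kind="plain word" ;;
    *)   kind="something else" ;;
esac
echo "$kind"
```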
(0005246)
kre (reporter)
2021-02-18 10:32

Re Note: 0005245

That raises an additional problem, as (not concerning the case productions
here) that isn't the way things actually work.

   "convert the token identifier type of the TOKEN to a token identifier
    acceptable at that point in the grammar."

isn't what (normally) happens. If it were, the command

    do my work

would run the "do" command, as "do" (the reserved word) is not acceptable
at that point in the grammar, so the TOKEN "do" should have been (according
to that text) turned into a WORD rather than the keyword "do".

There isn't a shell around that behaves like that: it is in the command line
position, "do" matches the spelling of the reserved word, Rule 1 applies, the
reserved word is generated, despite not being "acceptable at that point in
the grammar".

Once we have an explanation for why this analysis isn't right, or a fix for
that text in 2.10.1, we can revisit how to handle the recognition of "esac"
in that peculiar case statement with no patterns.
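
The behaviour described in this note can be observed without needing a command named "do" to exist. The sketch below assumes a hypothetical helper, parse_only, which wraps a fragment in a branch that is never taken, so the shell has to parse the fragment but never runs it. The quoted variant is included for contrast, since quoting any character of a word prevents it from being recognised as a reserved word.

```shell
# parse_only is a hypothetical helper for illustration only. The fragment is
# placed inside "if false; then ... fi", so it is parsed but never executed.
# Prints "accepted" if the whole construct parsed, "rejected" otherwise.
parse_only() {
    if (eval "if false; then
$1
fi") >/dev/null 2>&1; then
        echo accepted
    else
        echo rejected
    fi
}

# Unquoted "do" in the command position is taken as the reserved word.
unquoted=$(parse_only 'do my work')

# Quoted "do" is an ordinary command name as far as the parser is concerned.
quoted=$(parse_only '"do" my work')

echo "do my work:   $unquoted"
echo "\"do\" my work: $quoted"
```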
(0005252)
geoffclare (manager)
2021-02-25 09:41

New proposed changes based on comments made here and on the mailing list ...

On page 2372 line 74408 section 2.9.4.3, change:
the first one of several patterns
to:
the first one of zero or more patterns

On page 2372 line 75780 section 2.9.4.3, delete the line:
[(] pattern1 ) compound-list ;;

Note to the editor: in Issue 8 draft 1.1 the line to delete has "terminator" instead of ";;" because of the change to allow ";&".

On page 2375 line 75888 section 2.10.1, change:
... convert the token identifier type of the TOKEN to a token identifier acceptable at that point in the grammar.
to:
... convert the token identifier type of the TOKEN to:
  • The token identifier of the recognized reserved word, for rule 1.

  • A token identifier acceptable at that point in the grammar, for all other rules.

On page 2379,2380 line 76041,76043,76084-76134 section 2.10.2, change:
/* Apply rule ...
to:
/* Rule ...

On page 2379 line 76058 section 2.10.2, change:
case_item_ns      :     pattern ')' linebreak
                  |     pattern ')' compound_list
                  | '(' pattern ')' linebreak
                  | '(' pattern ')' compound_list
                  ;
case_item         :     pattern ')' linebreak     DSEMI linebreak
                  |     pattern ')' compound_list DSEMI linebreak
                  | '(' pattern ')' linebreak     DSEMI linebreak
                  | '(' pattern ')' compound_list DSEMI linebreak
                  ;
pattern           :             WORD      /* Apply rule 4 */
                  | pattern '|' WORD      /* Do not apply rule 4 */
to:
case_item_ns      : pattern_list ')' linebreak
                  | pattern_list ')' compound_list
                  ;
case_item         : pattern_list ')' linebreak     DSEMI linebreak
                  | pattern_list ')' compound_list DSEMI linebreak
                  ;
pattern_list      :                  WORD /* Rule 4 */
                  |              '(' WORD /* Do not apply rule 4 */
                  | pattern_list '|' WORD /* Do not apply rule 4 */
                  ;
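
For orientation, the pattern_list forms in the proposed grammar correspond to case items like the ones below. This is an illustrative sketch only; the function name classify and its messages are made up.

```shell
# classify is a hypothetical function showing each pattern_list shape:
# a single WORD, a WORD after '(', and '|'-separated lists of WORDs.
classify() {
    case $1 in
        a)     echo "single pattern" ;;
        (b)    echo "parenthesised pattern" ;;
        c|d)   echo "pattern list" ;;
        (e|f)  echo "parenthesised pattern list" ;;
        *)     echo "default" ;;
    esac
}

classify a
classify b
classify d
classify e
classify z
```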

- Issue History
Date Modified Username Field Change
2021-02-16 16:29 geoffclare New Issue
2021-02-16 16:29 geoffclare Name => Geoff Clare
2021-02-16 16:29 geoffclare Organization => The Open Group
2021-02-16 16:29 geoffclare Section => 2.9.4.3
2021-02-16 16:29 geoffclare Page Number => 2372
2021-02-16 16:29 geoffclare Line Number => 75780
2021-02-16 16:29 geoffclare Interp Status => ---
2021-02-16 20:44 kre Note Added: 0005241
2021-02-16 20:55 kre Note Added: 0005242
2021-02-16 21:02 kre Note Edited: 0005242
2021-02-17 10:07 geoffclare Note Added: 0005243
2021-02-17 13:18 kre Note Added: 0005244
2021-02-17 14:26 geoffclare Note Added: 0005245
2021-02-18 10:32 kre Note Added: 0005246
2021-02-25 09:41 geoffclare Note Added: 0005252

