View Issue Details

ID: 0001235
Project: 1003.1(2016/18)/Issue7+TC2
Category: Shell and Utilities
View Status: public
Last Update: 2024-06-11 09:08
Reporter: stephane
Assigned To:
Priority: normal
Severity: Objection
Type: Enhancement Request
Status: Closed
Resolution: Accepted As Marked
Name: Stephane Chazelas
Organization:
User Reference:
Section: 2.9.4.3 Case Conditional Construct
Page Number:
Line Number:
Interp Status: ---
Final Accepted Text: 0001235:0004434
Summary: 0001235: explicitly prohibit strcmp fallback in case statement
Description: (another follow up to bug:1190)

In the Bourne shell, ksh88 and ksh93:

    case [ab] in
      [ab]) echo match
    esac

outputs "match" which is quite surprising and dangerous as it could bypass input validations like:

    case $1 in
      [0123456789]) : OK;;
      *) echo >&2 not a decimal digit; exit 1;;
    esac
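
As a minimal sketch of the bypass described above, the validation can be wrapped in a function and fed the literal text of its own pattern. The helper name is_decimal_digit and the sample inputs are illustrative assumptions, not part of the report. A shell without the fallback rejects the third input; per the report, the Bourne shell, ksh88 and ksh93 would accept it.

```shell
#!/bin/sh
# is_decimal_digit is a hypothetical helper name (not from the report).
# It returns 0 only when its argument matches the pattern [0123456789].
is_decimal_digit() {
  case $1 in
    [0123456789]) return 0 ;;
    *) return 1 ;;
  esac
}

# The last input is the literal text of the pattern itself.
for input in 7 x '[0123456789]'; do
  if is_decimal_digit "$input"; then
    echo "accepted: $input"
  else
    echo "rejected: $input"
  fi
done
```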
   
Possibly the rationale was to align with another (mis)feature introduced by the Bourne shell, where:

    rm [ab]

would remove the [ab] file if no file matched the pattern (instead of cancelling the command, as in earlier sh implementations and in csh, tcsh, fish and zsh).

POSIX currently doesn't allow that ksh88/ksh93 behaviour. But since this is a deviation from the reference implementation, and since most certified systems whose shell is based on AT&T ksh still have that non-conformance, it would be nice to state explicitly that the behaviour is not allowed.
Desired Action: Add a conformance test case that rejects that behaviour.

Add a rationale section stating something like:

The Bourne and Korn shells used to revert to a byte to byte comparison when wildcard patterns didn't match in a "case" statement. That behaviour was considered undesirable and is not allowed by this specification.
Tags: tc3-2008

Activities

kre

2019-03-11 02:17

reporter   bugnote:0004294

In case it is not obvious from other notes (attached to other issues) and
from messages on the mailing list, I completely agree with this - any shell
which allows a strcmp() of the pattern and word to be considered a match
is simply abhorrent (regardless of how ancient this practice was). There is
no need for it - one can always simply do
    case word in
    ( pattern | "pattern" ) ... ;;
    esac
if it is intended to match a pattern as either a pattern or a string.
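
A concrete instance of that idiom, as a sketch only: the pattern [ab] is borrowed from the description, and the function name matches_either is a hypothetical wrapper added for illustration. The quoted alternative matches the literal string, the unquoted one matches as a wildcard pattern.

```shell
#!/bin/sh
# matches_either is a hypothetical wrapper name used only for this sketch.
matches_either() {
  case $1 in
    ( [ab] | "[ab]" ) echo match ;;   # pattern, or the same text as a literal string
    ( * ) echo "no match" ;;
  esac
}

matches_either a        # matched by the pattern alternative
matches_either '[ab]'   # matched by the quoted (literal) alternative
matches_either c        # matched by neither
```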

I might go a little further, and actually add text to the normative part
of the standard to explicitly outlaw this practice, something like

    The shell shall not treat a pattern as a string and apply an additional
    match against the pattern treated as if it contained no wildcard characters.

except with better wording.

In any case, we need to make it quite clear that shells which do this (whatever
their heritage) are non-conforming, and that it is not required of an
application (script) to attempt to defeat this behaviour, with code like

    case word in
    ( "pattern" ) the non-match code here;;
    ( pattern ) the matching code here;;
    ( * ) the non-match code here ... again;;
    esac

as that's revolting, and not always easy to accomplish. Even ;& and
such are no help at all for this.
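
To make the cost of that workaround concrete, here is a sketch of it applied to the digit check from the description. The function name is_digit_defensive is an assumption for illustration. The first arm exists only to intercept the literal text of the pattern before a shell with the fallback could accept it, and the rejection logic has to be written twice.

```shell
#!/bin/sh
# is_digit_defensive is a hypothetical name; this is the workaround the note
# argues applications should NOT be required to write.
is_digit_defensive() {
  case $1 in
    ( "[0123456789]" ) return 1 ;;   # literal text of the pattern: reject first
    ( [0123456789] ) return 0 ;;     # the real check
    ( * ) return 1 ;;                # reject everything else, again
  esac
}

for input in 5 '[0123456789]' 55; do
  if is_digit_defensive "$input"; then
    echo "accepted: $input"
  else
    echo "rejected: $input"
  fi
done
```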

The only thing in the description I disagree with is the labelling, as a
(mis)feature, of the glob behaviour of returning an unmatched pattern as a
literal string. Dealing with that is much easier than dealing with the
consequences of producing an error in the case of an unmatched pattern.

stephane

2019-03-11 07:05

reporter   bugnote:0004299

Note that in ksh93, the strcmp() seems to be done after backslash removal:

    $ a='[a]' ksh -c 'case $a in $a) echo match; esac'
    match
    $ a='\a' ksh -c 'case $a in $a) echo match; esac'
    $ a='[a]' b='[\a]' ksh -c 'case $a in $b) echo match; esac'
    match
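
The three probes above can be rerun against whichever shell executes the script, in the spirit of the conformance test requested in the description. This is a rough sketch, not an official test: the helper name probe is an assumption, and the pattern is passed through an unquoted expansion exactly as in the transcript. On a shell without the strcmp fallback, the first and third probes should print "no match"; the second is included only to mirror the transcript.

```shell
#!/bin/sh
# probe is a hypothetical helper: $1 is the word, $2 holds the pattern.
# $2 is deliberately left unquoted so that it is used as a pattern.
probe() {
  case $1 in
    $2) echo match ;;
    *) echo "no match" ;;
  esac
}

probe '[a]' '[a]'    # word is the literal text of the pattern
probe '\a' '\a'      # backslash case from the transcript
probe '[a]' '[\a]'   # literal text differs from the pattern only by a backslash
```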

geoffclare

2019-06-20 10:54

manager   bugnote:0004434

Suggested change:

On page 3744 line 128516 section C.2.9.4 Compound Commands, add a new paragraph:

Some historical shells would fall back to doing a byte to byte comparison with each pattern if the pattern matching rules did not produce a match. That behavior is not allowed by this standard because it allows user input to bypass input validations like:
    case $1 in
      [0123456789]) : OK;;
      *) echo >&2 not a decimal digit; exit 1;;
    esac

Issue History

Date Modified Username Field Change
2019-03-09 00:46 stephane New Issue
2019-03-09 00:46 stephane Name => Stephane Chazelas
2019-03-09 00:46 stephane Section => 2.9.4.3 Case Conditional Construct
2019-03-11 02:17 kre Note Added: 0004294
2019-03-11 07:05 stephane Note Added: 0004299
2019-03-11 15:53 eblake Interp Status => ---
2019-03-11 15:53 eblake Summary explicitely prohibit strcmp fallback in case statement => explicitly prohibit strcmp fallback in case statement
2019-06-20 10:54 geoffclare Note Added: 0004434
2019-06-20 15:28 geoffclare Final Accepted Text => 0001235:0004434
2019-06-20 15:28 geoffclare Status New => Resolved
2019-06-20 15:28 geoffclare Resolution Open => Accepted As Marked
2019-06-20 15:28 geoffclare Tag Attached: tc3-2008
2019-11-14 14:31 geoffclare Status Resolved => Applied
2024-06-11 09:08 agadmin Status Applied => Closed