|Viewing Issue Simple Details|
|ID||Category||Severity||Type||Date Submitted||Last Update|
|0001226||[1003.1(2016/18)/Issue7+TC2] Shell and Utilities||Objection||Error||2019-01-24 19:00||2019-12-04 11:46|
|Priority||normal||Resolution||Accepted As Marked|
|Organization||SHware Systems Dev.|
|Final Accepted Text||See Note: 0004394|
|Summary||0001226: shell can not test if a file is text|
Regarding the sentence "If the executable file is not a text file, the shell may bypass this command execution.": POSIX makes no distinction between binary formatted files and files formatted as text according to some locale, so there is no standard way to determine what the contents of an arbitrary file represent. As such, after an exec() fails the shell knows only that the file had an appropriate x-permission bit set but isn't in the format exec() expected. It is up to the invoked shell to determine that the file is not text, by eventually getting some syntax or grammar error when reading it. For example, that platform's exec() may recognize only ELF binaries, while the file is a COFF or OMF binary meant for access by other platforms over a network, and not a script.
What can be tested is the type of a file, and it's more precise to say command execution can always be bypassed if the file's type precludes the possibility of it being a source of scripts, such as a directory or symbolic link, and a platform may elect to treat types other than regular files as binary-only oriented and skip trying to process these also. Regular files are the only type the standard requires as able to persist text data, and that sentence should reflect this.
Change:
If the executable file is not a text file,
to:
If the executable file is not a regular file,
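For illustration only, the file-type test described above might look like the following minimal sketch. The helper name `may_be_script_source` is hypothetical, and the choice to follow symbolic links (so that a link to a regular file counts as a regular file) is an assumption, not something the proposed wording requires.

```python
import os
import stat

def may_be_script_source(path):
    """Return True if the file's type does not rule it out as a script.

    Hypothetical helper: only regular files pass here. A platform could
    choose to admit other file types as well.
    """
    try:
        mode = os.stat(path).st_mode  # assumption: symbolic links are followed
    except OSError:
        return False                  # missing or inaccessible file
    return stat.S_ISREG(mode)
```

A directory or device node is rejected by this test without reading any of its contents, which is the point of testing the type rather than the data.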
edited on: 2019-01-24 20:17
A shell may not be able to prove whether a file is a text file, but can easily filter out a number of non-text files with a short read() or two (any file that contains a NUL byte, any file that does not end in a newline, any file that does not contain a newline within LINE_MAX bytes) regardless of locale, as well as the converse (any file that starts with #! might be treated as a text file rather than a binary to pass to exec(), even though POSIX itself states that the use of #! is non-portable). You are also correct that a file can be a text file in one locale but a non-text file in another locale, based on whether byte sequences cause encoding errors in one locale but not the other.
Given that existing shells also have various other heuristics (such as any iscntrl() hits in the first 512 bytes) for deciding whether a regular file is unlikely to be executable as a shell script (although a hit does not necessarily make the file a non-text file), I'm not sure if the desired action is sufficient to permit what existing shells do.
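A rough sketch of the kind of short-read filter discussed in this note, not taken from any particular shell. The name `could_be_script`, the `LINE_MAX` value, and the set of tolerated control bytes are all assumptions made for the example. The "does not end in a newline" check is left out because it needs the end of the file rather than its first bytes.

```python
LINE_MAX = 2048   # assumption: the POSIX minimum; real systems report their own value
PROBE_SIZE = 512  # assumption: probe length taken from the "first 512 bytes" remark

# Control bytes tolerated in text: tab, newline, vertical tab, form feed, carriage return.
_ALLOWED_CONTROLS = frozenset(b"\t\n\v\f\r")

def could_be_script(head: bytes) -> bool:
    """Return False if the leading bytes of a file rule out a shell script."""
    if b"\0" in head:
        return False  # a text file never contains a NUL byte
    if len(head) >= LINE_MAX and b"\n" not in head[:LINE_MAX]:
        return False  # no newline within the first LINE_MAX bytes
    for byte in head[:PROBE_SIZE]:
        is_control = byte < 0x20 or byte == 0x7F  # iscntrl() in the C locale
        if is_control and byte not in _ALLOWED_CONTROLS:
            return False
    return True
```

Note that the control-character test also rejects a script containing a literal escape byte, which illustrates why such a hit does not prove the file is a non-text file.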
Yes, but that easy filtering is still the province of the invoked shell, as things are written, possibly to return 126 as exit code. This applies when the file exists but doesn't have the x bit set too. These will cause exec() to fail, but the shell is still expected to try and process it as a script. As far as the logical model goes, as I read it, whatever heuristics may be implemented are an adjunct to, not a replacement for, the line-by-line evaluations of XCU 2.3 once the invoked shell finishes initializing.
What this change does is make it so regular files are always included for consideration, but with the "may bypass" wording allows other file types to be included too. While it would usually be silly for a platform to process the data representing a directory as a script, it shouldn't be impossible to attempt it if the internal format can be considered text also, similar to tar headers and the expectations for ar archives. If the underlying file system does say 'no, the inode fields are always binary encoded', big or little endian, then I'd expect the shell to bypass trying to process it because that's defined as being non-text for the file type.
I am not sure of the point of this discussion.
This relates to the proposed resolution of 1161 I believe?
In that, the shell is already required to find the file it
would exec (somehow) - whatever rules it follows in making
that decision are appropriate here too, and the description
of the -v option to the command command does not need to list
lots of special cases to make this be OK.
The problem with the resolution to 1161 is recorded in
note 4223 attached to that bug report.
If this issue really is simply about what 2.9.1 (really
2.9.1.1.1.e.i.b) says then I agree with Ed, nothing more
is needed than what is there "not a text file" covers all
that is needed - it certainly includes devices, directories,
etc - and also allows the shell to avoid attempting to
interpret whatever files it considers to not possibly be
That's a good thing - we do not need to be specific about
just which files it should attempt to run, and which it
should not (especially these days as the universal use of
#! means that any script can make itself guaranteed
executable, and so the shell can be much more conservative
about what files it attempts to run as a script when the
A problem with the current wording that doesn't seem to have been mentioned yet is that it allows the shell not to execute the file as a shell script if it contains lines longer than LINE_MAX characters. However, the sh page says under INPUT FILES: "The input file shall be a text file, except that line lengths shall be unlimited."
Therefore as a minimum we should change "is not a text file" to "contains any NUL characters or any byte sequences that do not form valid characters", or maybe to "does not meet the requirements for a sh input file stated in the INPUT FILES section in [xref to sh]" in order to avoid duplication.
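As a sketch of the first alternative wording (no NUL characters, no byte sequences that fail to form valid characters, line length unrestricted), one might write something like the following. The function name is hypothetical, and the `encoding` parameter is an assumption standing in for the codeset of the current locale.

```python
def meets_sh_input_requirements(data: bytes, encoding: str = "utf-8") -> bool:
    """Check the relaxed input rules: no NUL, all bytes form valid characters.

    Line length is deliberately not checked, matching "line lengths
    shall be unlimited".
    """
    if b"\0" in data:
        return False
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False  # some byte sequence is not a valid character
    return True
```

The same bytes can pass under one encoding and fail under another: `b"\xff\xfe\n"` is rejected as UTF-8 but accepted as Latin-1, which matches the earlier observation that text-ness depends on the locale.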
There are "scripts" in the wild that are self-extracting - the first half of the file used in isolation as a text file, and ends with 'exit' or similar to prevent the shell from even parsing the second half; then the second half contains a binary payload that is processed by the first half. Do we want the standard to permit such files as a valid shell script, or are they non-portable because of the non-text nature of the second half of the file, even though the shell does not reach that part of the file to parse it?
edited on: 2019-05-16 16:41
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.
The shell Input File definition states "The input file shall be a text file, except that line lengths shall be unlimited." This conflicts with the requirements stated here. Some current implementations do not make any check here, others use a simple heuristic to determine if the file may be a script.
Notes to the Editor (not part of this interpretation):
At page 2368 line 75615, replace:
If the executable file is not a text file, the shell may bypass this command execution.
with:
The shell may apply a heuristic check to determine if the file to be executed could be a script and may bypass this command execution if it determines that the file cannot be a script. In this case, it shall write an error message, and shall return an exit status of 126.
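The control flow this wording permits could be sketched roughly as below. It is an illustration under stated assumptions, not how any existing shell is implemented: `exec_with_fallback` is a hypothetical name, the heuristic is supplied by the caller, the 512-byte probe is arbitrary, and `/bin/sh` is an assumed shell location (the standard does not fix one). The `exec_fn` parameter exists only so the logic can be exercised without replacing the process.

```python
import errno
import os
import sys

def exec_with_fallback(path, argv, could_be_script, exec_fn=os.execv, shell="/bin/sh"):
    """Try to exec path; on ENOEXEC run it as a script unless the heuristic objects.

    Returns 126 when execution is bypassed. With the real os.execv a
    successful exec never returns.
    """
    try:
        exec_fn(path, argv)
    except OSError as err:
        if err.errno != errno.ENOEXEC:
            raise  # other failures are not covered by this sketch
    with open(path, "rb") as f:
        head = f.read(512)  # assumption: a short probe is enough for the heuristic
    if not could_be_script(head):
        print(f"{argv[0]}: cannot execute: file cannot be a script", file=sys.stderr)
        return 126
    # Invoke a shell with the pathname as its first operand, then the remaining arguments.
    exec_fn(shell, [shell, path, *argv[1:]])
```

A shell that applies no heuristic at all corresponds to passing a `could_be_script` that always returns true, which the wording equally allows.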
The issue of whether the shell should permit non-text files as input has been split into the separate 0001250
Interpretation proposed: 7 October 2019
Interpretation Approved: 11 Nov 2019
|2019-01-24 19:00||shware_systems||New Issue|
|2019-01-24 19:00||shware_systems||Name||=> Mark Ziegast|
|2019-01-24 19:00||shware_systems||Organization||=> SHware Systems Dev.|
|2019-01-24 19:00||shware_systems||Section||=> XCU 2.9.1|
|2019-01-24 19:00||shware_systems||Page Number||=> 2368|
|2019-01-24 19:00||shware_systems||Line Number||=> 75592|
|2019-01-24 20:15||eblake||Note Added: 0004221|
|2019-01-24 20:17||eblake||Note Edited: 0004221|
|2019-01-24 21:39||shware_systems||Note Added: 0004222|
|2019-01-25 01:35||kre||Note Added: 0004227|
|2019-01-25 01:57||kre||Note Added: 0004228|
|2019-01-25 02:46||eblake||Relationship added||related to 0001161|
|2019-05-02 14:41||geoffclare||Note Added: 0004384|
|2019-05-13 16:00||eblake||Note Added: 0004393|
|2019-05-16 16:40||nick||Note Added: 0004394|
|2019-05-16 16:40||nick||Note Edited: 0004394|
|2019-05-16 16:41||nick||Note Edited: 0004394|
|2019-05-16 16:42||nick||Interp Status||=> Pending|
|2019-05-16 16:42||nick||Final Accepted Text||=> See Note: 0004394|
|2019-05-16 16:42||nick||Status||New => Interpretation Required|
|2019-05-16 16:42||nick||Resolution||Open => Accepted As Marked|
|2019-05-16 16:42||nick||Tag Attached: tc3-2008|
|2019-05-16 19:23||eblake||Relationship added||related to 0001250|
|2019-05-16 19:28||eblake||Note Added: 0004395|
|2019-10-07 15:17||agadmin||Interp Status||Pending => Proposed|
|2019-10-07 15:17||agadmin||Note Added: 0004608|
|2019-11-11 12:20||agadmin||Interp Status||Proposed => Approved|
|2019-11-11 12:20||agadmin||Note Added: 0004652|
|2019-12-04 11:46||geoffclare||Status||Interpretation Required => Applied|
|2020-12-15 15:22||geoffclare||Relationship added||related to 0001435|