Austin Group Defect Tracker

ID: 0000553
Category: [1003.1(2008)/Issue 7] Shell and Utilities
Severity: Editorial
Type: Enhancement Request
Date Submitted: 2012-04-03 10:03
Last Update: 2012-04-10 01:46
Reporter: oiaohm
View Status: public
Assigned To: ajosey
Priority: normal
Resolution: Open
Status: Under Review
Name: Peter Dolding
Organization:
User Reference:
Section: newsection
Page Number: newsection
Line Number: 0
Interp Status: ---
Final Accepted Text:
Summary: 0000553: Add new type of shell and shell feature requirement for new shell in posix.
Description: This is to address issue 545: the shell doing stupid things with filenames containing -, <space>, and other characters that the shell processes.

Jonathan Nieder
"zsh does this. (You may want to use "rm -- *" to cope with filenames
starting with a minus sign.) However, zsh is not a POSIX-style shell
until you run "emulate sh", and at that point it stops doing this.

I believe the best way to accomplish what oiaohm is asking for is to
use tools like zsh rather than sh. Until such tools are ubiquitous,
it doesn't seem like a matter for standardization, though --- anyone
pointing to a standard and thinking their application that makes use
of zsh without documenting that dependency is portable is fooling
themselves. Luckily zsh itself is fairly portable and not too
difficult of a dependency to carry."

Because of what zsh does, filenames containing \n are also not an issue, since the command line does not process them.

It is a matter for standardisation, since new shells are implemented from the standard. The bug is in the standard, so it is being duplicated from the standard into new shells. The current standard leads to shells that users hate, shells that are basically land-mined against new users, so something has to change in the standard to prevent this. Then we can leave user-unfriendly shells behind and, in 10 years' time, look back and ask why we were ever so stupid as to use shells like the ones we have today.

To make the correct handling ubiquitous, the standard has to be fixed.

My last attempt might have been the wrong way, but we have to keep looking at this problem until we find a path that works.

Proper argument handling should be added to the shell in the form of an array that represents the exec arguments as well. C-based exec is fairly bulletproof against errors.

Currently -- is not implemented by every program, and there is no record of which programs do or do not support this feature. A list, one way or the other, in some form could be highly useful to shell implementers to prevent end-user pain from unexpected events.

Items like this would be made 100 percent predictable in sh2: "for i in *; do echo $i;cat $i ; done". In sh2 mode this would always list every file in the current directory, each with its name as a title.

Possibly introduce a cescape function that could be called on $i in the echo to prevent \n and other characters from causing display issues.

cescape would be an add-on to this bug to kill 251 as well. Without it, 251 still exists.

Finally, correct 545 to be a duplicate of this bug, not of 251.

I am still trying to address the same issue here: the standard POSIX shell being unpredictable to a new shell user.

The current standard shell expects a new user to know way too much and to get their input correct all the time. New users are not going to be perfect.

Again, there is a possibility that this might take out most of 251 as a side effect.

This might be received better, since it does not bring filenames containing null and / into existence as 545 did.
Desired Action: A protection so that filenames produced by wildcard expansion are not tokenised unless the shell is directed to, as in zsh.

Emulation of the old behaviour, troublesome for users as it is, to be retained.

Selection of a name for the new version, such as /bin/sh2.

The emulate command becomes part of the requirements for being a POSIX shell.

emulate will support at least three options. "emulate sh" and "emulate sh2" set the mode; "emulate posix" reports which POSIX standard shell is currently being emulated. So currently "emulate posix" would return sh, sh2, or not_standard. We hope we never need an sh3. A shell doing "emulate bash" might report sh when "emulate posix" is called, because that is the closest POSIX emulation.

All new shells, to be POSIX compatible, must be able to perform as sh2 or not be approved as a POSIX shell. Old shells are to form a plan to support sh2.

A shell supporting sh2 that is started with arg0 /bin/sh will run as emulated sh until directed otherwise. A shell supporting sh2 that is started with arg0 /bin/sh2 will run as sh2 until directed otherwise.

When not started as /bin/sh or /bin/sh2, the shell must run at the highest current shell standard.

Processing mode can be switched as required.
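
The arg0 rules above could be sketched roughly as follows. This is only an illustration of the proposal, not an existing feature: `select_mode` is a hypothetical helper, and treating sh2 as the "highest current shell standard" is an assumption drawn from the text.

```shell
# select_mode is hypothetical: it maps an arg0 value to the proposed mode name.
select_mode() {
    case ${1##*/} in          # strip any leading directory, like basename
        sh)  echo sh ;;       # started as /bin/sh: emulate the existing shell
        sh2) echo sh2 ;;      # started as /bin/sh2: the proposed new mode
        *)   echo sh2 ;;      # any other name: assumed highest current standard
    esac
}

for arg0 in /bin/sh /bin/sh2 /usr/local/bin/myshell; do
    printf '%s -> %s\n' "$arg0" "$(select_mode "$arg0")"
done
```

A real shell would also have to decide what to do with login-shell names such as -sh, which this sketch sends to the default branch.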

A standard list location for applications that support or don't support -- (which way round is open to debate), so that the shell can automatically apply -- and prevent rm -i * with a file named -rf in existence from causing a result the user did not expect. Yes, I know this is shell intelligence. Users should not need to know which commands need or do not need --. Man pages are not always installed for the user to look up either.

There is also a need for a blacklist in sh2 mode: an application on it must never be sent filenames from a wildcard that start with -, because it has no protection against malfunction. This can even mean failing to run and informing the user of what they attempted to do, in the hope that the application maker updates it and adds support for -- as per the standard, so that it can be removed from the blacklist. Better to fail outright than to do something the user does not want.

Yes, a user running <rogue_non_conforming_application> -- file needs to be informed that this is not going to work as expected because the program is broken.

Remember that the user cannot just use -- on everything, since if an application does not support -- it still does not do what the user wanted. It might do something stupid like create a file named -- on the file system. The user needs to be informed that this program is bad and not to the standard.
Tags No tags attached.

-  Notes
(0001186)
jrn (reporter)
2012-04-03 13:34

> Jonathan Nieder
> I believe the best way to accomplish what oiaohm is asking for is to
> use tools like zsh rather than sh. Until such tools are ubiquitous,

My bad. It is even simpler to use standard "sh" with

      IFS=''

at the top of your script, as described at [1].

Sorry I overlooked that before.

[1] http://thread.gmane.org/gmane.comp.standards.posix.austin.general/5662/focus=5663
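
A minimal sketch of what the IFS='' suggestion changes, assuming a POSIX sh. The `count_args` helper and the sample filename are made up for illustration. An unquoted $name is split into two words under the default IFS but stays a single word once IFS is empty.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

name='my file.txt'

# Default IFS (space, tab, newline): the unquoted expansion splits at the space.
default_count=$(unset IFS; count_args $name)

# Empty IFS: field splitting is disabled, so the expansion stays one word.
IFS=''
empty_ifs_count=$(count_args $name)

echo "default IFS: $default_count argument(s), empty IFS: $empty_ifs_count argument(s)"
```

Pathname expansion still applies to unquoted expansions with an empty IFS, so a value containing * can still expand; quoting remains the more general fix.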
(0001187)
ranjit (reporter)
2012-04-04 11:12

I understand the reporter's concern; however, I feel that it is not appropriate for POSIX to specify a new shell in advance of any implementation, nor do I feel it is right to change how shells work: ultimately a user has to learn a little bit in order to use a shell effectively, and that is not going to change even with this proposal. Further, the standardisation we have is explicit enough, and simple enough, for any capable developer to implement a POSIX-compatible shell. This proposal does not seem to be either: it seems to be arguing to disable word-splitting after parameter-expansion, which would break an awful lot of uses of sh, e.g. makefiles and most scripts that I've seen.

This proposal (or the issues around it), as 545 before it, would be better discussed on the mailing-list; could I request that you subscribe, if you haven't already, and post to that list, oiaohm? http://www.opengroup.org/austin/lists.html - austin-group-l is the correct list. There were people commenting there on your last proposal, but I did not see any responses from you there.

As it is, this proposal does not have any current implementation to discuss, so does not seem suitable for standardisation, even if there were no other issues.

> Due to what zsh does file-names containing \n are also not a issue since the
> command line is not processing them.

They aren't processed in other shells either. for i in *; do ..; done works fine with every filename, so long as you quote "$i" when you use it. If you are concerned about hyphens at the start when passed to commands, the standard practice is to glob on ./* instead, which gives you similar filenames to `find .`, i.e. ./-rf in your example.

So for your rm command, one would just use: rm ./*
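
A small sketch of the ./* advice, run in a scratch directory. It assumes mktemp -d is available (it is common but not required by POSIX); the file and directory names are made up. If the file named -rf were parsed as options, the directory would be removed too, so its survival shows the name was treated as an operand.

```shell
# Scratch directory so nothing real is touched (mktemp -d is assumed to exist).
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

mkdir keepdir
: > ./-rf          # a file whose name looks like rm options
: > plain.txt

# ./* expands to words that all start with "./", so rm sees only operands.
# Without -r it refuses the directory and reports an error, hence "|| true".
rm ./* 2>/dev/null || true

if [ -d keepdir ]; then dir_survived=yes; else dir_survived=no; fi
if [ -e ./-rf ]; then dash_file_left=yes; else dash_file_left=no; fi
echo "directory survived: $dir_survived, -rf file left behind: $dash_file_left"

cd / && rm -rf "$workdir"
```

With a bare rm * in the same directory, the word -rf would typically reach rm as options rather than as a filename, which is the surprise the original report describes.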

> Items like this made 100 percent predictable in sh2
> "for i in *; do echo $i;cat $i ; done"
> So this will always in sh2 mode list all files in current directory and titled.

This is just bad shell-script. You should hang out in #bash on chat.freenode.net; this kind of script instantly gets hit with !quotes:
<greybot> "USE MORE QUOTES!" They are vital. Also, learn the difference between ' and " and `. See <http://mywiki.wooledge.org/Quotes> and <http://wiki.bash-hackers.org/syntax/words>

So your example would be written:
for i in ./*; do echo "* File: ${i#./}:"; cat "$i"; echo; done
We'd use an extra echo to handle files that did not end in a newline. Colons, and a hint to the user, around the filename, make it clear what the filename is, though echo "$i" would of course work, and again would not have any issues with eg -nfoo as a filename, since "$i" would actually be "./-nfoo"
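
The loop above can be tried against awkward names in a scratch directory. This is a sketch under assumptions: mktemp -d is available, the two filenames are invented, and printf is swapped in for echo when printing the header to avoid differences between echo implementations.

```shell
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf 'first file\n' > 'two words.txt'   # space in the name
printf 'no newline' > ./-nfoo             # leading hyphen, no trailing newline

listing=$(
    for i in ./*; do
        printf '* File: %s:\n' "${i#./}"  # strip the ./ prefix for display only
        cat "$i"                          # "$i" is still ./name, so never an option
        echo                              # separator, also covers a missing newline
    done
)
printf '%s\n' "$listing"

cd / && rm -rf "$workdir"
```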

Ultimately you cannot prevent odd characters in filenames: you can however make your scripts robust at dealing with any character.

Trying to stop word-splitting in general is a bad idea; think of say $CFLAGS.

> Proper argument handling added to shell in the form of array to represent exec
> as well. C based exec is fairly bullet proof from errors.

Again, this displays ignorance of basic shell-scripting, afaict. Parameters to a shell-script are exactly the same as strings passed via exec* (and of course a shell script can be called via exec*, and usually is.) Again, all that needs to be done is to quote expansions, e.g. "$1" or "$@", to properly deal with all characters. for i; do.. will process all parameters fine, in the same way that a glob expansion does not lead to filenames being split.

So, your "array to represent exec" (which I take to mean "array to represent argv") is already there: "$@" (and ofc $# for argc.)
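
A short sketch of "$@" acting as the argv list. `show_args` and `forward` are hypothetical helpers written for this example; the point is that each argument arrives intact, whatever characters it contains.

```shell
# show_args is a hypothetical helper: it prints argc and each argument in brackets.
show_args() {
    echo "argc=$#"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# forward is a hypothetical wrapper: "$@" hands on every argument unchanged.
forward() { show_args "$@"; }

quoted=$(forward 'a b' '-rf' '*')
printf '%s\n' "$quoted"
```

The argument with a space, the one that looks like an option, and the literal * all pass through as single, unmodified words.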

> Currently -- is not implemented on every program there is no record of what
> program do or do not support this feature. A list one way or the other done in
> some form could be highly useful to shell implementers to prevent end user
> pain from unexpected events.

Certainly I think it would be useful if it were mandatory for all utilities specified in XCU to support --. I believe at the moment it's a recommendation for applications, but it could be required for conforming implementations; I'm not sure.

Anyway, I hope to hear from you on the mailing-list, if that's okay. That's where proposals should be discussed if they are embryonic, imo. ATM you've raised concerns about the shell interface, but you need to discuss the issues with others more before making concrete proposals for specification changes to address them, I think.
(0001191)
ajosey (manager)
2012-04-07 05:22
edited on: 2012-04-10 10:00

This item is being left open for now since it falls into the category of a New Work Item. The submitter is invited to make a full proposal as per the criteria for new work items.

The requirements for New Work Items are documented in

Document Number: AUSTIN/112r3

Title: Committee Maintenance Procedures for the Approved Standard

http://www.opengroup.org/austin/docs/austin_112r3.txt

The recommended criteria for development of new interfaces to enable
them to be considered for inclusion in a future revision are as follows:

1. There must be a written specification that has undergone a formal
consensus based approval process and is suitable for inclusion.

Parties interested in submitting new work items through one of the
three organizations within the Austin Group (The Open Group, IEEE, ISO/IEC)
should contact the appropriate Organizational Representative for further
information and advice on how each organization handles new work items.
Submissions from other organizations will also be considered.
Items 2 through 4 below apply to all submissions regardless of
origin.

2. There must be an implementation, preferably a reference implementation.

3. The specification must be "sponsored" by one of three organizations
(The Open Group, IEEE, ISO/IEC) within the Austin Group, i.e. they would
support and champion its inclusion.

4. Submitters must provide an outline plan of the editing instructions to
merge the document with the Austin Group specifications, and assistance
to the Austin Group editors as required to complete the merger.
For an example, see

https://collaboration.opengroup.org/platform/single_unix_specification/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=434

or
https://collaboration.opengroup.org/sophocles/mailarch.php?soph=Y&action=show&archive=austin-group-l&num=434



Andrew Josey, Austin Group Chair.

(0001192)
oiaohm (reporter)
2012-04-10 01:46

ranjit, this comes down to what a new person expects to happen when they read the following line.

"This is just bad shell-script." That really does not change the fact: the user expects X and the system does Y.

"So for your rm command, one would just use: rm ./*" A new user is not going to do this. The same goes for the idea of "$i".

The result is that you are saying you have to type more characters to get exactly what you are expecting.

"ultimately a user has to learn a little bit in order to use a shell effectively, and that is not going to change even with this proposal."

It is true that a user has to learn a little bit in order to use a shell. But why does the user have to learn that rm * does not work, so that they have to use rm ./* for it to do as expected? rm -- * works as well, by the way. So now you have a confused new learner. Remember, too, that the application you are dealing with might not be rm and so might not like ./. I just picked rm as an example of a common application.

Multiple ways of writing the same thing don't help learners.

Basically, if a new user has learnt that rm is remove and * is a wildcard, that should be enough for the application to do what is expected from the new user's point of view.

The current design of the shell increases the knowledge required to use it, so fewer users use it.

Why? Because they are being told two different ways to get the job done, to counter the fact that the direct way does not work as expected.

I have to subscribe to the mailing list at some point.

ranjit
"Trying to stop word-splitting in general is a bad idea; think of say $CFLAGS."

This is something I wish to make very clear. $CFLAGS can go badly wrong, again due to splitting. What if someone has set IFS to , and $CFLAGS contains a ,? Oops. Now everything goes badly wrong.
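
The scenario just described can be reproduced in a few lines. This is a sketch with an invented CFLAGS value and a hypothetical `count_args` helper; each count is taken in a subshell so the IFS change does not leak.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints the argument count

CFLAGS='-O2 -Wl,-z,relro'     # made-up value that happens to contain commas

default_count=$(unset IFS; count_args $CFLAGS)   # splits at the space
comma_count=$(IFS=,; count_args $CFLAGS)         # splits at each comma instead

echo "default IFS: $default_count words, IFS=',': $comma_count words"
```

Under the default IFS the value yields the two intended flags; with IFS set to a comma it yields three words, none of which is the flag that was meant.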

Really, what can go wrong with $CFLAGS is more reason to kill off the splitting of strings unless a particular operation is explicitly called on them: a split function of some form, called on the string. That makes the code very clear that at this point it is no longer one value; it is going to be many.

Should $CFLAGS really be a string, or should it always have been an array? Because of the really horrid abuse of strings and the lack of use of arrays, I cannot kill the old /bin/sh outright.

The arguments that go into programs have to be converted to an array; this is where the splitting comes in. Anything where you don't control the split point is asking for something to be tokenised wrongly, particularly when you remember that in sh you can change the token character.

A string, in my mind, should not split into many arguments. It is one piece, and the user should have the right to think of it as one piece. If the thing is an array, that is many pieces, and it should pass as many arguments.

This is fully about predictability. If you create something as a single piece, it should not by magic turn into many just because you missed that variable X, which you included to make the string, had a space in it.

This magic split is why you read through a shell script, it appears to be fine, and then it does something not expected: because you missed a -- somewhere, or a ./ somewhere, or any of the many other things that can lead to a token character being in your string.

It's not as if declaring an array is hard: array=( zero one two three four five ). This way each argument can contain spaces, and it would be very clear when reading what is an argument and what is not.
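
A sketch of the string-versus-list difference. The array=( ... ) syntax is a ksh/bash/zsh extension rather than POSIX sh, so this example assumes only a POSIX shell and uses the positional parameters as the list. The flag values and the `count_args` helper are invented for illustration.

```shell
count_args() { echo "$#"; }   # hypothetical helper: prints the argument count

# One string: the space inside the -D value becomes a split point.
flags='-O2 -DGREETING=hello world -Wall'
string_count=$(unset IFS; count_args $flags)

# A list: each element is one argument, spaces included.
set -- -O2 '-DGREETING=hello world' -Wall
list_count=$(count_args "$@")

echo "string: $string_count arguments, list: $list_count arguments"
```

The string form produces four words where three flags were meant, while the list form passes exactly three.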

When you look at the present-day shell, we do a lot of stupid things. sh supports arrays, yet we use PATH variables that are tokenised by :, while on the command line we tokenise by space.

Really, what the sh shell and applications are doing is inconsistent and a possible source of trouble in the use of environment strings.

Null-terminated arrays are the most dependable structure we have.

Most of the arguments for allowing $strings to be split automatically are, when you look closer, really arguments for forbidding it, since there are too many ways for it to go south.

ranjit, to me the idea of auto-splitting is really playing dice with the command line and crossing your fingers that something with a token gets spotted.

IFS is something else that sh added that is basically highly dangerous: it can end up not being reset, causing scripts to go and do nutty things.

Fixing this up cures a lot of problems. Filenames with strange characters in them become less of a problem. Making users tell the shell directly where they want things tokenised, so the shell cannot guess wrong, is a major step forward.

Some of these evils are why System V init scripts were hated. You made a mistake somewhere and the error could show up many scripts later.

It's not as if the issues are new.

ranjit, other than legacy support I really don't see any good long-term reason to support automatic tokenising of $strings. Tokenising strings made sense when shells did not have array structures as an option. Even so, converting a string to an array with a token command of some form would still be valid, but this should be a command the user has to add. Computers trying to think for humans create confused humans and bad outcomes.

Simply put, with the features in today's shells, auto-splitting strings makes no sense when you have the option to use arrays and directly steer the beast to give you exactly what you want.

I would characterise most of the calls to keep auto-splitting as coming from people with bad programming practices who wish new shell users to have to learn these evils to get what they want out of the shell. We should not keep this in the standard if there is a way to remove it completely.

The mailing list will take some time to set up.

So, Andrew Josey, the idea is not going to be killed from the get-go and has a better chance?

- Issue History
Date Modified Username Field Change
2012-04-03 10:03 oiaohm New Issue
2012-04-03 10:03 oiaohm Status New => Under Review
2012-04-03 10:03 oiaohm Assigned To => ajosey
2012-04-03 10:03 oiaohm Name => Peter Dolding
2012-04-03 10:03 oiaohm Section => newsection
2012-04-03 10:03 oiaohm Page Number => newsection
2012-04-03 10:03 oiaohm Line Number => 0
2012-04-03 13:34 jrn Note Added: 0001186
2012-04-04 11:12 ranjit Note Added: 0001187
2012-04-07 05:22 ajosey Note Added: 0001191
2012-04-10 01:46 oiaohm Note Added: 0001192
2012-04-10 10:00 ajosey Note Edited: 0001191

