View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000767 | 1003.1(2008)/Issue 7 | Shell and Utilities | public | 2013-10-11 02:02 | 2022-08-08 15:15 |
Reporter | dwheeler | Assigned To | ajosey | ||
Priority | normal | Severity | Objection | Type | Enhancement Request |
Status | Closed | Resolution | Rejected | ||
Name | David A. Wheeler | ||||
Organization | |||||
User Reference | |||||
Section | XCU 2.14 (local) | ||||
Page Number | 2374 | ||||
Line Number | 75650 | ||||
Interp Status | --- | ||||
Final Accepted Text | |||||
Summary | 0000767: Add built-in "local" | ||||
Description | POSIX sh supports functions, but not the definition of local variables. This creates namespace pollution and is an especially serious problem for recursively-defined functions. Supporting local variables makes it easier to avoid errors, because a locally-defined variable will not quietly change a variable value used elsewhere with the same name. Many sh implementations already support local variables using "local", including dash and bash. ksh also supports local variables, but using "typeset" instead (http://www.manpagez.com/man/1/ksh/). Having a standard mechanism for creating local variables makes it possible to portably use them. Web pages that discuss "local" include https://wiki.ubuntu.com/DashAsBinSh#local and http://mywiki.wooledge.org/Bashism. The proposed definition is a subset of what both dash and bash support currently. dash only allows one variable in a local definition; it permits assignment though it doesn't document that clearly. bash allows more than one variable declaration (each separated by spaces). | ||||
Desired Action | Add the following as a new entry in "Special Built-in Utilities" between "export" and "readonly": NAME local — create, and possibly set, a local variable SYNOPSIS local name[=word] DESCRIPTION The variable whose name is specified shall be created as a local variable with name "name". It shall inherit its initial value, as well as the exported and readonly flags, from the variable with the same name in the surrounding dynamic scope if there is one; otherwise, the variable is initially unset. Then, if "=word" is provided, the value of that local variable shall then be set to word. From then on, any reference to that variable name will use this local variable until the enclosing function returns. Note that the shell uses dynamic scoping, not static or lexical scoping. Implementations MAY support defining more than one local variable name, separated by whitespace. When no arguments are given the results are unspecified. It is an error to use local when not within a function. OPTIONS See the DESCRIPTION. OPERANDS See the DESCRIPTION. STDIN Not used. INPUT FILES None. ENVIRONMENT VARIABLES None. ASYNCHRONOUS EVENTS Default. STDOUT See the DESCRIPTION. | ||||
Tags | No tags attached. |
related to | 0000771 | Closed | 1003.1(2013)/Issue7+TC1 | Expose alternate shell function usage to scripts | |
related to | 0000465 | Closed | ajosey | 1003.1(2008)/Issue 7 | is the list of special built-ins exhaustive (is "local" special)? |
related to | 0001025 | Closed | 1003.1(2013)/Issue7+TC1 | set description contains counterproductive claims | |
related to | 0001065 | Closed | 1003.1(2013)/Issue7+TC1 | Clarification request on invocations |
|
I've verified that mksh (the MirBSD Korn shell) also includes "local", and it also implements dynamic scoping (this is in addition to bash and dash). AT&T ksh93 doesn't have a "local" built-in at all, according to: http://web.archive.org/web/20130522130727/http://www2.research.att.com/sw/download/man/man1/ksh.html I realize that ksh93's "typeset" provides static scoping, but that is a different built-in name anyway. An implementation would always be free to *also* provide static scoping through nonstandard extensions. |
|
Yes, *please* standardise 'local'. busybox (ash) also provides a dynamic local (local is dynamic on Linux, since it is required for a sh to be permitted as /bin/sh on debian, and ofc everyone stays compatible with bash): $ foo() { local i="$1"; echo "$i"; bar; } $ bar() { echo "$i"; } $ i=baz; foo quux quux quux Are we okay with forcing an existing implementation which has long provided static scoping, to add dynamic scope? Personally I like dynamic scoping; we use it to its fullest in bash scripting, and I'd *much* rather have the same everywhere. Strong resistance was, however, expressed to the idea, so I'd like clarification. It certainly makes a lot of sense to have only one type from the scripter's point-of-view. In any event, we should ensure that multiple variables/definitions *are* allowed (with the caveat that: local v="$p" may require quoting, the same as with other special builtins: it does not in busybox nor bash, but I believe it does elsewhere, since the scripts in openrc, which works on freebsd and netbsd, as well as linux, all use that form.) I'm sure the dash people would be happy to comply: there's no real sense in having that restriction; it just makes scripts unnecessarily long-winded. It's certainly a lot less of an ask, than adding stacked variable definitions to a shell which does not already provide them. |
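The busybox transcript in the note above can be laid out as a small script. This is only a sketch, and it assumes a shell that provides a dynamically scoped "local" (bash, dash, busybox ash and mksh are the ones named in this thread); "local" is not specified by POSIX, which is what this request is about.

```shell
# foo declares a local i; bar has no i of its own.
foo() { local i="$1"; echo "$i"; bar; }
bar() { echo "$i"; }

i=baz
foo quux     # with dynamic scoping, bar sees foo's local: "quux" is printed twice
echo "$i"    # the caller's i is untouched by the local: prints "baz"
```

With static scoping, bar would instead see the i of the top-level script, so the second line printed would be "baz" rather than "quux".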
|
It's my understanding that new regular variable definitions default to read/write and not exported as attributes. I would expect a new local definition to behave similarly, so the function determines those attributes, not inherit them as part of initialization. This part of the proposal puzzles me, in other words, as it requires functions to determine what was inherited before an attempt at use in a conflicting way. I would not expect a function local variable to be exported, even with 'set -a' in effect, to save the hassle of pushing and popping values from the envp[] block when the function returns and the local name is undefined. The function uses an explicit 'export' after the local is defined, ok, that's saying it expects the performance penalty. If anything, I'd expect a readonly in the main name space to force a writable non-export only attribute as default. Once initialized, I could see readonly being used on a local as a debugging aid while testing the function, but not as something needing to be inherited to prevent overwriting a non-local definition. So, I can't see having this as a default requirement as promoting portability; just as adding unneeded complexity. As to syntax, I propose as an alternate, to add functionality and address the above concern: local name = ...; to initialize a specific variable with the text until an unescaped <command_sep>, after substitutions when called, so multi-word assignments can be handled without quoting being required, similar to the eval built-in. local name = $name; would be used to explicitly show an initialization from the main name space was desired, without attribute inheritance. local [-e] = name; or local [-e] = name1[ name2]... ; might be a shortcut for this, with the leading = indicating all words are name definitions that get initialized. With these forms a warning could be output to standard error as a debugging aid if $name references an unset variable. 
The explicit form would silently default to "" if not found as set. Switch -e explicitly enables export inheritance; readonly inheritance makes the definition superfluous as, when set, the behavior is required to be non-export or only use the main value in utility calls if export was set before readonly applied. Allowing functions to override this export behavior would be a security hole, as I see it. local name; local name1 name2 name3; would reserve names as if local name = ""; local name1 = ""; local name2 = ""; local name3 = ""; were executed, respectively, not force this syntax to check whether a prior definition exists as initialization. As with the first format, these would default to non-export and read write as attributes. A readonly attribute on a main name would require a writable local to be non-export only, whether the function sets that local as readonly during execution or not. Next is copied from mailing list, as related to logical name spaces the standard already requires: How ash, bash, dash, ksh, etc. actually implement the support for them or what extensions are currently provided, and how, is not particularly relevant, nor is it intended to favor a particular implementation over another. This does get into possible changes required by other utilities, though, which the proposal here doesn't so much. I could see that a 'local' keyword would define variables in the same environment as the positional parameters, which would be searched for variable names before the main namespace. This would be an extension to the logical model I think all shells could support. It would be on the function to import a value from the main name space with this name to another local or as an initializer, and this interpretation would hold until the function return automatically undefines it so the main definition is visible still to the rest of the script (Ed. ", or unset is used"). 
Before a 'local name =...;' statement was encountered the function would still reference the main name space, and after an explicit 'unset name' undefines a local. Unset wouldn't even need a separate switch, it would just use the modified search order to determine which var gets undone. Addition, as clarification: With recursive usage, a recursive caller would be expected to pass any locals as arguments to the callee, and not expect to inherit any previous local definitions. An unset in a recursed call would expose the value set in the script's static namespace still, not a local defined in the caller. If the function applies the readonly attribute to a local name, unset should still undefine this so as not to block this 'unhiding' behavior. This keeps the call model consistent that only positional parameters initialize the function specific environment and local variables get initialized from one place, the script name space. I'd expect 'set arguments' to keep named locals, and only reset the positional parameters for that function call, when used in a function. Adding the capability to inherit locals on calls I'd consider an extension to that call model with limited utility for the complexity increase, be it as a static reference or just a dynamic value copy. With the behavior above changes to an implementations handling of a function call may be unnecessary; adding inheritance would require changes. Changes to how function return and unset are implemented are more likely, but these should be minimal. New possibility: If the sh executable is required to make a copy of the argv[] array passed in when a script starts, to isolate allocs and reallocs, this could be used as a top level environment so that the same local semantics would apply in the script outside of functions. A function that hasn't defined its own local of that name would reference that script level local definition as a dynamic override, around specific function calls, of a non-exported variable. 
For variables already marked export I think the export utility would need an extension to indicate a call to it inside a function with an override as argument would conflict with a previous or inherited (because it was initialized from the envp[] array) export flag. |
|
The problem with "local" is that no shell that implements it (including ksh88 with a built-in alias to "typeset") seems to implement a behavior that could be seen as natural or orthogonal. The following is relevant for implementing "local": - the scope in this shell and () subshells - the behavior with "unset" - the behavior with regard to variable attributes such as readonly or export - the scope with respect to called scripts Everything except the first item seems to vary from shell to shell. |
|
Requiring the variables to inherit the value of their parents will break existing scripts in a potentially security-relevant way. $ bash -c 'x() { local y; echo ${y:-z}; }; y=1; x' z $ dash -c 'x() { local y; echo ${y:-z}; }; y=1; x' 1 $ ksh93 -c 'x() { typeset y; echo ${y:-z}; }; y=1; x' 1 $ mksh -c 'x() { local y; echo ${y:-z}; }; y=1; x' z $ posh -c 'x() { local y; echo ${y:-z}; }; y=1; x' z Granted, this is easily circumvented (“local y=”), but there are existing scripts that make use of “local x” to initialise x to empty. |
|
Re: 0000767:0003699 > Requiring the variables to inherit the value of their parents will break > existing scripts in a potentially security-relevant way. Agreed. It's even more a concern for shells that can add type/attributes to variables (granted that falls out of POSIX which doesn't specify those) > > $ bash -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > $ dash -c 'x() { local y; echo ${y:-z}; }; y=1; x' > 1 dash doesn't have types/attributes though > $ ksh93 -c 'x() { typeset y; echo ${y:-z}; }; y=1; x' > 1 In ksh93, you need the function x { ...; } syntax to have local (static) scoping. $ ksh93 -c 'function x { typeset y; echo ${y:-z}; }; y=1; x' z With ksh88: $ ksh -c 'x() { typeset y; echo ${y:-z}; }; y=1; x' z > $ mksh -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > $ posh -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > Note that zsh assigns a default value to the variable (empty for scalar (resulting in 0 for integer/float scalars), empty list for arrays and hashes) > Granted, this is easily circumvented (“local y=”), but there are existing > scripts that make use of “local x” to initialise x to empty. Even then if "y" had been declared "integer" for instance in an outer scope and preserved upon "local", that would still be a problem. With regards to the interactions with "unset", see also https://www.mail-archive.com/bug-bash@gnu.org/msg19445.html (and the rest of the discussion) for (what I consider) a bug in the way mksh, yash and bash handle "unset" for locally scoped variables. |
|
Re: 0000767:0003699 I would call a script broken or at least non-portable if it expects that "local x" results in an empty initialized variable. It seems to be the "natural" expected behavior for "local" to just create a local variant. In any case, calling local x; x="" gives you a state that is the same on all implementations that already support "local". If you add the current POSIX Bourne Shell: bosh -c 'x() { local y; echo ${y:-z}; }; y=1; x' 1 you already have a 50:50 ratio of what shell authors believe is useful. If we assume that the "local" implementations in mksh and posh (both descendants of pdksh) are the same, we already have a 3:2 ratio for the decision to let "local x" just create a local variant of an existing "x". |
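The "declare, then assign" idiom mentioned in this note can be sketched as follows. It assumes a shell that has "local" at all; the function and variable names are the ones from the one-liners earlier in the thread.

```shell
x() {
    local y
    y=               # explicit assignment, so it no longer matters whether
                     # "local y" inherited the outer value or started unset
    echo "${y:-z}"
}

y=1
x            # prints "z" in bash and dash, which disagree on a bare "local y"
echo "$y"    # the outer y is still "1"
```

The design choice is simply not to rely on the initial state of a freshly declared local, since that is the part on which the implementations diverge.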
|
Re: 0000767:0003702 Not sure what you mean. AFAICT, among dash, bash, yash, mksh, posh, zsh, ksh88, ksh93, FreeBSD sh, only dash and FreeBSD sh (which share some ancestry and don't have types) preserve the value of the outer scope upon calling "local". A "local" that preserves the value from the outer scope is somewhat consistent with "export" or "readonly", so it makes sense in that regard. On the other hand, you're generally using "local" when you want to have a variable local to your function which you can mingle with freely without affecting the namespace of your caller at least. Wanting it to keep the value of the corresponding variable in the parent scope is not the common use case in my experience (and you can always do local var="$var" if that's what you want). I'll agree with you though that I wouldn't want to make assumptions on the initial value given to a just declared local variable. I would find it reasonable if POSIX left that unspecified. Now, a problem is that at the moment, it's hard to have a local variable in an initial "unset" state (for instance for $IFS to have default splitting behaviour in a local scope) portably. "local var; unset var" works in ksh88, ksh93 (using typeset in those and the "function f {" syntax), dash, FreeBSD sh, bash, zsh, posh but not in mksh nor yash (whose "unset" pops the variable off the stack (makes it no longer local)). "local var" alone works in ksh88, ksh93, bash, posh, mksh, yash, but not zsh (assigns an empty value) nor dash/FreeBSD (as they keep the outer scope value). 
So one needs something like: local var; [ "${var+set}" ] && unset var Also note the variations in: $ mksh -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' unset 1 $ bash -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 1 0 $ zsh -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 0 $ dash -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 1 1 $ yash -c 'f() { typeset a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' unset 1 $ ksh93 -c 'function f { typeset a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 1 0 $ posh -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 1 0 |
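The idiom given at the start of this paragraph for getting a local variable into an unset state can be tried with a sketch like the one below. It assumes a shell with "local"; the claim about which shells need which branch comes from the note itself and was not re-verified here beyond bash and dash.

```shell
f() {
    local var
    # Call unset only if "local" left the variable set (inherited or empty value).
    # Where "local" already yields an unset variable, unset is skipped, which
    # avoids the "pop" behaviour the note attributes to mksh and yash.
    [ "${var+set}" ] && unset var
    echo "${var-unset}"
}

var=foo
f            # prints "unset" in bash and dash
echo "$var"  # the caller's value is back after f returns: "foo"
```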
|
Re: 3702 Treating locals like autos in C, these are garbage or unset until initialized by the function. C doesn't require zeroinit, after all. Expecting this makes current scripts non-portable, but so is the use to begin with. Any explicit default value (inherited if set, zero length string, a zero numeric value) is going to conflict with what someone figures is the "proper default", so broadest usability comes from leaving it unset, imo.... If you need to pass values from the global scope, that is already handled by positional parameters to the function, as in "local var=$1;". I'd expect something like "local var=$var;" to be a zero length string, to be consistent with "unset var; var=$var;" in the global scope, but that's my preferred proper default. |
|
Re: 3704 "I'd expect something like "local var=$var;" to be a zero length string" Sorry, but that makes no sense at all, the arguments to any command are expanded before the command is executed, so if we have var=foo .... local var=$var what the local command sees is "local var=foo" - it has no idea that the value "foo" came from $var previously, and nor would anyone want it to. It is nothing even vaguely similar to "unset var; var=$var;" And for this it makes no difference at all what your preferred model on how local should work is (unless you wanted to make it a syntax element of function definitions, rather than something akin to export and readonly, I suppose, and executable at any point of the function.) My model for shell variables has always been that they are all global, all contains character strings as values (if they have a value), and all possible variables always exist (none are ever "created"), they can be set or unset, exported or not exported, readonly or not readonly, but they always all exist. What the "local" command does is not (in my model anyway) creating some new variable, but saving the value and attributes of the existing variable, in such a way that the shell automatically restores them when the function within which local is executed returns. With this model, everything is simple and consistent, there's no need to modify any of the other variable updating/using/lookup/... aspects of the shell to make local work, just that command, and the function return sequence (which already has to restore the $@ and $# so a few more like those is not all that difficult.) In particular, weirdness like "unset var" causing the previous value of var to appear just doesn't happen, an unset variable is just unset, at least until the function returns when its previous value/attributes reappears. 
This also means that if a readonly variable is made local, it remains readonly (avoiding the possibility that a function can make a new local, read-write var, change it, and then because of dynamic scoping, other called functions use a value different from what they had been expecting.) On the other hand, if a local variable is made readonly, when the function returns, the readonly attribute vanishes, and the variable is returned to its previous state. I will admit I had never even considered the possibility that new local variables would start unset, I do stuff like "local IFS" in a lot of my functions, not always because I expect to alter IFS, but because one day I might, and I don't want to accidentally have that affect the world outside the function. But until that happens, I expect IFS to just be unchanged. Of course, I mostly program ash derived shells, where that is the way it is implemented. |
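The "save the value, restore it on return" model described in this note can be imitated by hand in plain POSIX sh, without "local". The following is a rough sketch of that idea only, not how any shell implements "local"; count_fields, saved_IFS, had_IFS and nfields are hypothetical names, and they are themselves ordinary global variables.

```shell
count_fields() {
    # Save: remember whether IFS was set, and its value if so.
    if [ "${IFS+set}" ]; then saved_IFS=$IFS; had_IFS=1; else had_IFS=0; fi

    IFS=:
    set -f           # no pathname expansion while splitting
    set -- $1        # split the first argument on ":"
    set +f
    nfields=$#

    # Restore: put IFS back into exactly the state it had on entry.
    if [ "$had_IFS" = 1 ]; then IFS=$saved_IFS; else unset IFS; fi
    echo "$nfields"
}

count_fields a:b:c    # prints 3
```

Until the function changes IFS, the value seen inside it is the caller's value, which matches the expectation stated in the note for "local IFS" in ash-derived shells.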
|
"local" would not be a regular command... it's a keyword that establishes a context for the identifier like '{' and '}', or '(' and ')', or possibly a special built-in. That's how I'm thinking of it, so any arguments get evaluated only in the local context of the function, not inherit stuff like with establishing a subshell using '(' ')'. I expect var to be established as a new variable when the '=' is recognized, so $var on the right hand side of the '=' references it, not var in the global scope. For changing a global value locally I'd expect a function to use something like "local tmpIFS=$IFS;", because then the shell would see there is no local IFS so looks in the global var space. This is consistent with if a script is given 9 arguments, calling a function with only one argument does not keep arguments $2 to $9 visible inside the function. These get unset and $# evaluates to 1. If someone is developing a script library, use of local variables shouldn't have to be concerned that a calling context has set the value to a particular variety of string or not. This is how zsh appears to implement its version of local, anyways, based on the examples given. |
|
Re: 0000767:0003704 [...] > If you need to pass values from the global scope, that is > already handled by positional parameters to the function, as > in "local var=$1;". [...] In addition to what Robert said, I'd add that so far the consensus seems to be that "local" would implement dynamic scoping (arguably ksh93-style static scoping may be better, but all shells that have a "local" implement dynamic scoping, and dynamic scoping is consistent with the kind of scoping you get with subshells, and interacts better with the environment (export)), and we can consider adding static scoping using a different syntax (like zsh's "private") later if need be. In that context, a "global scope" makes little sense. local var="$var" would create a local $var that is copy of what was assigned to $var before, that is the $var of the function's caller's scope (which would not necessarily be the "global scope"). A global scope makes sense in ksh93 that implements static scoping. bash does have some implementation of a "global scope", where typeset works differently (doesn't reset the value for instance) and can be accessed in functions with "typeset -g", but that's not very useful and is cause of confusion in practice (like code that behaves differently when sourced from a function or from the top-level). mksh, yash and zsh also have a "typeset -g" but it's to be able to set the type without limiting the scope (which is generally what you want to use that for), so the variable stays in the same scope. In bash, you'd get the variable of the top-level scope, not the variable of the caller. In ksh93 (again with static scoping), you can use namerefs (typeset -n) to get access to a variable in the scope of the caller. In mksh/zsh, you can define a function that changes the type of a variable like: integer() typeset -gi "$1" f() { local var integer var ... } In ksh93: function integer { typeset -ni v="$1"; } # or integer() typeset -i "$1" function f { typeset var integer var ... 
} You can't do that with bash AFAIK (you can set the type of a local variable, you can set the type of variable in the global scope but you cannot set the type of variable in your caller's scope unless that's the global scope) All that to say that "global scope" makes little sense with dynamic scoping other than to give a name to the top-level scope. |
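The point that local var="$var" copies the value from the caller's scope, which need not be the top-level scope, can be illustrated with a short sketch. It assumes a shell with a dynamically scoped "local"; caller and callee are hypothetical function names.

```shell
callee() {
    local var="$var"        # "$var" is expanded before local runs: caller's value
    var="$var+changed"      # modifies only callee's copy
    echo "$var"
}

caller() {
    local var=from_caller
    callee                  # prints "from_caller+changed"
    echo "$var"             # caller's own local is unchanged: "from_caller"
}

var=global
caller
echo "$var"                 # prints "global"
```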
|
Re: 0000767:0003703 You discovered another problem: ksh93's unset in a function using typeset var behaves differently from the orthogonal expectation. ksh93 seems to de-define the variable instead of getting the local variable into an unset state. As a result, ksh93 accesses the global variable from inside the function in case an assignment is done to an unset var. BTW: why do you believe: local var; [ "${var+set}" ] && unset var is needed? unset should not complain for non-existing vars. |
|
Re: 0000767:0003706 "local" is a normal builtin command like "unset", "export" or "readonly". In bosh, it creates a copy of the current variable value and then pushes it on some kind of variable stack. If you leave the function where local has been called, the pushed instance is popped. "unset" is a builtin command that removes the value from a variable and causes the variable pointer to become the value NULL. unset nonexisting first creates a new variable "nonexisting" and then applies the "unset" operation on that variable. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3699 > Requiring the variables to inherit the value of their parents will break > existing scripts in a potentially security-relevant way. > > $ bash -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > $ dash -c 'x() { local y; echo ${y:-z}; }; y=1; x' > 1 > $ ksh93 -c 'x() { typeset y; echo ${y:-z}; }; y=1; x' > 1 > $ mksh -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > $ posh -c 'x() { local y; echo ${y:-z}; }; y=1; x' > z > > Granted, this is easily circumvented (“local y=”), but there are > existing scripts that make use of “local x” to initialise x to empty. The argument for the bash/mksh/posh behavior goes like this: In all current implementations, `local' is just a builtin command. Some implementations make it a `declaration command', but it's still just like export or readonly in that it gives a variable an attribute. Giving a variable an attribute doesn't give it a value, so a local variable should remain unset until it gets one explicitly assigned. |
|
Re: 0000767:0003710 bosh uses exactly your interpretation: "local var" just gives an attribute to $var. It does not change its value. So after calling "local var", you still see the same value as before, but what you see is a pushed instance. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3702 > I would call a script broken if it expects that "local x" results > in a empty initialized variable. > > It seems to be the "natural" expected behavior for "local" is > to just create a local variant. > > In any case, calling > > local x > x="" > > gives you a state that is the same on all implementations that > already support "local". > > If you add the current POSIX Bourne Shell: > > bosh -c 'x() { local y; echo ${y:-z}; }; y=1; x' > 1 > > you already have a 50:50 ratio of what shell authors believe is useful. > > If we assume that the "local" implementation in mksh and posh (both > descendants of pdksh) are the the same, we already habe a 3:2 ratio > for the decision to let "local x" just create a local variant of an > existing "x". Well, that's the rub. The question of what a `local variant' means is precisely what we're trying to decide. It seems to be consensus that a local variable is an object that `shadows' an instance of a variable with the same name at a previous scope (leaving the discussion of static or dynamic scope aside). The other questions are under debate. Why should an object that shadows another inherit its value? |
|
Re: 0000767:0003712 See also 0000767:0003711 "local" just installs the attribute "pushed" to a variable. Why should this operation change the value of an existing variable? |
|
Re: http://austingroupbugs.net/view.php?id=767#c3704 > I'd expect something like "local var=$var;" to be a zero length > string, to be consistent with "unset var; var=$var;" in the global scope, > but that's my preferred proper default. Only if this were internally implemented as something like local var; var="$var". |
|
Re: http://austingroupbugs.net/view.php?id=767#c3705 > I will admit I had never even considered the possibility that new local > variables would start unset, I do stuff like "local IFS" in a lot of my > functions, not always because I expect to alter IFS, but because one day > I might, and I don't want to accidentally have that affect the world > outside > the function. But until that happens, I expect IFS to just be unchanged. So your model is rather than `local x' creating a new instance of x, which remains unset until explicitly assigned a value, `local x' is equivalent to `local x="$x"'. |
|
Re: 0000767:0003708 > You discovered another problem: ksh93's unset in a function > using typeset var behaves differently from the orthogonal expectation. > > ksh93 seems to de-define the variable instead of getting the > local variable into an unset state. > > As a result, ksh93 accesses the global variable from inside the > function in case an assignment is done to an unset var. I suspect you tested it using the Bourne function definition syntax ("foo()" instead of "function foo {"). As already said earlier in 0000767:0003701, in ksh93, you need to use: function foo { typeset var } to get local (static) scoping. There is no local scoping with the Bourne syntax. $ ksh93 -c 'f() { typeset var; var=bar; }; var=foo; f; echo "$var"' bar $ ksh93 -c 'function f { typeset var; var=bar; }; var=foo; f; echo "$var"' foo > BTW: why do you believe: > > local var; [ "${var+set}" ] && unset var > > is needed? unset should not complain for non-existing vars. As mentioned in 0000767:0003701, see https://www.mail-archive.com/bug-bash@gnu.org/msg19445.html In mksh and yash, unset doesn't unset, it pops the variable from the stack. After unset var, you get the variable from the caller's scope (!). bash does something similar, but not when the variable has been declared in the current function, which is why "local var; unset var" works OK there. $ ksh93 -c 'function f { typeset var; unset var; echo "${var-unset}"; }; var=foo; f' unset $ mksh -c 'function f { typeset var; unset var; echo "${var-unset}"; }; var=foo; f' foo $ yash -c 'function f { typeset var; unset var; echo "${var-unset}"; }; var=foo; f' foo Also related: https://github.com/att/ast/issues/38 I suspect zakukai is watching this bug... |
|
Re: http://austingroupbugs.net/view.php?id=767#c3706 > "local" would not be a regular command... it's a keyword that establishes a > context for the identifier like '{' and '}', or '(' and ')', or possibly a > special built-in. You could make it a special builtin, if you wanted to give it those properties, but there's no reason to make it a reserved word, and no shell does so. > That's how I'm thinking of it, so any arguments get > evaluated only in the local context of the function, not inherit stuff like > with establishing a subshell using '(' ')'. So, again, you want the `initialize unset' semantics and for `local x="$x"' to be implemented as `local x; x="$x"'. The `inherit stuff' doesn't hold water any other way, since functions always share variables and values with their caller. |
|
Re: 0000767:0003716 We should not call it "Bourne Shell functions" but "POSIX functions". |
|
IMO, there's little value in POSIX mandating one behaviour or the other considering that there is a lot of divergence among existing implementations. It can leave it unspecified whether "local" preserves the value of the variable as it was before the invocation of "local" (posh, ash-based) or whether it creates a variable in an initial "empty" (zsh) or "unset" (most other shells) state. An application should not make any assumption, and should assign an initial value as required. About calling "unset" on a local variable: from the discussion (linked above, IIRC there's a separate discussion on the mksh ml) with the mksh, bash and yash maintainers, they do not consider the current behaviour of their shells to be a bug, so we may have to leave the behaviour unspecified for calling "unset" on a local variable. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3709 > "unset" is a builtin command that removes the value from a variable > and causes the variable pointer to become the value NULL. > > unset nonexisting > > first creates a new variable "nonexisting" and then applies the > "unset" operation on that variable. You're describing the `bosh' behavior, right? Because unset doesn't work that way (internally creating variables just so it can remove them) in other shells. Whether or not we want unset to behave that way is a different, but related, question than the instantiation one we're considering now. We will eventually have to answer it. For reference, bash supports the `unset local removes the shadow and reveals the variable at the calling scope' semantics. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3711 > bosh uses exactly your interpretation: "local var" just gives an attribute > to $var. It does not change it's value. So after calling "local var", you > still see the same value as before, but what you see is a pushed instance. Well, not quite. Bash's interpretation is that `local var' creates a new instance of a variable, not a copied instance of an existing variable. That new instance is unset until it's explicitly assigned a value. |
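To see which of the two models a given shell follows, a probe along these lines may help. It is a sketch only; `probe_local_init` and `v` are illustrative names, and the shell is assumed to have `local`.

```shell
v=outer

probe_local_init() {
    local v
    if [ -z "${v+set}" ]; then
        echo unset                  # new instance, no value yet
    elif [ -z "$v" ]; then
        echo empty                  # new instance, set to the empty string
    else
        echo "inherited ($v)"       # pushed instance carrying the old value
    fi
}

probe_local_init
```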
|
> http://austingroupbugs.net/view.php?id=767#c3713 > ---------------------------------------------------------------------- > Re: http://austingroupbugs.net/view.php?id=767#c3712 > > See also http://austingroupbugs.net/view.php?id=767#c3711 > > "local" just installs the attribute "pushed" to a variable. > > Why should this operation change the value of an existing variable. I think that's a really good summary of the question. Is a local variable really a variable from the calling scope with another attribute, and some other bookkeeping information to support dynamic scoping, added, or is it a new object with the same name as the old? |
|
Re: 0000767:0003719 If we like to discuss what should happen when unset is called, we should take into account that function calls may be nested. This may create several nested (or pushed) local instances of a variable. If calling "unset" pops such a variable instance, you would see the previous instance that may be the instance from another function call instead of the global value and I doubt that this is what users like to see. |
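The nested case can be made concrete with a sketch like the following. The function names are illustrative and `local` is assumed to be available; which line gets printed depends on the shell.

```shell
x=global

inner() {
    local x
    unset x
    # "unset" if the local merely loses its value; "from outer" if
    # unset popped it and exposed the caller's instance.
    printf '%s\n' "${x-unset}"
}

outer() {
    local x
    x='from outer'
    inner
}

outer
```

In a shell where unset pops the instance, the value revealed is the one set in outer, not the global one, which is the situation described above.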
|
Re: http://austingroupbugs.net/view.php?id=767#c3723 > Re: http://austingroupbugs.net/view.php?id=767#c3719 > > If we like to discuss what should happen when unset is called, we should > take into account that function calls may be nested. This may create > several nested (or pushed) local instances of a variable. > > If calling "unset" pops such a variable instance, you would see the > previous instance that may be the instance from another function call > instead of the global value and I doubt that this is what users like > to see. This is just the dynamic vs. static scoping issue. |
|
Re: 0000767:0003718 > We should not call it "Bourne Shell functions" but "POSIX functions". There, I meant function definition style (syntax) function foo { ... } is the Korn style (predates the Bourne style according to dgk IIRC) foo() cmd is the Bourne-style POSIX went for the Bourne-style but with the restriction that "cmd" could only be a *compound* command (and nobody was able to tell the rationale for that, last time I asked here). So my integer() typeset -gi "$1" from earlier is Bourne, but not POSIX. And function f { ...;} is Korn, but not POSIX. In any case, that's beside the point. I was just pointing out that ksh93 doesn't implement local scoping in functions defined using the Bourne or POSIX syntax. |
|
Re: 3714 Yes, having = as part of the declaration more a shorthand for having to declare then assign. The idea is they get stored as separate names in the same block of memory holding the function's positional parameters, so when the function evaluation terminates the locals all go away when that block is released, not have to search through the whole global list looking for individual vars that have a local attribute set or maintain linked lists of those var names. |
|
Re: 3725 It's my understanding a function definition is limited to a compound-command so the definition implicitly ends at the closing reserved word of the compound, be it a keyword like esac or a '}', rather than clutter scripts with explicit function ... endfunc type keyword pairs. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3726 > Re: 3714 > Yes, having = as part of the declaration more a shorthand for having to > declare then assign. The idea is they get stored as separate names in the > same block of memory holding the function's positional parameters, so when > the function evaluation terminates the locals all go away when that block > is released, not have to search through the whole global list looking for > individual vars that have a local attribute set or maintain linked lists of > those var names. This is an unnecessary implementation detail, but that's beside the point. Specifying `local' this way means it's different from every other builtin and declaration command, as kre already mentioned in #3705. There's little benefit to doing that. |
|
We're talking how shells are implementing it, this way mimics how compilers set up the stack frame for a function invocation, taking into account the shell is an interpreter instead so would add variables on the fly to the frame. It's the same type of specification as readonly var[=word]; If a var undeclared previously by assignment is declared this way it effectively blocks use of that name in subsequent assignments, so is considered always unset or the value of word. Similarly, an exported name can stay unset and not appear in the environment of a utility until it gets assigned a value. If a script inherits values with local a sequence like: unset foo; readonly foo; f() { local foo; foo=$IFS; }; should produce a name not found, or name not localizable error; since it would also inherit the "do not set this name" attribute of foo when trying to inherit the value. |
|
Re: Note: 0003715 So your model is rather than `local x' creating a new instance of x, which remains unset until explicitly assigned a value, `local x' is equivalent to `local x="$x"'. Approximately, though the question of the initial value is not the important issue here, the "rather than" part I most agree with is "creating a new instance of x". Whether the value of x after "local x" is unset, or unchanged from the value it had immediately before the "local" command was performed is of less importance - both schemes have their advantages for different applications, and I can see the benefit of creating a pair of flags for local (say "local [-ux] var[=[value]]..." as a synopsis - and leaving aside "local -" for now) where -u implies "unset vars not explicitly given a value" and -x means "use existing value for var if not overridden", and the default could be implementation defined. I think I might even implement that (and make the "implementation defined" be a "set" option which in my case would default to the -x case). (The options to "local" I have just done, works fine... a shell option I'll consider later.) Re: Note: 0003724 > If calling "unset" pops such a variable instance, you would see the > previous instance that may be the instance from another function call > instead of the global value and I doubt that this is what users like > to see. This is just the dynamic vs. static scoping issue. Not really, those are separate issues, the real question is whether "unset" actually destroys all knowledge of the variables named (assuming it works, ie: ignoring variables that are readonly) or whether it merely removes the value (no point keeping it, as the only way to get out of unset state is to give the var some new value) and marks it as unset. The difference occurs when the variable has some other attribute. 
In the standard, there is only one that matters - export (as once readonly the var cannot be unset), so the difference is illustrated by this example, where there is no issue of scoping at all (no functions at all, just code and globals). Assuming that here, X is a totally unknown variable, never before referenced in any way, not imported from the environment, and not one of the vars defined by the shell as being set at init time. export X At this point, the variable X is given the export status, I think we all agree on that, though as it is still unset, it does not appear in the environment. X=string now X becomes set, and is added to the environment. unset X now X is unset again, and is removed from the environment. X=new-value OK, now we get to the question, X is set again at this point, but what about the environment, is it still exported or not? POSIX does not say, one way or the other as best I can see (though as always, the caveat about it being difficult to be sure what is not said, as there might always be somewhere that I did not think to look...) In our shell, we have export -x X which explicitly removes the export attribute from X, so "unset" does not need to do that (though we also have an option for unset as well), all it needs to do is remove the value, make the variable be unset. I see this as a clean, orthogonal, model of how things should behave. On the other hand, if you are of the view that "unset" must obliterate all memory of X ever existing, then the export attribute necessarily vanishes with it, and would need to be restored later if desired. This latter view would also allow the "local x; unset x" combination as having "unset x" just remove the newly created x and immediately restore the value that existed before "local x", whereas the "just remove the value" model does not admit that possibility. 
Re: Note: 0003729 We're talking how shells are implementing it, We shouldn't be, or not at that level of detail - the model that is implemented, or stated another way, the effect of the command, in various shells is something to be considered. How any of them actually makes that effect happen is irrelevant. this way mimics how compilers set up the stack frame for a function invocation, And in any case, I kind of doubt that any shell is implementing anything that way (at least for traditional compilers of traditional languages - if you're considering something like lisp on the other hand, maybe.) If a script inherits values with local a sequence like: unset foo; readonly foo; f() { local foo; foo=$IFS; }; should produce a name not found, or name not localizable error; since it would also inherit the "do not set this name" attribute of foo when trying to inherit the value. Definitely not "name not found", but an error, yes, and both bash and ash derived shells generate an error for this. bash on the "local" of a readonly var. We don't object to that, but the local variable (whose value has been saved) is still readonly so the attempt to assign to it fails. kre |
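The export/unset/assign walk-through in this note can be run as a script. The sketch below asks a child `sh` whether it sees X; the variable `export_survived` is an illustrative name, and the "no" outcome reflects the remark elsewhere in this thread that existing shells unexport on unset, not anything the standard guarantees.

```shell
# X should be a name not used before, as in the walk-through.
unset -v X
export X                 # export attribute set, X still unset
X=string                 # X is now set and in the environment
unset -v X               # X is unset and leaves the environment
X=new-value              # the open question: is X exported again?

# Ask a child process whether it can see X.
if [ "$(sh -c 'printf %s "${X-}"')" = new-value ]; then
    export_survived=yes
else
    export_survived=no
fi
printf 'export attribute survived unset: %s\n' "$export_survived"
```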
|
Re: Note: 0003727 It's my understanding a function definition is limited to a compound-command so the definition implicitly ends at the closing reserved word of the compound, be it a keyword like esac or a '}', rather than clutter scripts with explicit function ... endfunc type keyword pairs. That's not needed, in (I think all) ash derived shells, the syntax implemented is function_body : command /* Apply rule 9 */ | command redirect_list /* Apply rule 9 */ ; [edited in later: actually the alternate form with the redirect_list appended does not exist I think, that's handled by "command" already.] where anything that can be "command" is acceptable, including another function definition, so f() g() h() echo hello works just fine, running f defines g, running g defines h, running h says "hello". There is no need for an endfunc or anything like it, that would only be needed if the syntax were more like function-body: list EndFunc or function-body: complete-command EndFunc as a list (or complete-command) is an open ended construct, "command" is not. My guess (pure guess) would be that it was not specified to allow "command" but only "compound-command", as allowing "command" that way makes a redirect at the end of the line ambiguous - just what does that mean, ie: in f() g() h() echo hello >&2 to what exactly is the redirect applied, just to when we run f, so g is defined with stdout redirected to stderr, but then h runs with the original stdout ? [ Also edited in later: That is, is that command equivalent to f() { g() { h() { echo hello; }; }; } >&2 or f() { g() { h() { echo hello >&2; }; }; } ? (The other apparent possibilities make no sense at all.) ] Our implementation binds the redirect to the echo in this case, so when we dump the function definitions, we see function h() { echo hello >&2; } (the "function" there is just giving the type of 'h', we do not implement a "function" reserved word.) |
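Both readings of the ambiguous redirect can be written out with compound-command bodies, which any POSIX shell should accept. A sketch, with `f2`, `g2` and `h2` added here only to keep the two variants apart:

```shell
# Reading 1: the redirection belongs to the echo.
f() { g() { h() { echo hello >&2; }; }; }

f          # defines g
g          # defines h
h          # writes "hello" to standard error

# Reading 2: the redirection belongs to f, so it is only in effect
# while f runs (that is, while g2 is being defined).
f2() { g2() { h2() { echo hello; }; }; } >&2

f2         # defines g2
g2         # defines h2
h2         # writes "hello" to standard output
```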
|
Re: Note: 0003730 (my earlier note), our shell now implements (or will when I document & then commit it [added: but see below[) ... unset X Y X=foo func1() { local -x X Y echo in func1 X=${X-unset} Y=${Y-unset} X=bar Y=f1Y echo in func1 X=${X-unset} Y=${Y-unset} unset X echo in func1 X=${X-unset} Y=${Y-unset} } func2() { local -u X Y echo in func2 X=${X-unset} Y=${Y-unset} X=bar Y=f2Y echo in func2 X=${X-unset} Y=${Y-unset} unset X echo in func2 X=${X-unset} Y=${Y-unset} } func3() { local -x X=f3val Y=f3val echo in func3 X=${X-unset} Y=${Y-unset} X=bar Y=f3Y echo in func3 X=${X-unset} Y=${Y-unset} unset X echo in func3 X=${X-unset} Y=${Y-unset} } func4() { local -u X=f4val Y=f4val echo in func4 X=${X-unset} Y=${Y-unset} X=bar Y=f4Y echo in func4 X=${X-unset} Y=${Y-unset} unset X echo in func4 X=${X-unset} Y=${Y-unset} } echo Global X=${X-unset} Y=${Y-unset} func1 echo Global X=${X-unset} Y=${Y-unset} func2 echo Global X=${X-unset} Y=${Y-unset} func3 echo Global X=${X-unset} Y=${Y-unset} func4 echo Global X=${X-unset} Y=${Y-unset} which when executed says ... Global X=foo Y=unset in func1 X=foo Y=unset in func1 X=bar Y=f1Y in func1 X=unset Y=f1Y Global X=foo Y=unset in func2 X=unset Y=unset in func2 X=bar Y=f2Y in func2 X=unset Y=f2Y Global X=foo Y=unset in func3 X=f3val Y=f3val in func3 X=bar Y=f3Y in func3 X=unset Y=f3Y Global X=foo Y=unset in func4 X=f4val Y=f4val in func4 X=bar Y=f4Y in func4 X=unset Y=f4Y Global X=foo Y=unset Unless others have objections to that, it would be good to see others implement something similar (or better yet, the same.) We can leave aside the issue of what happens without the -u or -x flags (make that implementation defined) so we can all keep our current behaviour there. Edited in later... Actually, to avoid the obvious conflict with bash's declare -x and -u (which also apply to "local") I am going to rename my options to upper case and change -X to -I (inherit) (so -I and -U) (inherit or unset) |
|
Re: 3731 From line 75729, XCU 2.9.4, "Each redirection shall apply to all the commands within the compound command that do not explicitly override that redirection." Since the echo doesn't redirect stdout also, the effect is as if echo >&2 was stated by itself. As a function definition is a command also, it doesn't matter whether the echo is invoked via f, g, or h; the redirect applies unless the invoke has an overriding redirect; e.g f >&3; Still, that variety of definition doesn't appear to permit sequences like: f() { read var; echo "var=$var"; }; if a definition is limited to one simple command, as the POSIX grammar specifies it. Use of { would be a syntax error, I'd expect. |
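The quoted rule about redirections on a compound command can be illustrated with a brace group. This is a sketch; the temporary file name is an arbitrary choice.

```shell
tmp=${TMPDIR:-/tmp}/redir-demo.$$

{
    echo first                # goes to the file
    echo second               # goes to the file
    echo third >/dev/null     # explicitly overrides the outer redirection
} >"$tmp"

contents=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$contents"
```

Only the two commands without their own redirection end up in the file.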
|
Re: 0000767:0003731 I don't know if POSIX specifies whether "unset" unexports a variable, but if doesn't then it should, as it's the only way to unexport a variable there (and it does so in every shell (not when the variable has been declared as "local" in some shells but here I'm talking of POSIX sh scripts and POSIX has no "local")). A POSIX unexport can typically be written as unexport() { eval 'set -- "$1" ${'"$1"'+"${'"$1"'}"}' unset -v "$1" if [ "${2+set}" ]; then eval "$1=\$2" fi } That is unset and restore the previous value if any. (interesting to that discussion, that unexport function when called on a local variable in bash/mksh/yash cancels the effect of "local"). ksh-like shells typically have "typeset +x" for that (which is why using "local -x" for something different may not be a good idea as typeset/local -x is already used for something else in those shells). Now, I did say earlier that it didn't matter much that ash/bosh created the local variable as a copy of the one from the parent scope because users will typically set the initial value on the assumption that ash/bosh don't have "types"/"attributes", but I overlooked "export". To me the whole point of having "local" is so that I can write functions that can be used in any context and use local variables the way I want to use them. It's typically used for libraries of functions that is when the script code is written by separate and independent persons. You write: f() { local i for i do blah "$i" done } So that you have your own $i regardless of whether the code calling your function also uses a $i variable (that might be an array, integer or exported for whatever purpose). Before "local", one had to add prefixes to variables to reserve the namespace to avoid clashes. 
As in: f() { for _my_lib_i do...;} $ mksh -c 'export a; a() { local a; a=1; printenv a; }; a=0; a' 0 $ bash -c 'export a; a() { local a; a=1; printenv a; }; a=0; a' 1 $ zsh -c 'export a; a() { local a; a=1; printenv a; }; a=0; a' $ yash -c 'export a; a() { typeset a; a=1; printenv a; }; a=0; a' 0 $ posh -c 'export a; a() { local a; a=1; printenv a; }; a=0; a' 0 $ ksh93 -c 'export a; function a { typeset a; a=1; printenv a; }; a=0; a' $ dash -c 'export a; a() { local a; a=1; printenv a; }; a=0; a' 1 IMO, ksh93/zsh is the best approach there. "local" completely shadows the outer variable including its type/attributes. Dash at least is consistent. Neither value nor attributes affected by "local". bash is not very consistent with the way it handles other attributes. The mksh/yash approach is reminiscent of how the Bourne shell dealt with the environment variables it received on startup (behaviour that was changed by POSIX/Korn). When I declare a variable "local" so as to use it as I please, it's bad enough that that variable makes it to other functions I call (dynamic vs static scope), but I don't want it to also be passed in the environment of commands that are executed unless I explicitly request it. For instance if I have start_daemon() { local daemon="$1" local key key=$(cat /etc/secret/$daemon.key) && (umask 77 && printf '%s\n' "key=$key" > somewhere) && "$daemon" } I don't want that secret $key to be passed into the environment of the daemon (think sshd for instance) just because I called start_daemon in a context where there was a completely unrelated $key variable that was exported. It's similar with readonly. To me "readonly" is a bit pointless. I don't think I've ever used it in over 20 years of writing scripts. From what I can see, people use it for two different purposes: 1 as a misguided security feature, to prevent users from modifying some variables. That kind of thing never works. 
If restrictions have to be imposed over users, shells is generally not the right place to put them as the restrictions you apply to the shell doesn't extend automatically to the commands that you may run from that shell (the "env" command is an obvious case here) 2 as a development helper. A bit like "set -u" can help a developer spot typos and uses of uninitialised variables. You'd do "readonly pi=3.1416" to make sure that variable is not modified and we use the same value of $pi throughout the script and spot the cases (at development time, when testing) where some code attempts to modify it. Now, 2 clashes with with the notion that "local" would solve the namespace issue. If I do "readonly pi=3.1416", I can't use a library of functions that does: cos() { local pi=3.14159265359 echo "c($1 * $pi / 180)" | bc -l } Several shells allow shadowing readonly variables. That solves the problem above, but breaks the "security" measure to some extent. IMO, "readonly" would be best removed from the standard (deprecated, allowed as an extension but unspecified, so not to be used in POSIX scripts). It is far, far less useful a feature than "local". If anything, that shows it's going to be difficult to come up with a standard everyone agrees on. But even a "local" where all those considerations (initial value/attributes, dynamic vs static, interaction with unset, readonly, export, with functions called as var=value myfunction...) are left unspecified (like in the Debian policy which is a "standard" that specifies "local") would be better than nothing. |
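The start_daemon concern can be checked for a given shell with a probe like the one below. It is a sketch: `probe_key` and `leaks_to_child` are illustrative names, `local` is assumed to exist, and the three outcomes listed in the comment are derived from the per-shell results quoted in this note.

```shell
probe_key=outer
export probe_key

leaks_to_child() {
    local probe_key
    probe_key=secret
    # What does a child process see?
    #   secret - the local kept the export attribute
    #   outer  - the exported outer value is still in the environment
    #   absent - the local fully shadows the outer variable
    sh -c 'printf "%s\n" "${probe_key-absent}"'
}

leaks_to_child
```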
|
Re: Note: 0003733 From line 75729, XCU 2.9.4, "Each redirection shall apply to all the commands within the compound command that do not explicitly override that redirection." Yes, I know that, but the relevant passage is actually XCU 2.9.5, lines 75855... Since the echo doesn't redirect stdout also, the effect is as if echo >&2 was stated by itself. Thanks, but I know how redirects work. You cannot explain the example I gave in any way from the standard, as it was a distinctly non-standard usage - I gave it just to explain why I believe that POSIX only permits a compound command as the body of a function definition (and that it most probably has nothing to do with needing any extra invented endfunc) not because it is in any way useful that I can imagine (on the other hand being able to type real simple ones like ll() ls -l "$@" is nice for interactive use). I'd never put that in a script though, adding the { ;} is simple when you're only doing it once. Still, that variety of definition doesn't appear to permit sequences like: f() { read var; echo "var=$var"; }; Of course it does. A command is a simple command (just one of those), or a compound_command (maybe with a redirect) or a function definition. That example is just the compound_command case. See the grammar, lines 76015... And no, it is not a syntax error, not with POSIX function definitions, and not with ash function definitions either. |
|
Re: Note: 0003735 I don't know if POSIX specifies whether "unset" unexports a variable, Unset says that it is "removed from the environment" but that's just a side effect of being unset, any exported but unset variable does not appear in the environment. but if it doesn't then it should, as it's the only way to unexport a variable there That, by itself is not necessarily a reason, as one could say that any variable that is exported should remain that way (this is leaving aside the interaction with local vars.) If that's not unreasonable then no way to unexport is needed. If that's not sufficient, we could add a new command or option to an existing command to correct this, if we can agree what it should be. However, I agree, everyone unexports with unset, and scripts expect that, (well, maybe, just removing the value so there is nothing to go in the environ is probably enough for most of them) so we should probably make it explicit. ksh-like shells typically have "typeset +x" for that (which is why using "local -x" for something different may not be a good idea as typeset/local -x is already used for something else in those shells). I did add local -x but it means the same thing as (I think) ksh93 and bash use it for ... mark the variable for export (essentially just saves doing an extra export command, so it is not all that useful.) And fortunately (or perhaps it was because I checked) neither I nor U are used as options for typeset in ksh93 or declare in bash. So they are both safely available. To me the whole point of having "local" is so that I can write functions that can be used in any context and use local variables the way I want to use them. That's one of my two common uses. The other is to make some modification to one of the well known (and much used) global vars (like PATH, TERM, IFS ...) and run some command which uses the var in question. Note "modify", not replace. 
Of course this can be done with the unsetting type of local, but it is easier with the inheriting type. Just as your use is easier with the unsetting type, but can also be achieved with the inheriting variety (at least if we ignore the shells that want to turn unset into some form of "pop" - that should be done, if it is needed at all, with a new command, or at the very least, an option to unset, not just a bare unset.) Before "local", one had to add prefixes to variables to reserve the namespace to avoid clashes. I don't think you need to convince anyone here of the usefulness or need for "local" - I don't think that was ever necessary, the problem (from what I can make out) always seems to have been that there was never agreement on what the semantics should be. And that makes it hard to define. IMO, ksh93/zsh is the best approach there. "local" completely shadows the outer variable including its type/attributes. Dash at least is consistent. Neither value nor attributes affected by "local". I think both of those have their uses, which is why I now have an option that allows for either type of local (for each local command, so some can be done one way, and others the other). Further, if we can agree that -I and -U are the options to use for this, and get it more widely implemented (and include -x for export, where -U means unset & unexport, I don't think we really need an option for unset but keep exported if it was previously, that combination doesn't make a lot of sense, however, -Ux for unset it, and then export my new local var, does), then maybe we can find a way to at least get over this particular hurdle in a way that allows everyone to be happy. When neither -U nor -I is specified, it would be implementation defined, so everyone's current local, and the scripts that use it, can just keep on working fine. No posix script will be broken by the requirement to use -I or -U, as there are no posix scripts currently that use local at all. 
When I declare a variable "local" so as to use it as I please, it's bad enough that that variable makes it to other functions I call (dynamic vs static scope), The other big issue, as I understand it. Personally, for the sh language, I think only dynamic scope makes much sense, that's just the kind of scripts that are generally used. When you write a function, you have no idea what the caller might be using, so you can't (without sheer luck, or maybe, probability and hope) avoid stepping on their variables, but you do know which functions you call, so you should be able to easily avoid accidentally using var names that they want to access globally (assuming that we add local, and their local vars can be declared that way, so we can ignore the internal vars of functions we call.) But it is just way too useful to be able to define a function that modifies (for example) PATH and calls some other function in order to make that other function operate with a different than global PATH (and not just as simple as PATH=xxx:$PATH func the function would be doing work to figure out what mods the PATH it sets will get.) And yes, you can do this without using local, by manually saving the old path, modifying the global, and then restoring the saved value - while catching traps, so when executed interactively you don't accidentally ever get exited with PATH in the wrong state, but that is a lot of work that local PATH plus dynamic scoping make trivial. To me "readonly" is a bit pointless. I don't think I've ever used it in over 20 years of writing scripts. I have, but you're right, its use is rare (though I think PPID is specified as being readonly.) It can be fun though to break stuff by making OPTIND readonly, and stuff like that... If anything, that shows it's going to be difficult to come up with a standard everyone agrees on. But even a "local" where all those considerations [...] are left unspecified [...] would be better than nothing. 
Agreed, but I think we can do better than that. We just need to get it implemented first, in a way that we can agree upon. |
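The "local PATH plus dynamic scoping" use case might look like the sketch below. The directory /opt/example/bin and both function names are hypothetical. Putting the assignment on the declaration line is a choice made here so that $PATH is expanded before the local instance exists; that is how bash and dash behave, and other shells were not checked.

```shell
show_path_head() {
    # Sees whatever PATH its caller has in effect (dynamic scoping).
    printf '%s\n' "${PATH%%:*}"
}

with_tools() {
    # $PATH on the right-hand side still refers to the caller's value.
    local PATH="/opt/example/bin:$PATH"     # hypothetical directory
    show_path_head
}

saved=$PATH
with_tools
[ "$PATH" = "$saved" ] && echo "PATH restored"
```

No trap handling or manual save and restore is needed, since the local instance goes away when with_tools returns.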
|
Re: 0000767:0003730 POSIX claims that "unset" removes the variable from the environment. So I would understand this as a requirement to also "unexport" a variable that was unset. Traditionally, the Bourne Shell did clear all attribute flag bits for a variable when it is unset, so the behavior of the Bourne Shell is to unexport a variable that was unset. |
|
Re: bugnote 3732 zsh has -U and -u (the latter also in ksh93) for something else already (see below). -N (null/not-set) seems to be free. We could also use -k (keep) and -K (not keep) [Edit: or -I/+I, -k/+k] zsh doc: -U For arrays (but not for associative arrays), keep only the first occurrence of each duplicated value. This may also be set for colon-separated special parameters like PATH or FIGNORE, etc. This flag has a different meaning when used with -f; see below. -u Convert the result to upper case whenever the parameter is expanded. The value is _not_ converted when assigned. This flag has a different meaning when used with -f; see above. [Edit] Other than that, I agree with everything you say in 0000767:0003737 |
|
Re Note: 0003738 So I would understand this as a requirement to also "unexport" a variable that was unset. I wouldn't, not from the language, "remove from the environment" means exactly that (no longer put in the environment) and not anything explicitly about the variable attributes. That is, if X is unset export X sets the export attribute on X, but does not put X in the environment, as X has no value X=foo gives X a value, and puts it in the environment (as it had that attribute.) unset X takes away the value, and consequently removes X from the environment. But does it also remove the attribute. In the implementations, yes, but in the standard, not really (at best it is ambiguous.) |
|
Re Note: 0003739 zsh has -U and -u Grunge. I should have looked there - if anything is going to have more options than ksh93, it is zsh. I see now that mksh has a -U (different meaning, of course) as well. So that one is out... I knew about -u, that one seems quite common (and its -l companion.) -N (null/not-set) seems to be free That looks like a reasonable choice. So, now that is what I am using (fortunately, I am only "just about" to commit this... It passed all our tests in the previous incarnation, I expect it will now too.) But now I have to fix the doc, again... Thanks. |
|
Oh, forgot to explicitly say, I am not clearing readonly, as long as we have dynamic scope, I think we need to keep it that way, and as I said, I think dynamic scope, at least as an available, and the default, option is important. It shouldn't really matter that readonly is always inherited, as, as was said in note 0003735, readonly is very rare (and using it randomly breaks all kinds of stuff). |
|
Re: 0000767:0003738 2017-05-24 12:08:14 +0000, Austin Group Bug Tracker: [...] > That is, if X is unset > export X > sets the export attribute on X, but does not put X in the environment, as > X has no value > X=foo > gives X a value, and puts it in the environment (as it had that > attribute.) > unset X > takes away the value, and consequently removes X from the environment. > > But does it also remove the attribute. In the implementations, yes, but > in the standard, not really (at best it is ambiguous.) [...] issue7TC2> If neither -f nor -v is specified, name refers to a issue7TC2> variable; if a variable by that name does not exist, it is issue7TC2> unspecified whether a function by that name, if any, shall issue7TC2> be unset. That "exist" above would be worth clarifying as well. May the "a" function be unset in: unset a; a() { echo test; }; export a; unset a; a IOW, does that "a" variable "exist" after it has been marked for export but not assigned any value (is still unset)? That's relevant to this issue in that shells may want to treat "local" as an attribute like "export" (and to some extent, mksh, yash (and bash) do treat it as such wrt unset as their unset undoes one level of "local"ness). IMO, unset should *not* unset a function if a variable by the same name has been declared "local" (and not assigned any value). It should remove the "export" attribute (as that's what "unset" has been doing forever) but not the "local" /attribute/ (as we want to be able to have local variables with no value; and that means we need to convince the bash/mksh/yash maintainers). |
|
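The export/assign/unset sequence quoted in the note above can be observed from a child process. The following is a minimal sketch in plain POSIX sh; the helper name `child_sees` and the `step1`..`step3` variables are illustrative and not part of the thread. It deliberately does not probe whether the export attribute survives the final `unset`, since that is the point the note calls ambiguous in the standard.

```shell
# Report what a child process sees for X: its value, or "unset".
# (child_sees is a hypothetical helper used only for this illustration.)
child_sees() {
    sh -c 'printf "%s\n" "${X-unset}"'
}

unset -v X            # start from a variable with no value
export X              # export attribute requested, but there is still no value
step1=$(child_sees)

X=foo                 # X now has a value and is placed in the environment
step2=$(child_sees)

unset -v X            # value removed, so X leaves the environment
step3=$(child_sees)

printf 'after export: %s\n' "$step1"
printf 'after assignment: %s\n' "$step2"
printf 'after unset: %s\n' "$step3"
```

Only the middle step should show a value in the child; the other two should report `unset`.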
We can always avoid the unset var/func ambiguity by always using -v or -f. The rationale should be amended to make this clear, using unset without an option is not safe. And re the unexport question with unset, the "export" command spec specifically says that the "export attribute" is set, if unset was intended to clear that (which it most probably is, since everyone does that) then it should say so using similar wording. |
|
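As a small illustration of the point above about always passing `-v` or `-f`, the sketch below uses one name for both a variable and a function and removes each explicitly. The variable names `var_state`, `func_output` and `func_state` are illustrative only, and the final check assumes there is no external command called `a` on `PATH`.

```shell
# The same name is used for both a variable and a function.
a=value
a() { echo "function a"; }

# Remove only the variable; the function must survive.
unset -v a
var_state=${a+set}          # empty, because the variable is gone
func_output=$(a)            # the function is still callable

# Remove only the function.
unset -f a
if command -v a >/dev/null 2>&1; then
    func_state=present
else
    func_state=absent
fi

printf 'variable: %s\n' "${var_state:-unset}"
printf 'function said: %s\n' "$func_output"
printf 'function after unset -f: %s\n' "$func_state"
```

With the option given, there is no doubt about which of the two is being removed.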
It seems that we should set up a list of shell implementations that should be looked at when checking new POSIX ideas. First, I believe that "yash" and "posh" should not be in that list because these implementations are too buggy or have unexplained deviations. "yash" is able to run configure, but "posh" is so buggy that it will not run a "configure" script at all. So a useful list should have: bash bosh dash mksh ksh88 ksh93 I am not sure whether we should add "zsh" to this list, as zsh by default implements many intentional deviations from the standard. Note that I know of no way of doing zsh / POSIX related tests that works except for copying zsh to /tmp/sh and then testing with /tmp/sh, as I could not find a better way to make zsh compliant enough to be able to run a "configure" script. It would be nice if future reports checked the shell implementations from the agreed list. If you would like to add another shell to that list, please give a URL for a tarball that includes a portable version of the source. If you believe we need a more in-depth discussion of this topic, please use the mailing list, as this is off-topic for the bug. |
|
Re: http://austingroupbugs.net/view.php?id=767#c3730 > Re: Note: 0003724 > > > If calling "unset" pops such a variable instance, you would see the > > previous instance that may be the instance from another function call > > instead of the global value and I doubt that this is what users like > > to see. > > This is just the dynamic vs. static scoping issue. > > Not really, those are separate issues, Not given what Joerg wrote. The "instance from another function call instead of the global value" is exactly the dynamic vs. static scoping issue. It's the behavior of "pops such a variable instance" you go on to discuss, but that's not what I was referring to. > the real question is whether > "unset" actually destroys all knowledge of the variables named (assuming > it works, ie: ignoring variables that are readonly) or whether it > merely removes the value (no point keeping it, as the only way to > get out of unset state is to give the var some new value) and marks > it as unset. And whether or not "destroying all knowledge of the variables" removes the "shadowing" that a scope stack provides. > The difference occurs when the variable has some other attribute. > > In the standard, there is only one that matters - export (as once > readonly the var cannot be unset), so the difference is illustrated > by this example, where there is no issue of scoping at all (no > functions at all, just code and globals). > > Assuming that here, X is a totally unknown variable, never before > referenced in any way, not imported from the environment, and not > one of the vars defined by the shell as being set at init time. > > export X > > At this point, the variable X is given the export status, I think > we all agree on that, though as it is still unset, it does not > appear in the environment. > > X=string > > now X becomes set, and is added to the environment. Correct, since the standard states that a variable is not set unless it has been assigned a value. 
That's why I support the view that `local x' by itself does not set a variable x: because it hasn't been assigned a value. The value inheritance question then becomes relevant. > Re: Note: 0003729 > > We're talking how shells are implementing it, > > We shouldn't be, or not at that level of detail - the model that is > implemented, or stated another way, the effect of the command, in > various shells is something to be considered. How any of them > actually makes that effect happen is irrelevant. Agreed -- the standard has never cared about how things are implemented, or whether a particular feature is "easy" to implement. Since shell internals differ so wildly, how could it? > If a script inherits values with local a sequence like: > unset foo; readonly foo; > f() { local foo; foo=$IFS; }; > > should produce a name not found, or name not localizable error; > since it would also inherit the "do not set this name" attribute > of foo when trying to inherit the value. > > Definitely not "name not found", but an error, yes, and both bash > and ash derived shells generate an error for this. bash on the > "local" of a readonly var. We don't object to that, but the local > variable (whose value has been saved) is still readonly so the attempt > to assign to it fails. I actually make that a special case for `readonly'. The reasoning, based on ancient bug reports, is that a variable is made readonly for a reason, and a function -- running in the same execution context as the calling shell - should not be able to override that. |
|
Re: Note: 0003746 > Not really, those are separate issues, Not given what Joerg wrote. OK, if you were commenting on just that point. But that is a little strange, as that issue in his example was (I believe) largely irrelevant, the point he was making, as I saw it (and Joerg can correct me if I am mistaken) was that he doubts that users will want to see a new value just "magically" appear (regardless of from where it comes.) That is, I would expect, and I think most average users would expect, that in any situation, with this code ... unset X && test -n "${X+set}" && abort if abort is ever executed - if there is any way to make abort ever be executed, then the shell is broken - badly broken. Note that if X is readonly, the "unset" fails, so the test never gets executed. That's the only way that X should be able to appear as set after unset X, and before it is assigned a new value. When a variable is unset, the one thing everyone should be able to count on, is that it is unset (whatever happens to other attributes, etc.) If you need an "unlocal" command, add one. It could even be unset with some strange option given. Just not plain unset. And whether or not "destroying all knowledge of the variables" removes the "shadowing" that a scope stack provides. That was what I meant by that phrase, yes. And that's what I don't think unset should do (or not in any standard usage.) That's why I support the view that `local x' by itself does not set a variable x: Of course not, is there anyone claiming differently? because it hasn't been assigned a value. but here you are venturing into the unknown. How do you know that? In the sequence X=foo local X X has clearly been assigned a value, you can see it right there. Of course if your "because it hasn't been assigned" was reworded as "because it is not being assigned" then I'd agree. 
You only get a different opinion if you believe that "local" is creating something new, rather than arranging to save, and later restore, the value of X. I prefer local to be just like the existing export, and readonly, they set attributes on a variable, they do not create them. This model also does away with scoping questions completely. There only is one scope (just as now in POSIX) - "local" is just doing the equivalent of save_X="${X}" # possibly followed by: unset X and arranging for "return" (including the implicit one at func end, and any exit of the func caused by an abort that does not cause the shell to exit) to do X="${save_X}" but also including attributes, and using anonymous save variables. I think you'd agree that if in X=foo export X X was now unset, no-one would be happy. Why should there be a difference for local? The question of whether after "local X" there should be an implicit "unset X" is not material here (and it seems like we have a reasonable solution to avoid that issue.) |
|
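The save-and-restore model described in the note above can be written out by hand in plain POSIX sh. This is only a sketch of that model under simplifying assumptions, not how any shell implements `local`: the names `f`, `save_X` and `save_X_set` are illustrative, the save variables are ordinary globals rather than anonymous ones (so the function is not reentrant), and attributes such as export or readonly are not saved.

```shell
f() {
    # Save: remember whether X was set, and its value if so.
    save_X_set=${X+yes}
    save_X=${X-}
    unset -v X               # the "possibly followed by: unset X" step

    X=inside
    printf 'in f: %s\n' "$X"

    # Restore on the way out, as "return" would do in this model.
    if [ -n "$save_X_set" ]; then
        X=$save_X
    else
        unset -v X
    fi
}

X=foo
f
printf 'after f: %s\n' "$X"
```

Because there is only ever one variable X in this model, any function called from `f` between the save and the restore would see the value `inside`.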
Re: http://austingroupbugs.net/view.php?id=767#c3747 > That's why I support the view that `local x' > by itself does not set a variable x: > > Of course not, is there anyone claiming differently? I guess it depends on what people mean by things like a "pushed instance" or "local variant". It creates a new object, that much seems clear. But does that new object have a value? > > because it hasn't been assigned a value. > > but here you are venturing into the unknown. How do you know that? Because, unless it has already been declared at the same local scope, it creates a new object. That's the essential question. > > In the sequence > > X=foo > local X > > X has clearly been assigned a value, you can see it right there. OK. How many `instances' of X should now exist, and what are their values? Should there be an `X' at the global scope, and a separate instance of X at the local scope? We all agree that subsequent assignments to X should not affect the copy at the global scope, which should remain with value "foo". If you treat them as different objects, then the local instance of X is unset because it hasn't been assigned a value. This is the crux of the value inheritance question. > > Of course if your "because it hasn't been assigned" was reworded > as "because it is not being assigned" then I'd agree. > > You only get a different opinion if you believe that "local" is creating > something new, rather than arranging to save, and later restore, the > value of X. I prefer local to be just like the existing export, and > readonly, they set attributes on a variable, they do not create them. Yes. That's the question. > > This model also does away with scoping questions completely. 
There > only is one scope (just as now in POSIX) - "local" is just doing the > equivalent of > save_X="${X}" # possibly followed by: unset X > > and arranging for "return" (including the implicit one at func end, > and any exit of the func caused by an abort that does not cause the > shell to exit) to do > > X="${save_X}" > > but also including attributes, and using anonymous save variables. Not really. You're just implicitly resolving the question in favor of dynamic scoping. If function A runs `local X', runs X=4, and calls function B, does B see a set variable X, and, if so, what is its value? If you say `yes' and `4', you've chosen dynamic scoping. > I think you'd agree that if in > > X=foo > export X > > X was now unset, no-one would be happy. Nobody would expect X to be unset after running that. It's irrelevant. It's the same question as if I were to run `export X', where X was not previously set: would I expect X to suddenly have a value? > Why should there be a difference > for local? Because local has `shadowing' effects on any variable with the same name in a previous scope. It's those effects we're talking about. We have differing views. |
|
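The question posed in the note above (does B see the X that A declared and set?) can be tried directly. This sketch assumes a shell that provides a `local` builtin, such as bash or dash; `local` is not in POSIX, which is the subject of this bug.

```shell
# Assumes a shell with a "local" builtin (for example bash or dash).
B() {
    # B declares nothing; what it sees depends on the scoping rule.
    printf 'B sees: %s\n' "${X-unset}"
}

A() {
    local X
    X=4
    B
}

X=global
A
printf 'after A: %s\n' "$X"
```

In bash and dash, which scope dynamically, B reports `4` and the global X is back to `global` after A returns. A statically scoped implementation would have B report `global` instead.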
@kre: mksh already has -U (for “unsigned integer”), so “local -U” won’t work. (There also needs to be a way to specify that “locals are unset by default” be the standard, as anything else would break tons of existing scripts. Perhaps just leave it unspecified in the standard?) |
|
Re: 0000767:0004043 The -U being already taken by zsh and mksh was already noted in 0000767:0003741 As already noted, specifying that "locals are unset by default" would break tons of existing BSD scripts. Not doing it would break tons of bash/pdksh/zsh scripts. But in any case, that would be non-POSIX scripts as POSIX currently doesn't specify "local". POSIX could leave it unspecified whether "local" unsets the variable or not, except that f() { local var unset var ... } doesn't work in mksh and yash (as discussed above, something one may consider a bug/misfeature). That's why @kre was suggesting introducing new options so users can specify which behaviour they want (unsetting or keeping), with the default left unspecified so as not to break existing platform-specific scripts. |
|
What we ended up using was -N ("nullify" or "not inherited" or whatever you choose to have it represent...) and -I for inherited. Those don't seem to be used anywhere. And yes, if this ever makes it into the standard (and I hope it does, as everyone has some form of local command these days I think) leaving the default as unspecified is the right approach. I have also implemented (but not distributed, as the demand for it did not appear to exist) a -S option, for static, which seems to work quite well (but which, as I said, no-one seems to want). ksh93 (at least) has a -S option as well, with, I think, a reasonably compatible meaning. |
|
Also, the idea that "unset" should (ever) mean "unlocal" is just mind boggling to me - an option to unset perhaps, but if an unlocal, or upvar, or similar command is needed - just add it, don't pervert something that already has meaning. That is, I would submit that anywhere, that is in any context whatever, the sequence unset VAR && printf '%s\n' "${VAR:-OK}" must *always* say "OK", or generate an error (if VAR is readonly) with no exceptions permitted. And incidentally, wrt readonly vars, what should happen when one is made readonly is an interesting question (I mean here, what should happen, not what we should write in a standard if it were to be written today). I used to be of the opinion that once a var is readonly, it must stay that way, and making it local in a function should be essentially a no-op (or perhaps an error - especially in shells where unset is the default for new local vars). More recently I am shifting to the view that the readonly attribute on the previous instance of the var should be irrelevant. While the local version is in scope, it should make no difference at all what the attributes of the hidden one were. Similarly if an exported var is made local, but not then exported, the export attribute should be lost while the local var remains in scope. Of course, once the local var vanishes, the previous one is restored, along with all attributes, unchanged. I'd also dispute the "break tons of existing scripts" assertion (for making the default for local either -I or -N to use the NetBSD sh options). There are certainly scripts that would be affected, but the vast majority that I have seen simply don't care - they assign a value to local vars before any attempt is made to use them (partly I expect, because many sh programmers believe they are writing C ...(or similar)) |
|
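The invariant stated in the note above can be turned into a small probe for a given shell. This is a sketch that assumes a `local` builtin is available; the function name `probe` is illustrative. The result is implementation-dependent, which is exactly the disagreement in this thread.

```shell
# Assumes a shell with a "local" builtin; "probe" is a hypothetical name.
probe() {
    local VAR
    # After a successful unset, VAR should not appear to have a value.
    unset -v VAR && printf '%s\n' "${VAR:-OK}"
}

VAR=outer
probe
```

In bash and dash this prints `OK`. According to the notes in this thread, mksh and yash instead reveal the outer variable after `unset`, so there the probe would print `outer`.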
> In mksh and yash, unset doesn't unset, it pops the variable from > the stack. No, it removes the variable from the set of local variables associated with the current scope. There’s no stack. There’s a chain of “struct env” that all point to their respective parent, and that all have local variables associated with them (e.g. assignments before commands not affecting the execution environment are implemented that way). The “unset” builtin, in mksh, strictly affects the current scope only, so it removes the local variable, unhiding the outer one of the same name (which is NOT the same variable). mksh also changed to keep assignments for functions in the meantime: $ mksh -c 'f() { local a; echo "${a-unset}"; a=2; }; a=0; a=1 f; echo "$a"' 1 0 I *know* changing the behaviour of local in mksh *will* break tons of scripts, I wrote many of them myself, not all though, and it’s guaranteed for scripts, so I assume others would also have made use of this. So, if local should ever get standardised, make 'local x' unspecified and have only 'local x=' and 'local x=$x' specified. |
|
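The two forms the note above suggests specifying, `local x=` and `local x=$x`, state the script's intent explicitly and so do not depend on what a bare `local x` does. A minimal sketch, assuming a shell with a `local` builtin; the function names `fresh` and `inherit` are illustrative, and the expansion is quoted because some shells may field-split the arguments of `local`.

```shell
# Assumes a shell with a "local" builtin (bash, dash, mksh, ...).
fresh() {
    local x=            # explicitly empty, whatever the outer x holds
    printf 'fresh: [%s]\n' "$x"
}

inherit() {
    local x="$x"        # explicitly copy the outer value into the local
    x="$x+changed"
    printf 'inherit: [%s]\n' "$x"
}

x=outer
fresh
inherit
printf 'global: [%s]\n' "$x"
```

Either way, the assignment inside the function leaves the outer x untouched once the function returns.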
This was discussed during the 2022-08-08 conference call. Since there is clear disagreement about the scope of local variables, it is not clear that consensus can be reached. Therefore this bug is rejected. If a consensus is reached, please submit a new bug with proposed wording changes. |
Date Modified | Username | Field | Change |
---|---|---|---|
2013-10-11 02:02 | dwheeler | New Issue | |
2013-10-11 02:02 | dwheeler | Status | New => Under Review |
2013-10-11 02:02 | dwheeler | Assigned To | => ajosey |
2013-10-11 02:02 | dwheeler | Name | => David A. Wheeler |
2013-10-11 02:02 | dwheeler | Section | => XCU 2.14 (local) |
2013-10-11 02:02 | dwheeler | Page Number | => 2374 |
2013-10-11 02:02 | dwheeler | Line Number | => 75650 |
2013-10-11 08:31 | geoffclare | Relationship added | related to 0000465 |
2013-10-15 22:47 | dwheeler | Note Added: 0001912 | |
2013-10-17 05:52 | ranjit | Note Added: 0001924 | |
2013-10-20 05:00 | shware_systems | Note Added: 0001937 | |
2013-11-14 16:08 | geoffclare | Relationship added | related to 0000771 |
2016-07-05 09:35 | joerg | Note Added: 0003285 | |
2016-12-01 16:51 | eblake | Relationship added | related to 0001025 |
2017-05-19 21:39 | mirabilos | Note Added: 0003699 | |
2017-05-19 22:12 | stephane | Note Added: 0003701 | |
2017-05-22 11:19 | joerg | Note Added: 0003702 | |
2017-05-22 11:20 | joerg | Note Edited: 0003702 | |
2017-05-22 11:20 | joerg | Note Edited: 0003702 | |
2017-05-22 11:21 | joerg | Note Edited: 0003702 | |
2017-05-22 19:38 | stephane | Note Added: 0003703 | |
2017-05-22 23:06 | shware_systems | Note Added: 0003704 | |
2017-05-23 03:41 | kre | Note Added: 0003705 | |
2017-05-23 10:47 | shware_systems | Note Added: 0003706 | |
2017-05-23 11:07 | shware_systems | Note Edited: 0003706 | |
2017-05-23 12:06 | stephane | Note Added: 0003707 | |
2017-05-23 13:08 | joerg | Note Added: 0003708 | |
2017-05-23 13:18 | joerg | Note Added: 0003709 | |
2017-05-23 13:19 | joerg | Note Edited: 0003709 | |
2017-05-23 13:41 | chet_ramey | Note Added: 0003710 | |
2017-05-23 13:47 | joerg | Note Added: 0003711 | |
2017-05-23 13:47 | chet_ramey | Note Added: 0003712 | |
2017-05-23 13:51 | joerg | Note Added: 0003713 | |
2017-05-23 14:00 | chet_ramey | Note Added: 0003714 | |
2017-05-23 14:05 | chet_ramey | Note Added: 0003715 | |
2017-05-23 14:11 | stephane | Note Added: 0003716 | |
2017-05-23 14:14 | chet_ramey | Note Added: 0003717 | |
2017-05-23 14:17 | joerg | Note Added: 0003718 | |
2017-05-23 14:19 | stephane | Note Added: 0003719 | |
2017-05-23 14:26 | chet_ramey | Note Added: 0003720 | |
2017-05-23 14:29 | chet_ramey | Note Added: 0003721 | |
2017-05-23 14:32 | chet_ramey | Note Added: 0003722 | |
2017-05-23 14:45 | joerg | Note Added: 0003723 | |
2017-05-23 15:16 | chet_ramey | Note Added: 0003724 | |
2017-05-23 15:29 | stephane | Note Added: 0003725 | |
2017-05-23 16:34 | shware_systems | Note Added: 0003726 | |
2017-05-23 17:28 | shware_systems | Note Added: 0003727 | |
2017-05-23 17:43 | chet_ramey | Note Added: 0003728 | |
2017-05-23 18:49 | shware_systems | Note Added: 0003729 | |
2017-05-23 22:54 | kre | Note Added: 0003730 | |
2017-05-23 23:38 | kre | Note Added: 0003731 | |
2017-05-23 23:48 | kre | Note Edited: 0003731 | |
2017-05-23 23:57 | kre | Note Edited: 0003731 | |
2017-05-24 00:06 | kre | Note Added: 0003732 | |
2017-05-24 00:18 | kre | Note Edited: 0003732 | |
2017-05-24 00:40 | kre | Note Edited: 0003732 | |
2017-05-24 07:31 | shware_systems | Note Added: 0003733 | |
2017-05-24 09:25 | stephane | Note Added: 0003735 | |
2017-05-24 09:53 | kre | Note Added: 0003736 | |
2017-05-24 10:37 | kre | Note Added: 0003737 | |
2017-05-24 10:38 | kre | Note Edited: 0003737 | |
2017-05-24 10:56 | joerg | Note Added: 0003738 | |
2017-05-24 11:24 | stephane | Note Added: 0003739 | |
2017-05-24 11:26 | stephane | Note Edited: 0003739 | |
2017-05-24 11:30 | stephane | Note Edited: 0003739 | |
2017-05-24 12:08 | kre | Note Added: 0003740 | |
2017-05-24 12:23 | kre | Note Added: 0003741 | |
2017-05-24 12:27 | kre | Note Added: 0003742 | |
2017-05-24 12:42 | stephane | Note Added: 0003743 | |
2017-05-24 12:59 | kre | Note Added: 0003744 | |
2017-05-24 13:02 | joerg | Note Added: 0003745 | |
2017-05-24 13:09 | joerg | Note Edited: 0003745 | |
2017-05-24 13:20 | joerg | Note Edited: 0003745 | |
2017-05-24 13:26 | chet_ramey | Note Added: 0003746 | |
2017-05-24 13:41 | joerg | Note Edited: 0003745 | |
2017-05-24 16:20 | kre | Note Added: 0003747 | |
2017-05-24 20:10 | chet_ramey | Note Added: 0003748 | |
2017-12-14 16:44 | eblake | Relationship added | related to 0001065 |
2018-06-19 15:45 | mirabilos | Note Added: 0004043 | |
2018-06-19 16:32 | stephane | Note Added: 0004044 | |
2018-06-19 16:33 | stephane | Note Edited: 0004044 | |
2018-06-19 17:25 | kre | Note Added: 0004045 | |
2018-06-19 17:40 | kre | Note Added: 0004046 | |
2021-07-28 16:53 | mirabilos | Note Added: 0005417 | |
2022-08-08 15:15 | Don Cragun | Interp Status | => --- |
2022-08-08 15:15 | Don Cragun | Note Added: 0005927 | |
2022-08-08 15:15 | Don Cragun | Status | Under Review => Closed |
2022-08-08 15:15 | Don Cragun | Resolution | Open => Rejected |