Viewing Issue Advanced Details
1385 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-07-29 10:44 2020-07-29 10:44
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
unlink()
2197
70220
---
unlink() text needs updating to account for mmap()
The mmap() description says:
The mmap() function shall add an extra reference to the file associated with the file descriptor fildes which is not removed by a subsequent close() on that file descriptor. This reference shall be removed when there are no more mappings to the file.
I believe the intention of this is that when unlink() removes the last link to the file, the space it occupies is not freed until no process has the file open and no process has it mapped. However, this is not reflected in the description of unlink(), which only refers to processes having the file open.
Change:
When the file's link count becomes 0 and no process has the file open, the space occupied by the file shall be freed and the file shall no longer be accessible. If one or more processes have the file open when the last link is removed, the link shall be removed before unlink() returns, but the removal of the file contents shall be postponed until all references to the file are closed.
to:
When the file's link count becomes 0 and no process has a reference to the file via an open file descriptor or a memory mapping (see [xref to mmap()]), the space occupied by the file shall be freed and the file shall no longer be accessible. If one or more processes have such a reference to the file when the last link is removed, the link shall be removed before unlink() returns, but the removal of the file contents shall be postponed until there are no such references to the file.
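
As a rough illustration of the case the new wording is meant to cover, the following sketch maps a file, closes the descriptor, removes the last link, and then reads the contents through the mapping. The function name and the scratch path under /tmp are made up for the example; it only shows what one implementation does and is not a conformance test.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the file contents are still readable
   through a mapping after close() and unlink(), 0 if they are not, and
   -1 if the setup failed.
   Example: assert(mapping_survives_unlink() == 1); */
int mapping_survives_unlink(void)
{
    char path[] = "/tmp/unlink_mmap_XXXXXX";   /* assumed scratch location */
    static const char msg[] = "still here";

    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    if (write(fd, msg, sizeof msg) != (ssize_t)sizeof msg) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *p = mmap(NULL, sizeof msg, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                 /* no open file descriptor remains */
    int rc = unlink(path);     /* link count becomes 0 */
    if (p == MAP_FAILED || rc == -1)
        return -1;

    /* Only the mapping refers to the file now. */
    int ok = (memcmp(p, msg, sizeof msg) == 0);
    munmap(p, sizeof msg);     /* no references left after this */
    return ok;
}
```

With the current unlink() text, nothing obviously requires the memcmp() above to see the data, since no process has the file open at that point.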

There are no notes attached to this issue.




1384 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Omission 2020-07-29 09:02 2020-07-29 09:02
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
2.12 Shell Execution Environment
2382
76195-76202
---
Subshell of an interactive shell is effectively non-interactive
The standard is missing a statement to the effect that a subshell of an interactive shell behaves as a non-interactive shell (despite the fact that $- and "set +o" indicate it is interactive).

For example, in all shells I tried, the subshell in the following is terminated by SIGTERM. (Some wrote "Terminated" between the two lines, others didn't, but the $? value in all of them indicated termination with SIGTERM and none of them wrote "foo".)
$ (pid=$(sh -c 'echo $PPID'); kill -s TERM "$pid"; echo foo)
$ echo $?
If the subshell was behaving as per the standard's requirements for an interactive shell, the SIGTERM would be ignored, the echo would be executed, and $? would be 0.

The current statement about traps in 2.12 is also out of date with respect to the description of the trap utility.
On page 2382 line 76195 section 2.12 Shell Execution Environment,
after applying bug 1247 change:
A subshell environment shall be created as a duplicate of the shell environment, except that traps that are not being ignored shall be set to the default action.
to:
A subshell environment shall be created as a duplicate of the shell environment, except that:
  • Unless specified otherwise (see [xref to trap]), traps that are not being ignored shall be set to the default action.

  • If the shell is interactive, the subshell shall behave as a non-interactive shell in all respects other than the expansion of the special parameter '-' and the output of <tt>set -o</tt> and <tt>set +o</tt>, which shall continue to indicate that it is interactive.

There are no notes attached to this issue.




1383 [Issue 8 drafts] System Interfaces Editorial Enhancement Request 2020-07-25 11:24 2020-07-27 14:40
dannyniu
 
normal  
New  
Open  
   
DannyNiu/NJF
Fork
878
29985
Make "Application Usage" less confusing.
The "Application Usage" section has new content that was
added in response to Bug 62, but its current wording is
confusing (at least at first sight) and unfriendly to new
readers who are not familiar with the modification
made to fork.

Specifically, it says:

> processing performed in the child before fork() returns

At first sight this seems nonsensical, as the child doesn't
even exist before fork returns; however, once reminded of
pthread_atfork handlers, it makes sense.
Change

> processing performed in the child before fork() returns

To

> processing performed in the child by the pthread_atfork()
> handlers before fork() returns
Notes
(0004900)
geoffclare   
2020-07-27 09:07   
This change was not made as a result of bug 0000062, it is from the unrelated bug 0001114.

Where the new text refers to "processing performed in the child before fork() returns" it means everything that the implementation of fork() does in the child between the point where it creates the child process and the point where it returns in the child process. This includes any processing related to the bullet list on lines 29903-29957.

Rather than adding a reference to pthread_atfork() (which would incorrectly imply that the text only applies to processing performed by atfork handlers) we should make it clear that the text applies to both fork() and _Fork().
(0004901)
geoffclare   
2020-07-27 09:13   
Suggested change...

On page 878 line 29980 section fork(), change:
When a multi-threaded process calls fork(), ...
to:
When a multi-threaded process calls fork() or _Fork(), ...

On page 878 line 29985 section fork(), change:
the processing performed in the child before fork() returns
to:
the processing performed in the child before fork() or _Fork() returns in the child
(0004902)
shware_systems   
2020-07-27 14:40   
Upon rereading 1114, I'm more inclined to think that the second change in Note 4901 should start with 'any processing', not 'the processing', so as to include implementations that set up the child's data space entirely before handing it off to the process scheduler, where this is plausible. Use of 'the' implies there will always be post-handoff manipulations.
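
For readers trying to picture "processing performed in the child before fork() returns", here is a small sketch of one visible instance of it: a child handler registered with pthread_atfork() has already run by the time fork() returns in the child. The helper names are invented for the example, and as Note 4900 points out, atfork handlers are only part of that processing, so this should not be read as the whole story.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile int child_handler_ran = 0;

/* Child atfork handler: runs in the child, inside fork(). */
static void in_child(void)
{
    child_handler_ran = 1;
}

/* Hypothetical helper: returns 1 if the child saw that its atfork
   handler had already run when fork() returned, 0 if it had not,
   and -1 on setup failure.
   Example: assert(atfork_child_runs_before_fork_returns() == 1); */
int atfork_child_runs_before_fork_returns(void)
{
    if (pthread_atfork(NULL, NULL, in_child) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)                       /* child: report what it observed */
        _exit(child_handler_ran ? 0 : 1);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Some toolchains may need the program to be built with thread support enabled for pthread_atfork() to link.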




1382 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2020-07-24 15:16 2020-07-24 15:39
tydeman
 
normal  
New  
Open  
   
Fred J. Tydeman
Tydeman Consulting
2018 edition of Issue 7 of Base Specifications
XSH: system interfaces: asin()
603 (or 651 of 3952)
20997
---
Ambiguous statement in asin() and other math functions
In looking at asin(), I am confused by the wording in line 20997
"If x is not returned,....."
Which x is it talking about? The one on line 20995 or any x?

Is line 20997 supposed to be a continuation of line 20996?

If it is, but cannot be presented that way because of the MX vs MXX shading,
then line 20997 should be changed to
"If a subnormal result is not returned,...."

I believe the same problem exists for many other math functions: at least:
asinh, atan, atanh, expm1, log1p, sin, sinh, tan, tanh
Clean up the wording. nextafter() might be a way to show both MX and MXX shading as continuous.
Notes
(0004899)
geoffclare   
2020-07-24 15:39   
This text came from bug 0000068 where it is shown as one paragraph:

    [MX] If x is subnormal, a range error may occur[/MX] [MXX] and x
    should be returned.[/MXX]
    [MX] If x is not returned, asin(), asinf(), and asinl() shall
    return an implementation-defined value no greater in magnitude
    than DBL_MIN, FLT_MIN, and LDBL_MIN, respectively.[/MX]

So the paragraph break is an editorial mistake in applying bug 68. If the paragraph break is changed to a line break, the result would look like the current change from MX to MXX on the previous line, where the shading is continuous.

Changing to "If a subnormal result is not returned,...." won't work because the standard allows a different subnormal (implementation-defined) value to be returned.

The other cases are presumably also all the result of incorrectly applied bug 68 edits.




1381 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Omission 2020-07-23 09:34 2020-07-23 09:34
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
ftruncate()
980
33344
---
ftruncate() missing EINVAL from POSIX.1-1996
POSIX.1-1996 specified the following error for ftruncate():
[EINVAL] The fildes argument does not refer to a file on which this operation is possible.
This was omitted when POSIX.1-1996 and SUSv2 were merged to form POSIX.1-2001/SUSv3.

For consistency, truncate() should also have an equivalent EINVAL error.
On page 980 line 33344 section ftruncate() EINVAL, change:
The length argument was less than 0.
to:
The length argument is less than 0 or the fildes argument refers to a file on which this operation is not possible (for example, a pipe, FIFO or socket).

On page 2178 line 69765 section truncate() EINVAL, change:
The length argument was less than 0.
to:
The length argument is less than 0 or the path argument refers to a file, other than a directory, on which this operation is not possible (for example, a FIFO or socket).

(The editors may also want to rearrange both ERRORS sections into alphabetical order.)
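
A quick way to see what an implementation currently does for the pipe case is sketched below. The helper name is made up; under the proposed wording the expected result would be EINVAL, but the sketch just reports whatever the implementation returns.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: calls ftruncate() on the write end of a pipe.
   Returns 0 if the call succeeded, the errno value if it failed,
   or -1 if the pipe could not be created.
   Example: assert(ftruncate_pipe_errno() == EINVAL); */
int ftruncate_pipe_errno(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    errno = 0;
    int rc = ftruncate(fds[1], 0);      /* fds[1] is open for writing */
    int err = (rc == -1) ? errno : 0;   /* capture errno before close() */

    close(fds[0]);
    close(fds[1]);
    return err;
}
```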
There are no notes attached to this issue.




1380 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Clarification Requested 2020-07-22 14:23 2020-07-22 14:23
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
3.192 Hard Link, 3.208 Link
64, 66
1891, 1951
---
Definitions of Link and Hard Link don't match some usage
The term "link" is defined in XBD 3.208 as a synonym for "directory entry", and "hard link" is defined in XBD 3.192 as "The relationship between two directory entries that represent the same file"; it also says that a hard link is "the result of an execution of the ln utility (without the -s option) or the link() function".

This does not match how these terms are often used in the standard. There are many places where it uses "link" to refer to both hard links and symbolic links, e.g. in this paragraph on the fstatat() page:
The lstat() function shall be equivalent to stat(), except when path refers to a symbolic link. In that case lstat() shall return information about the link, while stat() shall return information about the file the link references.
(In the second sentence both uses of "the link" do not fit the definition; they are being used to refer back to "symbolic link" in the previous sentence.)

Also, by a strict reading of the definition, the term "hard link" should be used only in contexts where a file has multiple links, but the standard uses it to refer to cases that include the possibility of a file with only one directory entry, e.g. on the <sys/stat.h> page where st_nlink is described as the "number of hard links".

There are, of course, also places where the terms are used correctly (mainly in text that dates back to before symbolic links were added).

There are two ways the problem could be fixed:

1. Keep the current definitions and change the text in places where it misuses the terms.

2. Change the definitions, and the text in places where changing the definitions changed the meaning.

The proposed changes do the latter. This is mainly because the current definitions are at odds with the way "link" and "hard link" are commonly used outside the context of the standard. I believe changing the definitions would make the text more natural and easier to understand.
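
The st_nlink point can be seen with a small sketch: a freshly created file with a single directory entry already reports one "hard link", and link() takes the count to two. The function name and the paths under /tmp are invented for the example, and it assumes the file system there supports link().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: records st_nlink after creating a file, after
   adding a second directory entry with link(), and after removing that
   entry again. Returns 0 on success, -1 on failure.
   Example: long c[3]; assert(hard_link_counts(c) == 0); */
int hard_link_counts(long counts[3])
{
    char path1[] = "/tmp/nlink_demo_XXXXXX";   /* assumed scratch location */
    char path2[sizeof path1 + 8];
    struct stat st;
    int rc = -1;

    int fd = mkstemp(path1);
    if (fd == -1)
        return -1;
    close(fd);
    snprintf(path2, sizeof path2, "%s.link", path1);

    if (stat(path1, &st) == -1)
        goto out;
    counts[0] = (long)st.st_nlink;      /* one directory entry */

    if (link(path1, path2) == -1)
        goto out;
    if (stat(path1, &st) == -1) {
        unlink(path2);
        goto out;
    }
    counts[1] = (long)st.st_nlink;      /* two directory entries */

    if (unlink(path2) == -1)
        goto out;
    if (stat(path1, &st) == -1)
        goto out;
    counts[2] = (long)st.st_nlink;      /* back to one */
    rc = 0;
out:
    unlink(path1);
    return rc;
}
```

On a typical file system the three values would be 1, 2 and 1, which is the usage of "number of hard links" that the current definition in XBD 3.192 does not strictly allow for the single-entry case.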
On page 53 line 1648 section 3.130 Directory Entry, change:
Directory Entry (or Link)
to:
Directory Entry (or Hard Link)

On page 56 line 1697 section 3.144 Empty Directory, change:
A directory that contains, at most, directory entries for dot and dot-dot, and has exactly one link to it (other than its own dot entry, if one exists), in dot-dot. No other links to the directory may exist.
to:
A directory that contains, at most, directory entries for dot and dot-dot, and has exactly one hard link to it other than its own dot entry (if one exists), in dot-dot. No other hard links to the directory can exist.

On page 64 line 1891 section 3.192 Hard Link, change:
The relationship between two directory entries that represent the same file; see also Section 3.130 (on page 53). The result of an execution of the ln utility (without the -s option) or the link() function.
to:
See Directory Entry in Section 3.130 (on page 53). A file can have multiple hard links as a result of an execution of the ln utility (without the -s option) or the link() function.

On page 66 line 1951 section 3.208 Link, change:
See Directory Entry in Section 3.130 (on page 53).
to:
In the context of the file hierarchy, either a hard link or a symbolic link.

In the context of the c99 utility, the action performed by the link editor (or linker).

<small>Note:
The c99 utility is defined in detail in the Shell and Utilities volume of POSIX.1-2017.</small>

On page 235 line 7917 section <errno.h>, change:
[EMLINK] Too many links.
to:
[EMLINK] Too many hard links.

On page 236 line 7967 section <errno.h>, change:
[EXDEV] Cross-device link.
to:
[EXDEV] Improper hard link.

On page 413 line 14030 section <tar.h> LNKTYPE, change:
Link.
to:
Hard link.


Cross-volume changes to XSH ...

On page 484 line 16712 section 2.3 Error Numbers (EMLINK), change:
Too many links.
to:
Too many hard links.

On page 488 line 16865 section 2.3 Error Numbers (EXDEV), change:
Improper link. A link to a file on another file system was attempted.
to:
Improper hard link. Creation of a hard link to a file on another file system was attempted.

On page 965 line 32816 section fstatat(), change:
st_nlink shall be set to the number of links to the file
to:
st_nlink shall be set to the number of hard links to the file

On page 965 line 32823 section fstatat(), change:
the number of (hard) links to the symbolic link
to:
the number of hard links to the symbolic link

On page 1243 line 41492 section link(), change:
link one file to another file
to:
hard link one file to another file

On page 1243 line 41500,41503,41507,41525,41531 section link(), change:
a new link
to:
a new hard link

On page 1244 line 41545 section link(), change:
The number of links
to:
The number of hard links

On page 1244 line 41568 section link() EXDEV, change:
The link named by path2 and the file named by path1 are on different file systems and the implementation does not support links between file systems
to:
The file named by path1 and the directory in which the directory entry named by path2 is to be created are on different file systems and the implementation does not support hard links between file systems.

On page 1245 line 41590 section link() EXAMPLES, change:
Creating a Link to a File

The following example shows how to create a link to a file
to:
Creating a Hard Link to a File

The following example shows how to create an additional hard link to a file

On page 1245 line 41599 section link() EXAMPLES, change:
Creating a Link to a File Within a Program

In the following program example, the link() function links the
to:
Creating a Hard Link to a File Within a Program

In the following program example, the link() function hard links the

On page 1246 line 41617 section link() APPLICATION USAGE, change:
Some implementations do allow links between file systems.
to:
Some implementations do allow hard links between file systems.

On page 1246 line 41621 section link() RATIONALE, change:
Linking to a directory
to:
Creating additional hard links to a directory

On page 1246 line 41625 section link() RATIONALE, change:
allow linking of files on different file systems
to:
allow hard linking of files on different file systems

On page 1246 line 41627 section link() RATIONALE, change:
The exception for cross-file system links is intended to apply only to links that are programmatically indistinguishable from ``hard'' links.
to:
The exception for cross-file system hard links is intended to apply only to links that are programmatically indistinguishable from traditional hard links.

On page 1246 line 41634 section link() RATIONALE, change:
The AT_SYMLINK_FOLLOW flag allows for implementing both common behaviors of the link() function. The POSIX specification requires that if path1 is a symbolic link, a new link for the target of the symbolic link is created. Many systems by default or as an alternative provide a mechanism to avoid the implicit symbolic link lookup and create a new link for the symbolic link itself.

Earlier versions of this standard specified only the link() function, and required it to behave like linkat() with the AT_SYMLINK_FOLLOW flag. However, historical practice from SVR4 and Linux kernels had link() behaving like linkat() with no flags, and many systems that attempted to provide a conforming link() function did so in a way that was rarely used, and when it was used did not conform to the standard (e.g., by not being atomic, or by dereferencing the symbolic link incorrectly). Since applications could not rely on link() following links in practice, the linkat() function was added taking a flag to specify the desired behavior for the application.
to:
Earlier versions of this standard specified only the link() function, and required it to behave like linkat() with the AT_SYMLINK_FOLLOW flag. However, historical practice from SVR4 and Linux kernels had link() behaving like linkat() with no flags, and many systems that attempted to provide a conforming link() function did so in a way that was rarely used, and when it was used did not conform to the standard (e.g., by not being atomic, or by dereferencing the symbolic link incorrectly). Since applications could not rely on link() following symbolic links in practice, the linkat() function was added taking a flag to specify the desired behavior for the application.

On page 1816 line 58819 section rename(), change:
If the old argument points to the pathname of a file that is not a directory, the new argument shall not point to the pathname of a directory. If the link named by the new argument exists, it shall be removed and old renamed to new. In this case, a link named new shall remain visible to other threads throughout the renaming operation and refer either to the file referred to by new or old before the operation began. Write access permission is required for both the directory containing old and the directory containing new.

If the old argument points to the pathname of a directory, the new argument shall not point to the pathname of a file that is not a directory. If the directory named by the new argument exists, it shall be removed and old renamed to new. In this case, a link named new shall exist throughout the renaming operation and shall refer either to the directory referred to by new or old before the operation began. If new names an existing directory, it shall be required to be an empty directory.
to:
If the old argument names a file that is not a directory and the new argument names a directory, or old names a directory and new names a file that is not a directory, or new names a directory that is not empty, rename() shall fail. Otherwise, if the directory entry named by new exists, it shall be removed and old renamed to new. In this case, a directory entry named new shall remain visible to other threads throughout the renaming operation and refer either to the file referred to by new or old before the operation began.

(Note that the sentence about write access permission is intentionally
dropped as it duplicates line 58834.)

On page 1816 line 58837 section rename(), change:
If the link named by the new argument exists and the file's link count becomes 0
to:
If the new argument names an existing file and the file's link count becomes 0

On page 1817 line 58871 section rename() EEXIST, change:
The link named by new is a directory that is not an empty directory.
to:
The new argument names a directory that is not empty.

On page 1818 line 58883 section rename() ENOENT, change:
The link named by old does not name an existing file
to:
The old argument does not name an existing file

On page 1818 line 58905 section rename() EXDEV, change:
The links named by new and old are on different file systems and the implementation does not support links between file systems.
to:
The file named by old and the directory in which the directory entry named by new is to be created or replaced are on different file systems and the implementation does not support hard links between file systems.

On page 2197 line 70216 section unlink(), change:
The unlink() function shall remove a link to a file. If path names a symbolic link, unlink() shall remove the symbolic link named by path and shall not affect any file or directory named by the contents of the symbolic link. Otherwise, unlink() shall remove the link named by the pathname pointed to by path and shall decrement the link count of the file referenced by the link.
to:
The unlink() function shall remove the directory entry named by path and shall decrement the link count of the file referenced by the directory entry. If path names a symbolic link, unlink() shall remove the symbolic link and shall not affect any file named by the contents of the symbolic link.
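
The symbolic link sentence in the reworded unlink() description can be sketched as follows. The helper name and scratch paths are made up for the example; it simply checks that after unlink() on a symbolic link the link is gone and the file it named is still there.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if unlink() on a symbolic link removed
   the symbolic link and left the named file alone, 0 if not, and -1 on
   setup failure.
   Example: assert(unlink_removes_symlink_only() == 1); */
int unlink_removes_symlink_only(void)
{
    char target[] = "/tmp/unlink_target_XXXXXX";   /* assumed scratch location */
    char linkpath[sizeof target + 8];
    struct stat st;

    int fd = mkstemp(target);
    if (fd == -1)
        return -1;
    close(fd);
    snprintf(linkpath, sizeof linkpath, "%s.sym", target);

    if (symlink(target, linkpath) == -1) {
        unlink(target);
        return -1;
    }

    int rc = unlink(linkpath);          /* path names a symbolic link */
    int link_gone = (lstat(linkpath, &st) == -1 && errno == ENOENT);
    int target_kept = (stat(target, &st) == 0);

    unlink(target);                     /* clean up the regular file */
    if (rc == -1)
        return -1;
    return link_gone && target_kept;
}
```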


Cross-volume changes to XCU ...

On page 2897 line 95692 section ln, change:
In the first synopsis form, the ln utility shall create a new directory entry (link) at the destination path specified by the target_file operand. If the -s option is specified, a symbolic link shall be created for the file specified by the source_file operand.
to:
In the first synopsis form, the ln utility shall create a new directory entry at the destination path specified by the target_file operand. If the -s option is specified, a symbolic link shall be created with the contents specified by the source_file operand (which need not name an existing file); otherwise, a hard link shall be created to the file named by the source_file operand.

On page 2897 line 95697 section ln, change:
In the second synopsis form, the ln utility shall create a new directory entry (link), or if the -s option is specified a symbolic link, for each file specified by a source_file operand, at a destination path in the existing directory named by target_dir.
to:
In the second synopsis form, the ln utility shall create a new directory entry for each source_file operand, at a destination path in the existing directory named by target_dir. If the -s option is specified, a symbolic link shall be created with the contents specified by each source_file operand (which need not name an existing file); otherwise, a hard link shall be created to each file named by a source_file operand.

On page 2899 line 95743,95745 section ln (-L & -P options), change:
create a (hard) link
to:
create a hard link

On page 3075 line 102555 section pax (-u option), change:
In copy mode, the file in the destination hierarchy shall be replaced by the file in the source hierarchy or by a link to the file in the source hierarchy if the file in the source hierarchy is newer.
to:
In copy mode, the file in the destination hierarchy shall be replaced if the file in the source hierarchy is newer.

On page 3087 line 103039 section pax (ustar Interchange Format), change:
1 (a link) or 2 (a symbolic link)
to:
1 (a hard link) or 2 (a symbolic link)

On page 3093 line 103232 section pax (CONSEQUENCES OF ERRORS), change:
cannot create a link to a file
to:
cannot create a hard link to a file

On page 3101 line 103596 section pax (RATIONALE), change:
Links are recorded in the fashion described here because a link can be to any file type. It is desirable in general to be able to restore part of an archive selectively and restore all of those files completely. If the data is not associated with each link, it is not possible to do this. However, the data associated with a file can be large, and when selective restoration is not needed, this can be a significant burden. The archive is structured so that files that have no associated data can always be restored by the name of any link name of any link, and the user may choose whether data is recorded with each instance of a file that contains data. The format permits mixing of both types of links in a single archive; this can be done for special needs, and pax is expected to interpret such archives on input properly, despite the fact that there is no pax option that would force this mixed case on output.
to:
Hard links are recorded in the fashion described here because a hard link can be to any file type. It is desirable in general to be able to restore part of an archive selectively and restore all of those files completely. If the data is not associated with each hard link, it is not possible to do this. However, the data associated with a file can be large, and when selective restoration is not needed, this can be a significant burden. The archive is structured so that files that have no associated data can always be restored by the name of any link name of any hard link, and the user can choose whether data is recorded with each instance of a file that contains data. The format permits mixing of hard links with data and hard links without data in a single archive; this can be done for special needs, and pax is expected to interpret such archives on input properly, despite the fact that there is no pax option that would force this mixed case on output.

On page 3200 line 107351 section rm (APPLICATION USAGE), change:
do not permit the removal of the last link
to:
do not permit the removal of the last hard link


Cross-volume changes to XRAT ...

On page 3491 line 117998 section A.3 Definitions, delete:
Directory Entry

Throughout POSIX.1-2017, the term ``link'' is used (about the link() function, for example) in describing the objects that point to files from directories.

There are no notes attached to this issue.




1379 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2020-07-20 08:36 2020-07-20 08:36
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
pax
3078
102671-102680
---
pax environment variables and affirmative responses
The pax utility has the usual boilerplate in the ENVIRONMENT VARIABLES section that utilities which read yes/no responses have. However, it doesn't prompt for that type of response. (The responses for -i are pathnames, not a y/n response.)

Also, LC_COLLATE refers to the pattern operand and the -s option in the part about equivalence classes, but they are missing from the corresponding LC_CTYPE text about character classes.
On page 3078 line 102671 section pax, change:
... used in the pattern matching expressions for the pattern operand, the basic regular expression for the -s option, and the extended regular expression defined for the yesexpr locale keyword in the LC_MESSAGES category.
to:
... used in the pattern matching expressions for the pattern operand and the basic regular expression for the -s option.

On page 3078 line 102674 section pax, change:
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multi-byte characters in arguments and input files), the behavior of character classes used in the extended regular expression defined for the yesexpr locale keyword in the LC_MESSAGES category, and pattern matching.
to:
Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multi-byte characters in arguments and input files), and the behavior of character classes used in the pattern matching expressions for the pattern operand and the basic regular expression for the -s option.

On page 3078 line 102680 section pax, change:
Determine the locale used to process affirmative responses, and the locale used ...
to:
Determine the locale used ...

There are no notes attached to this issue.




1378 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Omission 2020-07-17 15:39 2020-07-17 15:39
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
ex
2699
88112
---
ex LC_MESSAGES and affirmative responses
Utilities that read yes/no responses have some boilerplate in the LC_MESSAGES entry in ENVIRONMENT VARIABLES that is missing from the ex page.

It is needed because the s command has a c flag that asks for confirmation.

(Note that the LC_COLLATE and LC_CTYPE entries do not need to mention yesexpr explicitly the way some utilities do, since they already cover that as part of their general statement about regular expressions.)
Change:
Determine the locale that ...
to:
Determine the locale used to process affirmative responses, and the locale that ...

There are no notes attached to this issue.




1377 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2020-07-17 09:32 2020-07-17 09:32
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
at
2474
79386
---
at -q addition from POSIX.2b incorrectly applied
The description of the at -q option says:
If -q is specified along with either of the -t time_arg or timespec arguments, the results are unspecified.

This text is the result of an editorial mistake when applying the changes from POSIX.2b, where the addition to be made was given as:
If -q b is specified ...

I.e. this is about what happens if you specify the batch queue and also specify a time/date.
Change:
If -q is specified
to:
If -q b is specified

There are no notes attached to this issue.




1376 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-07-16 08:19 2020-07-16 08:19
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
asctime(), ctime(), gmtime(), localtime()
600, 727, 1113, 1265
20907, 24771, 37681, 42218
---
CX shading on C99 time conversion functions
The descriptions of asctime(), ctime(), gmtime(), and localtime() should not have CX shading on this paragraph:
The asctime(), ctime(), gmtime(), and localtime() functions shall return values in one of two static objects: a broken-down time structure and an array of type char. Execution of any of the functions may overwrite the information returned in either of these objects by any of the other functions.

The text is derived from the introductory paragraph for time conversion functions in C99 (7.23.3 para 1 in N1256). There is also a wording difference with C99 that is worth fixing.
On page 600 line 20907 section asctime(),
page 727 line 24771 section ctime(),
page 1113 line 37681 section gmtime(),
page 1265 line 42218 section localtime(), remove CX shading from:
The asctime(), ctime(), gmtime(), and localtime() functions shall return values in one of two static objects: a broken-down time structure and an array of type char. Execution of any of the functions may overwrite the information returned in either of these objects by any of the other functions.
and change it to:
The asctime(), ctime(), gmtime(), and localtime() functions shall return values in one of two static objects: a broken-down time structure and an array of type char. Execution of any of the functions that return a pointer to one of these object types may overwrite the information in any object of the same type pointed to by the value returned from any previous call to any of them.
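
In practical terms the reworded sentence means an application that needs a result to survive a later call has to copy it first. A minimal sketch, with an invented function name and arbitrary time values:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: saves the broken-down time from a first gmtime()
   call, makes a second call, and returns the saved tm_year
   (or -1 if gmtime() failed).
   Example: assert(saved_year_after_second_call() == 70); */
int saved_year_after_second_call(void)
{
    time_t first = 0;                              /* 1970-01-01 00:00:00 UTC */
    time_t second = (time_t)400 * 24 * 60 * 60;    /* 400 days later, in 1971 */

    struct tm *p = gmtime(&first);
    if (p == NULL)
        return -1;
    struct tm saved = *p;       /* copy before any further call */

    struct tm *q = gmtime(&second);
    if (q == NULL)
        return -1;

    /* p and q may point to the same static object, in which case *p now
       describes `second`; only `saved` is guaranteed to be unchanged. */
    return saved.tm_year;
}
```

The same applies to the char array returned by asctime() and ctime().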

There are no notes attached to this issue.




1375 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-07-15 14:08 2020-07-15 14:09
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
scanf(), fwscanf()
950, 1004, 1006
32272, 34177, 34248, 34263
---
*scanf() 'm' allocation char v. wchar_t problems
Various parts of the text relating to the 'm' allocation character on the fscanf() and fwscanf() pages do not correctly account for the difference between conversions that use the 'l' modifier and those that don't.

In some cases the text is improved by bug 0001173 but is still affected.
On page 950 line 32272 section fscanf(), after applying bug 1173 change:
... terminating null character.
to:
... terminating null character or wide character.

On page 1004 line 34177 section fwscanf(), after applying bug 1173 change:
... terminating null wide character.
to:
... terminating null character or wide character.

On page 1006 line 34248,34263 section fwscanf(), change two occurrences of:
wchar_t
to:
char
(This results in 's' and '[' each having one char and one wchar_t, the same as 'c'.)
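
The char versus wchar_t distinction can be sketched with sscanf() as follows. This assumes an implementation that supports the 'm' assignment-allocation character; first_word_alloc is a hypothetical name, and the input is assumed to be valid in the current locale.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: reads the first whitespace-delimited word of input
 * twice, once as a char string (%ms) and once as a wide string (%mls).
 * On success returns 0 and the caller must free() both buffers. */
int first_word_alloc(const char *input, char **narrow, wchar_t **wide)
{
    *narrow = NULL;
    *wide = NULL;

    /* Without 'l', the 'm' conversion stores a pointer to char. */
    if (sscanf(input, "%ms", narrow) != 1)
        return -1;

    /* With 'l', the same conversion stores a pointer to wchar_t. */
    if (sscanf(input, "%mls", wide) != 1) {
        free(*narrow);
        *narrow = NULL;
        return -1;
    }
    return 0;
}
```

The allocated buffer ends with a null character in the first case and a null wide character in the second, which is the distinction the changed wording is meant to cover.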
There are no notes attached to this issue.




Viewing Issue Advanced Details
1374 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-07-15 09:30 2020-07-15 14:22
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
ungetwc()
2195
70178
---
Value of file-position indicator after ungetwc()
The ungetwc() page says "The file-position indicator is decremented (by one or more) by each successful call to ungetwc(); if its value was 0 before a call, its value is unspecified after the call."

(Bug 0000701 changes "indicator is decremented" to "indicator for the stream shall be decremented".)

This implies that if the position is at 1 byte and ungetwc() pushes back a wide character that converts to 2 bytes, then the position is required to be set to -1.

The current text derives from XPG4 and does not match C99 (and is not CX shaded). The C99 text is the same as the original (1995) MSE spec. It says "the value of its file position indicator after a successful call to the ungetwc function is unspecified until all pushed-back wide characters are read or discarded".

It appears that when the MSE was incorporated into SUSv2 this discrepancy went unnoticed and the defective XPG4 text has been retained until now.
After applying bug 701, change:
The file-position indicator for the stream shall be decremented (by one or more) by each successful call to ungetwc(); if its value was 0 before a call, its value is unspecified after the call. The value of the file-position indicator after all pushed-back characters have been read shall be the same as it was before the characters were pushed back.
to:
The value of the file-position indicator for the stream after a successful call to ungetwc() is unspecified until all pushed-back wide characters are read or discarded; its value after all pushed-back wide characters have been read shall be the same as it was before the wide characters were pushed back.
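
The part of the proposed wording that remains a requirement (the position is restored once the pushed-back wide characters have been read) can be checked with a sketch like the one below. It assumes tmpfile() is usable and that the stream contains only characters valid in the current locale; ungetwc_position_restored is a hypothetical name.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical check: returns 1 if the file position after re-reading a
 * pushed-back wide character equals the position before the push-back,
 * 0 if it differs, and -1 if the setup fails. */
int ungetwc_position_restored(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    int result = -1;
    if (fputws(L"abc", f) >= 0 && fseek(f, 0L, SEEK_SET) == 0) {
        wint_t wc = fgetwc(f);
        long before = ftell(f);

        /* Between ungetwc() and the matching read, the position is
         * unspecified under the proposed wording, so it is not queried. */
        if (wc != WEOF && before != -1L && ungetwc(wc, f) != WEOF
            && fgetwc(f) == wc) {
            long after = ftell(f);
            result = (after == before);
        }
    }
    fclose(f);
    return result;
}
```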

Notes
(0004897)
shware_systems   
2020-07-15 14:22   
I think both XPG4 and the C standard are in error. The behavior is nominally unspecified only when the effective file position is less than the number of chars to be pushed back. Otherwise the position will be greater than or equal to zero. The ungetc() description reflects this generalization for its particular case where the char count is 1; 0 as position on entry is the only problematic value for most physical media.

Note both descriptions are only valid for the limiting case where the bit width for a byte of physical media equals the CHAR_BIT value. Additional complications neither standard addresses exist when this isn't the case. As exemplified by the <termios.h> header this isn't the general case for serial ports accessed via FILE * records, where the bit width may be from 5 to 8. For the C standard, where CHAR_BIT may be up to 15, this also applies when physical media uses 8-bit bytes; the expected behavior is left undefined, not even unspecified.




Viewing Issue Advanced Details
1373 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2020-07-14 09:01 2020-07-14 09:01
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
test
3288
110620
---
The '[' utility does not conform to syntax guidelines 1 and 2
As noted in Note: 0004769 in bug 252, the description of the '[' form of the test utility needs to be updated to reflect the fact that it does not conform to Guidelines 1 and 2.
After:
The test utility shall not recognize the "--" argument in the manner specified by Guideline 10 in XBD Section 12.2 (on page 216).
add a new sentence:
In addition, when the utility name used is [ the utility does not conform to Guidelines 1 and 2.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1372 [1003.1(2016)/Issue7+TC2] Rationale Comment Error 2020-07-13 09:25 2020-07-13 09:25
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
A.3 (Mount Point)
3498
118283
---
Rationale says there is no mount point definition, but there is
XRAT A.3 has an entry for "Mount Point" with an asterisk, indicating that there is no definition for "Mount Point". This entry should have been removed when the definition of "Mount Point" was added to XBD.

There is also a cross-reference to this entry under "Root of a File System".
On page 3498 line 118283 section A.3, delete:
Mount Point*

The directory on which a ``mounted file system'' is mounted. This term, like mount() and umount(), was not included because it was implementation-defined.

On page 3501 line 118411 section A.3, change:
Implementation-defined; see Mount Point* (on page 3498).
to:
Commonly used to refer to a mount point; this standard uses the latter.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1371 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Clarification Requested 2020-07-10 14:29 2020-07-10 14:29
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
3.370 Stream
93
2583
---
Definition of stream is XSH-specific
The definition of the term "stream" says it is a "file access object [...] as described by the ISO C standard" and says these objects can be created using fdopen(), etc.

This is XSH-specific, but the term needs to apply to XCU as well. In particular, the current definitions of "standard error", "standard input" and "standard output" are the ones from POSIX.2-1992 and they use the word "stream". In that standard, the definition of "stream" was:
An ordered sequence of characters, as described by the C Standard
which, although it also refers to the C Standard, does not associate "stream" with a file access object.

The definition also says streams are "associated with a file descriptor", but this is not true for streams opened with fmemopen() or open_memstream().
Change:
Appearing in lowercase, a stream is a file access object that allows access to an ordered sequence of characters, as described by the ISO C standard. Such objects can be created by the fdopen(), fmemopen(), fopen(), open_memstream(), or popen() functions, and are associated with a file descriptor.
to:
Appearing in lowercase, a stream is an ordered sequence of characters, as described by the ISO C standard.

In the shell command language, each stream is associated with a file descriptor. These can be opened using redirection operators.

<small>Note:
Redirection is defined in detail in [xref to XCU 2.7].</small>
In the C language, each stream is accessed via a file access object and is either a stream associated with a file descriptor or a memory stream. A file access object associated with a file descriptor can be created by the fdopen(), fopen(), or popen() functions. A file access object for a memory stream can be created by the fmemopen() or open_memstream() functions.
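
The difference between a stream associated with a file descriptor and a memory stream can be seen from C with a sketch like the following. The helper names are hypothetical; the expectation that fileno() reports -1 for a memory stream reflects common implementations and is an assumption here rather than a quotation from the standard.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes msg to a memory stream and returns the
 * resulting heap buffer (caller frees), or NULL on failure. */
char *format_in_memory(const char *msg)
{
    char *buf = NULL;
    size_t len = 0;

    FILE *mem = open_memstream(&buf, &len);
    if (mem == NULL)
        return NULL;

    int ok = (fputs(msg, mem) != EOF);
    if (fclose(mem) != 0)               /* buf and len are final after fclose() */
        ok = 0;

    if (!ok) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Returns what fileno() reports for an fmemopen() stream; -1 is expected
 * because no file descriptor backs the stream. Returns -2 if the stream
 * cannot be created at all. */
int memory_stream_fileno(void)
{
    char storage[16] = "data";
    FILE *mem = fmemopen(storage, sizeof storage, "r");
    if (mem == NULL)
        return -2;
    int fd = fileno(mem);
    fclose(mem);
    return fd;
}
```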
There are no notes attached to this issue.




Viewing Issue Advanced Details
1370 [Issue 8 drafts] Rationale Editorial Error 2020-07-10 09:21 2020-07-10 09:21
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
B.1.1, C.1.1
3433, 3555
117421, 122714
Where change history starts
XRAT B.1.1 says:
The CHANGE HISTORY section for each entry details the technical changes that have been made to that entry from Issue 5. Changes between earlier versions of the base document and Issue 5 are not included.

and C.1.1 says:
The CHANGE HISTORY section for each utility describes technical changes made to that utility from Issue 5. Changes between earlier versions of the base document and Issue 5 are not included.

These statements are not actually true. The change history starts with an "Issue 5" subheading, but changes listed under that subheading give the changes made in Issue 5, not the changes from Issue 5. So the changes between Issue 4 Version 2 and Issue 5 are included, contrary to what the second sentence of each of those paragraphs says.

However, rather than fixing this text, perhaps what we should do is make it true. I.e. remove change history before the "Issue 6" subheading.

Looking back at how much change history was included in old versions:
  • XPG4 only had 1 heading: "Issue 4"
  • SUSv1 had 2: "Issue 4" and "Issue 4, Version 2"
  • SUSv2 had 3: "Issue 4", "Issue 4, Version 2", and "Issue 5"
  • SUSv3 had 2: "Issue 5" and "Issue 6"
  • SUSv4 had 3: "Issue 5", "Issue 6" and "Issue 7"
If we don't remove "Issue 5" in SUSv5, it will have 4 headings which is more than any previous version. Also, since Issue 6 was the POSIX.1/SUS merge, it seems like a good place to start the change history.
On each header, function, built-in utility, and utility page that has an "Issue 5" subheading under CHANGE HISTORY, delete from that heading down to (but not including) the next subheading.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1369 [Issue 8 drafts] Rationale Comment Error 2020-07-09 11:11 2020-07-09 11:12
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
A.8.3
3407
116496
COLUMNS and LINES rationale
Bug 0001185 made changes to the descriptions of COLUMNS and LINES in XBD, but did not change XRAT to match.
Change:
The default values for the number of column positions, COLUMNS, and screen height, LINES, are unspecified because historical implementations use different methods to determine values corresponding to the size of the screen in which the utility is run. This size is typically known to the implementation through the value of TERM, or by more elaborate methods such as extensions to the stty utility or knowledge of how the user is dynamically resizing windows on a bit-mapped display terminal.
to:
The default values for the number of column positions when COLUMNS is unset or null, and screen height when LINES is unset or null, are unspecified if the terminal window size cannot be obtained (from tcgetwinsize()) because historical implementations use different methods to determine the values.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1368 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Enhancement Request 2020-07-08 23:10 2020-07-08 23:10
steffen
 
normal  
New  
Open  
   
Steffen Nurpmeso
Vol. 3: Shell and Utilities, at, batch
2479, 2523
79636 ff., 81458 ff.
---
Unworldly use of redirection and mailx(1) in at(1) and batch(1) examples.
The sequence

 at now + 1 hour <<!
 diff file1 file2 2>&1 >outfile | mailx mygroup
 !

(likewise for batch(1)) sends out a possibly empty message to the work group "mygroup", which is a very low quality example that seems unworldly.
If the -E option of mailx(1) becomes standardized (as requested in issue #1367) the above could become wonderful examples for the power of the *X shell environment, and simply be changed to

 at now + 1 hour <<!
 diff file1 file2 2>&1 >outfile | mailx -E mygroup
 !

(likewise for batch(1)). Otherwise change

 at now + 1 hour <<!
 diff file1 file2 2>&1 >outfile | mailx -E mygroup
 !

to

 2. This sequence, which demonstrates redirecting standard error to a pipe while carefully avoiding false conditional status codes, is useful in a command procedure (the sequence of output redirection specifications is significant):

 at now + 1 hour <<!
 exec >errfile
 diff file1 file2 2>&1 >outfile
 if [ -s errfile ]; then
   < errfile mailx mygroup
 fi
 !
There are no notes attached to this issue.




Viewing Issue Advanced Details
1367 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Enhancement Request 2020-07-08 22:55 2020-07-09 13:25
steffen
 
normal  
New  
Open  
   
Steffen Nurpmeso
Vol. 3: Shell and Utilities, mailx
2943, 2944
97431, 97471 ff.
---
mailx: add -E option to discard (not send) empty messages
The mailx variants of Apple, NetBSD, OpenBSD as well as other Open Source incarnations, support a command line option -E that rejects sending empty messages, successfully.

This is very helpful in scripted use cases since possible error notifications will only be sent out if necessary.

The backing value ("INTERNAL VARIABLE") is different (skipemptybody, dontsendempty, skipempty, to name a few), but the behaviour of the -E option itself is identical.

The only known mailx incarnation which does not support -E is Solaris (OpenIndiana inspected), but the implementation is simplistic: since mailx warns on empty messages, the code change would need to turn

  if (fsize(mtf) == 0 && hp->h_subject == NOSTR) {
    printf(gettext("No message !?!\n"));
    goto out;
  }

into

  if (fsize(mtf) == 0) {
    if (value("dontsendempty") != NOSTR)
      goto jout;
    if (hp->h_subject == NOSTR) {
      printf(gettext("No message !?!\n"));
      goto out;
    }
  }
On page 2943, line 97431, change

  mailx [−s subject] address...

into

  mailx [-E] [−s subject] address...

On page 2944, insert after line 97471

  -E Discard messages with an empty message body, successfully.
Notes
(0004895)
geoffclare   
2020-07-09 08:01   
Line 98259 in EXIT STATUS also needs to change. I suggest changing:
Successful completion; note that this status implies that all messages were sent, ...
to:
Successful completion; note that this status implies that all messages were sent, or successfully discarded (see -E), ...


(0004896)
steffen   
2020-07-09 13:25   
..and the Solaris / OpenIndiana should allow all-empty messages by default, nothing is wrong with them, they are the shortest possible mail-based notification ("ping"), which in practice is nice since in practice the MTA / LDA (Mail-Transfer-Agent, Local-Delivery-Agent) adds at least a so-called From_, but especially the former also a From: line, for example

  #?0|kent:steffen$ </dev/null mailx -:/ root
  mailx: No message, no subject; hope that's ok
  #?0|kent:steffen$ tail -n 13 /var/spool/mail/steffen

  From steffen@localhost Thu Jul 9 15:22:56 2020
  Received: from steffen (uid 1000)
          (envelope-from steffen@localhost)
          id a791
          by kent (DragonFly Mail Agent v0.13);
          Thu, 09 Jul 2020 15:22:56 +0200
  Date: Thu, 09 Jul 2020 15:22:56 +0200
  To: root
  User-Agent: mailx v14.9.19
  Message-Id: <5f071a30.a791.1071c30e@kent>
  From: <steffen@localhost>

  #?0|kent:steffen$

The BSD based code does

        if (fsize(mtf) == 0) {
                if (value("skipempty") != NULL)
                        goto out;
                if (hp->h_subject == NULL || *hp->h_subject == '\0')
                        puts("No message, no subject; hope that's ok");
                else
                        puts("Null message body; hope that's ok");
        }

Which, finally, and as an off-topic note, makes me think the root of the related issue 0001368 was caused by experiences with SysV based mail.




Viewing Issue Advanced Details
1366 [Issue 8 drafts] Rationale Comment Error 2020-07-08 15:20 2020-07-08 15:20
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
A.3 Definitions (XSI)
3379
115320
Out of date rationale for XSI definition
The rationale for the XSI definition says:
The term ``XSI'' has been used for 10 years in connection with the XPG series and the first and second versions of the base volumes of the Single UNIX Specification. The XSI margin code was introduced to denote the extended or more restrictive semantics beyond POSIX that are applicable to UNIX systems.
The use of "has" was okay when the text was written (for Issue 6), but is now a problem.
Change:
The term ``XSI'' has been used for 10 years ...
to:
When POSIX.1 and the Single UNIX Specification were merged, the term ``XSI'' had been used for over 10 years ...
There are no notes attached to this issue.




Viewing Issue Advanced Details
1365 [Issue 8 drafts] Shell and Utilities Objection Error 2020-07-07 09:36 2020-07-07 09:37
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
rm
3066
103824
rm -v description implies only operands are reported
The changes from bug 0001154 to add rm -v were supposed to require that rm reports each file that it removes. However, the description of -v is:
After file has been removed, write a message to standard output indicating that it has been removed.
which implies that rm -rv only reports the removal of file operands and does not report the removal of files within a directory specified as an operand.

The changes at line 103806 correctly state that each removed file is reported.

Line 103858 in STDOUT also has two minor problems:
  • Use of "the file" is odd, as if rm only ever removes one file.
  • The phrase "being removed" implies that something could be written before the removal attempt is made, rather than (as per the other changes) only after successful removal.

On page 3066 line 103824 section rm, change:
After file has been removed
to:
After each file has been removed

On page 3067 line 103858 section rm, change:
information about the file being removed
to:
information about each removed file
There are no notes attached to this issue.




Viewing Issue Advanced Details
1364 [Issue 8 drafts] Shell and Utilities Comment Clarification Requested 2020-07-06 10:45 2020-07-06 10:46
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
2.7.2 Redirecting Output
2287
73909
use of noclobber with files in /tmp
Some time after bug 0001016 was approved as an interpretation, Note: 0003655 was added in which McDutchie observed that the statement "Performing these operations atomically ensures that the creation of lock files and unique (often temporary) files is reliable" is not true if the file is in a public directory such as /tmp, since another user could create a FIFO or device file with the same name.
On page 2287 line 73900 section 2.7.2, and
page 3587 line 124071 section C.2.7.2, add to the "Notes to Reviewers":
If option 2 is adopted in a future draft, note that the change from <URL> will need to be reapplied to the option 2 text.

where <URL> is the URL for this bug.

On page 2287 line 73909 section 2.7.2, change:
Performing these operations atomically ensures that the creation of lock files and unique (often temporary) files is reliable.
to:
Performing these operations atomically ensures that the creation of lock files and unique (often temporary) files is reliable, provided the files are created in a private directory (i.e. not in /tmp or similar).

On page 3588 line 124127 section C.2.7.2, change:
The standard developers consider this to be of less importance than ensuring that the creation of lock files is reliable.
to:
The standard developers consider this to be of less importance than ensuring that the creation of lock files is reliable (in a private directory).

Creation of lock files and unique (often temporary) files with noclobber set is only reliable provided the files are created in a private directory. If a directory such as /tmp is used where other users can create files, then another user could create a FIFO (or a device file, given sufficient privilege) with the same name, causing multiple redirections with noclobber set to open the existing non-regular file instead of one succeeding and the others failing.
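
At the C level, the same advice corresponds to combining exclusive creation with a directory that only the caller can write to. The following is a minimal sketch under that assumption; the function name and the "/tmp/lockdemo.XXXXXX" template are hypothetical, and the private directory is created with mkdtemp() so that no other unprivileged user can pre-create the lock name inside it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical demonstration: creates a private directory with mkdtemp(),
 * then tries to create the same lock file twice with O_CREAT|O_EXCL.
 * Returns 1 if the first attempt succeeds and the second fails with
 * EEXIST, 0 if not, and -1 if the private directory cannot be created. */
int exclusive_create_in_private_dir(void)
{
    char dir[] = "/tmp/lockdemo.XXXXXX";
    if (mkdtemp(dir) == NULL)           /* directory is created with mode 0700 */
        return -1;

    char path[sizeof dir + sizeof "/lock"];
    snprintf(path, sizeof path, "%s/lock", dir);

    int first = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    int second = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    int second_errno = errno;

    int ok = (first >= 0 && second == -1 && second_errno == EEXIST);

    if (first >= 0)
        close(first);
    if (second >= 0)
        close(second);
    unlink(path);
    rmdir(dir);
    return ok;
}
```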

There are no notes attached to this issue.




Viewing Issue Advanced Details
1363 [Issue 8 drafts] System Interfaces Comment Enhancement Request 2020-07-03 13:44 2020-07-03 13:44
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
wait()
2154
69578
out of date wait() RATIONALE regarding core dump indication
The text:
Implementations that support implementation-defined actions, such as the creation of a file containing a core image, on termination of some processes traditionally provide a bit in the status returned by wait() to indicate that such actions have occurred.

is out of date because the way to query this using WCOREDUMP() has now been added to the standard.
Change:
Implementations that support implementation-defined actions, such as the creation of a file containing a core image, on termination of some processes traditionally provide a bit in the status returned by wait() to indicate that such actions have occurred.
to:
On implementations that support the creation of a file containing a core image on some process terminations, the WCOREDUMP(stat_val) macro indicates whether creation of a core image was attempted. If it returns a non-zero value this does not necessarily mean that the core image was created, only that it was attempted. For example, if the RLIMIT_CORE limit for the process is 0, this prevents creation of the file; WCOREDUMP(stat_val) returning non-zero in this case indicates that the file would have been created if the limit had not been 0.
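
A minimal sketch of querying the status this way is shown below. The helper name abort_child_status is hypothetical, and the #ifdef guard is an assumption made because WCOREDUMP is not visible under every feature-test configuration on current systems.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that calls abort(), waits for it,
 * and returns the terminating signal number (or -1 on failure).
 * *core_attempted is set to 1 or 0 where WCOREDUMP is available,
 * and to -1 where the macro is not provided. */
int abort_child_status(int *core_attempted)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        abort();                        /* child: terminate via SIGABRT */

    int stat_val;
    if (waitpid(pid, &stat_val, 0) != pid || !WIFSIGNALED(stat_val))
        return -1;

#ifdef WCOREDUMP
    /* Non-zero means a core image was attempted, not that a file exists. */
    *core_attempted = WCOREDUMP(stat_val) ? 1 : 0;
#else
    *core_attempted = -1;
#endif
    return WTERMSIG(stat_val);
}
```

Whether the flag is set depends on the environment, for example on the RLIMIT_CORE setting discussed above, so callers should not treat it as proof that a core file exists.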
There are no notes attached to this issue.




Viewing Issue Advanced Details
1362 [Issue 8 drafts] System Interfaces Editorial Error 2020-07-03 09:35 2020-07-03 09:35
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
posix_spawn()
1428
47825-47851
small editorial fixes to posix_spawn()
In the ERRORS section there are several occurrences of:
(or, if the error occurs after the calling process successfully returns, the child process shall exit with exit status 127)

This is intended to state a requirement (hence the use of "shall") and therefore should not be in parentheses. Only one occurrence is new in Issue 8 draft 1, but the others need to be fixed as well.

Another small editorial fix that doesn't seem worth submitting a separate bug for is a use of the phrase "relative file names".
On page 1428 line 47825-47851 section posix_spawn(), change 6 occurrences of:
... (or, if the error occurs after the calling process successfully returns, the child process shall exit with exit status 127).
to:
...; or, if the error occurs after the calling process successfully returns, the child process shall exit with exit status 127.

On page 1426 line 47733 section posix_spawn(), change:
relative file names
to:
relative pathnames
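
The two ways an error can surface under the "exit status 127" wording (an error return to the caller, or a child that exits with status 127) can be handled together, as in this sketch. The program name "no-such-program-for-demo" and the function name are hypothetical.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical check: tries to spawn a program that should not exist.
 * Returns 1 if the failure is reported in either permitted way
 * (error return from posix_spawnp(), or child exit status 127),
 * 0 if neither is observed, and -1 if waiting for the child fails. */
int spawn_failure_reported(void)
{
    char *argv[] = { "no-such-program-for-demo", NULL };
    pid_t pid;

    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0)
        return 1;                       /* error detected before returning */

    int stat_val;
    if (waitpid(pid, &stat_val, 0) != pid)
        return -1;

    /* Error detected after the parent's call returned successfully. */
    return WIFEXITED(stat_val) && WEXITSTATUS(stat_val) == 127;
}
```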
There are no notes attached to this issue.




Viewing Issue Advanced Details
1361 [Issue 8 drafts] System Interfaces Objection Error 2020-07-02 13:52 2020-07-02 13:53
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
fork()
877
29947
fork() changes incomplete
The changes from bug 0000062 to add _Fork() and make fork() non-async-signal-safe missed some things on the fork() page:

The text "When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined" in one of the later bullet items is redundant now that fork() itself is not async-signal-safe.

There is a paragraph in RATIONALE explaining why the above text is there.
On page 877 line 29947 section fork(), delete:
When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.

On page 879 line 30059 section fork(), delete:
While the fork() function is async-signal-safe, there is no way for an implementation to determine whether the fork handlers established by pthread_atfork() are async-signal-safe. The fork handlers may attempt to execute portions of the implementation that are not async-signal- safe, such as those that are protected by mutexes, leading to a deadlock condition. It is therefore undefined for the fork handlers to execute functions that are not async-signal-safe when fork() is called from a signal handler.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1360 [Issue 8 drafts] System Interfaces Objection Clarification Requested 2020-07-01 15:59 2020-07-01 15:59
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
fdopendir()
816, 818
27933, 28030
opendir() file descriptor requirements
There are three issues with the description of opendir() related to the fd associated with a directory stream.

1. The phrase "If the type DIR is implemented using a file descriptor" is problematic, given that dirfd() no longer has an ENOTSUP error. I.e., for Issue 8, DIR always has the ability to store a fd; the optionality now is not if but when: the fd can be opened either by opendir() or by dirfd(). The suggested changes refer to the fd as "associated with" the stream, as that is the phrase used on the dirfd() page.

2. It describes opening with O_DIRECTORY and then in a separate paragraph says that FD_CLOEXEC is set. This implies that setting FD_CLOEXEC can be done separately, whereas to avoid race conditions the open should be done as if with O_CLOEXEC. (As is already required for dirfd().)

3. The wording "applications shall only be able to open up to a total of {OPEN_MAX} files and directories" has the same problem that all the EMFILE descriptions used to have. (They were reworded to use "file descriptors available to the process".)

Issue 1 also affects closedir().

In addition to these issues in the normative text, there are problems in RATIONALE too.
On page 816 line 27933 section fdopendir(), change:
If the type DIR is implemented using a file descriptor, applications shall only be able to open up to a total of {OPEN_MAX} files and directories.

If the type DIR is implemented using a file descriptor, the descriptor shall be obtained as if the O_DIRECTORY flag was passed to open().

If the type DIR is implemented using a file descriptor and a directory stream is opened by a successful call to opendir(), the FD_CLOEXEC flag shall be set on the file descriptor; see [xref to <fcntl.h>].
to:
If opendir() opens a file descriptor for dirname to associate with the returned stream:
  • The descriptor shall be allocated as if the O_DIRECTORY and O_CLOEXEC flags were passed to open().

  • The descriptor shall be subject to the limit of {OPEN_MAX} file descriptors available to the process.

On page 818 line 28030 section fdopendir(), change:
Based on historical implementations, the rules about file descriptors apply to directory streams as well. However, this volume of POSIX.1-202x does not mandate that the directory stream be implemented using file descriptors. The description of closedir() clarifies that if a file descriptor is used for the directory stream, it is mandatory that closedir() deallocate the file descriptor. When a file descriptor is used to implement the directory stream, opendir() behaves as if the FD_CLOEXEC flag had been set for the file descriptor.
to:
Based on historical implementations, the rules about file descriptors apply to directory streams as well. However, this volume of POSIX.1-202x does not mandate that opendir() opens a file descriptor to associate with the stream; this may instead be done by the first call to dirfd(), thus avoiding the need to allocate a file descriptor if dirfd() is never called. Once a file descriptor has been associated with the stream, it is mandatory that closedir() deallocate the file descriptor. If opendir() opens a file descriptor to associate with the stream, it behaves as if the O_CLOEXEC flag for open() had been used, so that the FD_CLOEXEC flag is set for the file descriptor.

On page 819 line 28079 section fdopendir(), change:
if the type DIR is implemented using a file descriptor
to:
if it associates a file descriptor with the returned stream.

On page 667 line 23091 section closedir(), change:
If a file descriptor is used to implement type DIR, that file descriptor shall be closed.
to:
If there is a file descriptor associated with the stream (whether opened by opendir() or dirfd(), or passed to fdopendir() when creating the stream), that file descriptor shall be closed by closedir().

There are no notes attached to this issue.




Viewing Issue Advanced Details
1359 [Issue 8 drafts] System Interfaces Comment Error 2020-06-30 15:14 2020-06-30 15:15
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
dirfd()
712
24548
dirfd() rationale out of date
Bug 0000391 made changes to the normative text for dirfd() but did not update the RATIONALE to match.
Change:
The description uses the term ``a file descriptor'' rather than ``the file descriptor''. The implication intended is that an implementation that does not use an fd for opendir() could still open() the directory to implement the dirfd() function. Such a descriptor must be closed later during a call to closedir().

If it is necessary to allocate an fd to be returned by dirfd(), it should be done at the time of a call to opendir().
to:
On an implementation where reading from a directory stream does not use a file descriptor, opendir() need not allocate one to be returned by dirfd(). The implementation can instead delay the allocation of a suitable file descriptor until the first time dirfd() is called for the stream. A file descriptor allocated by dirfd() must be closed by closedir().

There are no notes attached to this issue.




Viewing Issue Advanced Details
1358 [Issue 8 drafts] System Interfaces Objection Error 2020-06-30 14:42 2020-06-30 14:43
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
basename()
593
20765
basename/dirname difference for thread termination
The description of basename() includes the sentence "The returned pointer might also be invalidated if the calling thread is terminated", but dirname() does not. Bug 0001064 removed it from dirname(), but didn't do so for basename(). I believe this was accidental, resulting from the relevant paragraph being in different places on the two pages (RETURN VALUE v. DESCRIPTION).
Delete:
The returned pointer might also be invalidated if the calling thread is terminated.
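
Independently of the thread-termination sentence, applications that need the result to outlive later calls usually copy it at once, since basename() may modify its argument and may return a pointer to internal storage. A minimal sketch, with the hypothetical helper name basename_copy:

```c
#define _POSIX_C_SOURCE 200809L
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the final component of
 * path (caller frees), or NULL on allocation failure. basename() may
 * modify its argument and may return a pointer into static storage,
 * so it is given a scratch copy and its result is duplicated at once. */
char *basename_copy(const char *path)
{
    char *scratch = strdup(path);
    if (scratch == NULL)
        return NULL;

    char *result = strdup(basename(scratch));
    free(scratch);
    return result;
}
```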

There are no notes attached to this issue.




Viewing Issue Advanced Details
1357 [Issue 8 drafts] Base Definitions and Headers Objection Omission 2020-06-30 14:16 2020-06-30 14:59
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
<unistd.h>
429
14938
SEEK_* distinct values requirement
Bug 0000415 added SEEK_HOLE and SEEK_DATA with a requirement that they have distinct values, but they should be required to be distinct from each other and from SEEK_CUR, SEEK_END, and SEEK_SET.
Change:
with distinct values
to:
with values that are distinct from each other and from SEEK_CUR, SEEK_END, and SEEK_SET
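
The strengthened requirement can be expressed as compile-time checks, as sketched below. The SEEK_HOLE and SEEK_DATA check is conditional because those constants may need a feature-test macro to be visible on current systems; is_base_whence is a hypothetical helper that only works because the values differ.

```c
#include <unistd.h>

/* Compile-time check of the requirement that already applies to the
 * three long-standing constants. */
_Static_assert(SEEK_SET != SEEK_CUR && SEEK_SET != SEEK_END
               && SEEK_CUR != SEEK_END,
               "SEEK_SET, SEEK_CUR and SEEK_END must be distinct");

#if defined(SEEK_HOLE) && defined(SEEK_DATA)
/* Only evaluated where the implementation exposes both constants. */
_Static_assert(SEEK_HOLE != SEEK_DATA
               && SEEK_HOLE != SEEK_SET && SEEK_HOLE != SEEK_CUR
               && SEEK_HOLE != SEEK_END
               && SEEK_DATA != SEEK_SET && SEEK_DATA != SEEK_CUR
               && SEEK_DATA != SEEK_END,
               "SEEK_HOLE and SEEK_DATA must differ from all other SEEK_* values");
#endif

/* Hypothetical run-time form: returns 1 if whence names one of the three
 * base constants. A switch like this would not even compile if two of
 * the case values were equal. */
int is_base_whence(int whence)
{
    switch (whence) {
    case SEEK_SET:
    case SEEK_CUR:
    case SEEK_END:
        return 1;
    default:
        return 0;
    }
}
```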

Notes
(0004892)
shware_systems   
2020-06-30 14:59   
The 'with distinct values' I see as superfluous there, actually. No harm, just not really required for clarity. It belongs imo more with the paragraph where <unistd.h> shall declare them, as simply 'mutually exclusive values', possibly '... non-zero values'.




Viewing Issue Advanced Details
1356 [Issue 8 drafts] Base Definitions and Headers Editorial Enhancement Request 2020-06-29 12:47 2020-06-29 12:47
dannyniu
 
normal  
New  
Open  
   
DannyNiu/NJF
3.57 Character
37
1241
Our definition of character disagrees with that of Unicode.
In Draft 1, a character is being defined as:

> A sequence of one or more bytes representing
> a single graphic symbol or control code.

This definition falls apart when applied in the context of e.g. Arabic text.

The Unicode standard (version 13, page 15) says:

> The Unicode Standard draws a distinction
> between characters and glyphs. Characters are
> the abstract representations of the
> smallest components of written language
> that have semantic value.

Considering Unicode has a radically different goal than POSIX,
I propose the following new definition for consideration
(as the current wording may have been intended, even if not on purpose)

> A sequence of bytes that is considered an
> individual unit in text processing.

Explanation:

1) A sequence of bytes: remains the same as our original definition.

2) individual unit: graphic symbols in Arabic text are composed of parts that are sometimes stacked on top of each other. Older iOS that didn't take this into account caused the iPhone Arabic Glitch.

3) text processing: this can refer to terminal (and emulators) processing control sequences, and `wc -m` counting characters. Defining characters in terms of text processing lifts the burden of relying on the concept of "code point" externally defined in the Unicode standard.
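
To make the `wc -m` point concrete, here is a minimal sketch of counting characters as the current locale's multibyte encoding delimits them, rather than counting bytes. The function name count_characters() is hypothetical. The sketch assumes the caller has already selected a locale with setlocale(LC_CTYPE, ""); without that call the "C" locale applies.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count the characters in s[0..len-1] according to the LC_CTYPE category
 * of the current locale. Returns (size_t)-1 if the buffer contains an
 * invalid or truncated multibyte sequence.
 */
size_t count_characters(const char *s, size_t len)
{
    mbstate_t st;
    size_t count = 0;

    memset(&st, 0, sizeof st); /* start in the initial conversion state */
    while (len > 0) {
        size_t n = mbrlen(s, len, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1; /* invalid or incomplete sequence */
        if (n == 0)
            n = 1; /* an embedded null byte counts as one character */
        s += n;
        len -= n;
        count++;
    }
    return count;
}
```

The byte count and the character count agree only for single-byte encodings. Where the unit boundaries fall is decided entirely by the locale's text processing, which is the property the proposed definition leans on.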

Consider applying the proposed new definition.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1355 [Issue 8 drafts] Base Definitions and Headers Editorial Enhancement Request 2020-06-29 11:46 2020-06-29 11:46
dannyniu
 
normal  
New  
Open  
   
DannyNiu/NJF
2.1.5.2 XSI Option Groups
20
703
Better wording to replace "decoding algorithm"
In the Encryption Option Group, there's the text:

> Due to export restrictions on the decoding algorithm in some countries

The word "decoding" is inappropriate in that:

1. decoding is (strictly speaking) an unkeyed process (e.g. base64)

2. even if there's a key, encryption isn't necessarily the only functionality that may be restricted by the customs.

Some wording that I believe may be better include: "cipher algorithm" which specifically refers to encryption functionalities, or "cryptographic algorithm" which is completely generic.
Change "decoding" to "cipher" or "cryptographic".
There are no notes attached to this issue.




Viewing Issue Advanced Details
1354 [Issue 8 drafts] System Interfaces Editorial Error 2020-06-28 15:12 2020-06-29 09:48
dennisw
 
normal  
New  
Open  
   
Dennis Wölfing
strftime
1966
63929-63930
Bug 1313 applied incorrectly
Bug 0001313 was applied incorrectly. It seems like the Desired Action was applied instead of the accepted text for that bug. Also an opening parenthesis is missing.
On page 1966 lines 63929-63930 change
±YYYYY-MM-DD)
to
YYYYY-MM-DD, i.e. with a 5 or more digit year)
Notes
(0004890)
geoffclare   
2020-06-29 09:46   
The format string was correctly updated (and the desired action here does not change it), but the missing opening parenthesis should be added.
(0004891)
geoffclare   
2020-06-29 09:48   
Ignore previous note. I see now that it was not talking about the format string, but the omission of the text "i.e. with a 5 or more digit year".




Viewing Issue Advanced Details
1353 [Issue 8 drafts] System Interfaces Objection Error 2020-06-28 15:10 2020-06-28 15:10
dennisw
 
normal  
New  
Open  
   
Dennis Wölfing
strftime
1964-1965
63875-63884
strftime errno issues
Bug 0000169 added the requirement that strftime sets errno on failure. However there are three issues with this.
1. The RETURN VALUE section implies that strftime can only fail if the buffer is not big enough, which contradicts the ERRORS section.
2. The draft standard now requires that strftime sets errno if the result does not fit into the buffer but it does not specify which value errno is set to.
3. Even though strftime is required to set errno on failure, an application cannot reliably know whether errno contains a meaningful value, because a return value of 0 may also mean that a successful conversion resulted in an empty string.

I don't know of any POSIX conforming implementations that currently set errno on failure. However I found that the Windows implementation of strftime sets errno to ERANGE when the buffer is too small.
Note that implementations that currently do not set errno already need to change in order to conform to the draft 1 requirements.
On page 1964 after line 63875 add these two paragraphs with CX shading:
These functions shall not change the setting of errno if successful.

Since 0 is returned on error and is also a valid return on success, an application wishing to check
for error situations should set errno to 0, then call strftime( ) or strftime_l( ), then check errno.


On page 1964 lines 63877-63880 change
If the total number of resulting bytes including the terminating null byte is not more than
maxsize, these functions shall return the number of bytes placed into the array pointed to by s,
not including the terminating NUL character. Otherwise, 0 shall be returned, [CX]errno shall be set to
indicate the error,[/CX] and the contents of the array are unspecified.
to
If all conversions are successful and the total number of resulting bytes including the terminating null byte is not more than
maxsize, these functions shall return the number of bytes placed into the array pointed to by s,
not including the terminating NUL character. Otherwise, 0 shall be returned, [CX]errno shall be set to
indicate the error,[/CX] and the contents of the array are unspecified.


On page 1965 after line 63884 add with CX shading:
[ERANGE]
The total number of resulting bytes including the terminating null byte is more than maxsize.
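
The calling pattern recommended in the added paragraph (set errno to 0, call, then check errno) could look like the sketch below. The wrapper name strftime_checked() is hypothetical. The sketch assumes an implementation that follows the proposed text. On implementations that do not set errno on failure, a too-small buffer is reported as a successful empty result, and on implementations that modify errno on success, an empty result may be reported as a failure.

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical wrapper around strftime().
 * Returns 0 on success and stores the number of bytes written in *len
 * (which may legitimately be 0). Returns -1 if strftime() returned 0
 * and set errno, leaving errno set for the caller to inspect.
 */
int strftime_checked(char *buf, size_t size, const char *fmt,
                     const struct tm *tm, size_t *len)
{
    size_t n;

    errno = 0; /* a return of 0 is ambiguous, so errno is the indicator */
    n = strftime(buf, size, fmt, tm);
    if (n == 0 && errno != 0)
        return -1;
    *len = n;
    return 0;
}
```

Under the proposed [ERANGE] entry, a caller could retry with a larger buffer when the wrapper fails with errno equal to ERANGE.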
There are no notes attached to this issue.




Viewing Issue Advanced Details
1352 [Issue 8 drafts] Base Definitions and Headers Comment Error 2020-06-28 15:06 2020-06-28 15:06
dennisw
 
normal  
New  
Open  
   
Dennis Wölfing
wchar.h
445
15632-15643
Outdated text in <wchar.h>
The <wchar.h> APPLICATION USAGE and RATIONALE sections contain text about <wctype.h> functions being declared in <wchar.h> for backwards compatibility. However these obsolescent declarations have been removed in draft 1, so this text is no longer needed.
On page 445 change lines 15632-15643 to:
APPLICATION USAGE
None.
RATIONALE
None.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1351 [Issue 8 drafts] Base Definitions and Headers Objection Omission 2020-06-28 15:04 2020-06-28 15:04
dennisw
 
normal  
New  
Open  
   
Dennis Wölfing
fcntl.h
225
7834
F_DUPFD_CLOFORK missing in <fcntl.h>
Bug 0001318 added F_DUPFD_CLOFORK to the fcntl() function. However the bug was missing a change to add the definition of F_DUPFD_CLOFORK in <fcntl.h>.
On page 225 after line 7834 add:
F_DUPFD_CLOFORK
Duplicate file descriptor with the close-on-fork flag FD_CLOFORK set.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1350 [Issue 8 drafts] Base Definitions and Headers Objection Error 2020-06-26 15:57 2020-06-28 15:35
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
<stdlib.h>
351
12251
O_* constants needed in <stdlib.h> for mkostemp()
This line:
The <stdlib.h> header shall define O_CLOEXEC, O_NOCTTY, and O_RDWR as described in <fcntl.h>.
was added by bug 0000593 with XSI shading because posix_openpt() is XSI.

However, O_CLOEXEC is also needed by mkostemp() which is mandatory. Several other constants are also needed by mkostemp().
On page 351 line 12251 section <stdlib.h>, change:
[XSI]The <stdlib.h> header shall define O_CLOEXEC, O_NOCTTY, and O_RDWR as described in <fcntl.h>.[/XSI]
to:
The <stdlib.h> header shall define the following symbolic constants as described in <fcntl.h>:

O_APPEND
O_CLOEXEC
O_CLOFORK
[SIO]O_DSYNC[/SIO]
[XSI]O_NOCTTY
O_RDWR[/XSI]
[SIO]O_RSYNC[/SIO]
O_SYNC

On page 1283 line 43031 section mkdtemp(), change:
contain additional flags (from <fcntl.h>) to be used
to:
contain additional flags to be used
Notes
(0004889)
dennisw   
2020-06-28 15:35   
0000598 already reported the same issue but has not been resolved yet. As reported there, dup3() from <unistd.h> also has this issue (it needs O_CLOEXEC and O_CLOFORK).




Viewing Issue Advanced Details
1349 [Issue 8 drafts] Base Definitions and Headers Editorial Enhancement Request 2020-06-26 15:35 2020-06-26 15:37
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
1.1 Scope
4
72
Where to obtain ISO/IEC standards (footnote)
The footnote here says:
ISO/IEC documents can be obtained from the ISO office: ...
and gives a street address in Geneva.

It would make more sense to give a URL for their online store.


Change the line to:
ISO/IEC documents can be obtained from https://www.iso.org/store.html.
(with the URL as a clickable link).
There are no notes attached to this issue.




Viewing Issue Advanced Details
1348 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2020-06-25 08:54 2020-06-25 08:54
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
test
3288
110615
---
"the exec() family of functions" should not include "()"
The phrase "the exec() family of functions" does not fit the normal
convention of omitting the "()" when referring to a family of functions.
Change:
the exec() family of functions
to:
the exec family of functions
There are no notes attached to this issue.




Viewing Issue Advanced Details
1347 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2020-05-28 19:50 2020-05-29 20:26
dalias
 
normal  
New  
Open  
   
Rich Felker
musl libc
stdin
2017
64733
---
stderr access mode - "is expected to be" is not defined
The specification of the standard FILE streams includes the (CX shaded) text:

"The stderr stream is expected to be open for reading and writing."

As far as I can tell, the wording "is expected to be" is not defined anywhere in the standard, and is unclear as to which party may use the expectation and which is responsible for ensuring that it is satisfied.

In addition, it's unclear whether the intent was that the expectation apply to the underlying open file description behind fd 2, or the stderr stdio FILE stream. The latter does not seem useful since dual-mode streams are only usable on seekable files (there's no way to switch from reading to writing without a successful seek) but the use of the word "stream" in the above-quoted text suggests it should be read that way. Moreover, even if the stream is open for reading and writing, that does not automatically imply that the file descriptor is.
Clarify which party (implementation, invoking application, or something else) the obligation is on and which party may have the expectation.

Clarify what the consequences are if the expectation is not satisfied.

Clarify whether the statement applies to the FILE stream, the underlying open file description's mode, or both.
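
For reference, the access mode of the open file description (as opposed to the FILE stream) can be queried as in the sketch below. The function name fd_access_mode() is made up for illustration. The result says nothing about how the stdio layer opened stderr, which is the distinction raised above.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Return the access mode (for example O_RDONLY, O_WRONLY or O_RDWR) of
 * the open file description behind fd, or -1 if fcntl() fails.
 */
int fd_access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return flags & O_ACCMODE; /* mask off the file status flags */
}
```

Calling fd_access_mode(STDERR_FILENO) typically yields O_RDWR in a shell started on a terminal, and O_WRONLY when standard error has been redirected with 2>file. Either way it depends on the invoking environment rather than on the C library.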
Notes
(0004880)
geoffclare   
2020-05-29 10:53   
(edited on: 2020-05-29 12:16)
After a bit of digging (with help from Andrew J) it appears that this wording arose as a result of ERN 40 against XSH6 draft 1, which can be seen here:

http://www.opengroup.org/austin/docs/austin_34r1.txt

and says:
 Problem:
 How is stderr opened: input only (not likely), output only (of course)
 or both (there's the rub)?

 Action:
 My preference: "stderr is opened for input and output, although it is
 expected that under normal circumstances it will not be used for input".
 I'm OK with "stderr is opened for output.  Implementations may open it
 for input as well, but a conforming application should not expect that."

It was submitted by someone called Donn with an Interix email address, which I assume is Donn Terry. If Donn reads this perhaps he can remember what led him to raise this.

My initial guess was that it is related to the more utility reading commands from "standard error", but since more doesn't have to be implemented in C (and anyway could use read() rather than stdio) there would have been no need to raise the matter in connection with C's stderr but only for file descriptor 2 (as inherited by the login shell).

(0004881)
dalias   
2020-05-29 20:26   
FYI this issue was opened as a result of discussion on a question posted on Stack Overflow, https://stackoverflow.com/questions/62052909 where it was found that at least glibc, FreeBSD, and OpenBSD all have the stderr FILE stream in write-only mode. So it seems that an interpretation applying the text in question as a requirement on the implementation's definition of stderr would be contrary to fairly widespread existing practice.




Viewing Issue Advanced Details
1346 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Comment Enhancement Request 2020-05-26 13:44 2020-05-26 13:44
markh
 
normal  
New  
Open  
   
Mark Harris
1.7.1
8
236
---
Require support for CLOCK_MONOTONIC
CLOCK_MONOTONIC was introduced in Issue 6, but only as an optional feature. However many applications require the use of a clock that cannot have negative time jumps.

A monotonic clock is not an obscure feature, and is widely available via non-POSIX interfaces. For example, C++ applications (since C++11) can rely on std::chrono::steady_clock, Java applications (since Java 7) can rely on System.nanoTime(), Python applications (since Python 3.5) can rely on time.monotonic(), macOS and iOS applications can rely on mach_absolute_time(), and Windows applications (since Windows 2000) can rely on QueryPerformanceCounter().

Many applications targeting POSIX just assume that CLOCK_MONOTONIC is available, but others that are trying to be more portable use checks that are difficult to get right and test, and may still fail on a system conforming to the latest revision of POSIX if CLOCK_MONOTONIC works with some interfaces (like clock_gettime()) but not others (like clock_nanosleep()). The addition of new interfaces such as pthread_cond_clockwait() (#1216) will make correct checking even more complex for applications that wish to use the new interfaces, but also support the possibility allowed by the standard that they may not be usable with CLOCK_MONOTONIC. A properly functioning monotonic clock is important and applications relying on Issue 8 should not be burdened with additional checks or fallbacks; they should be able to rely on a monotonic clock being available, as C++ applications, Java applications, Python applications, and macOS, iOS, and Windows applications have been able to do for years.
For Issue 8, require support for the Monotonic Clock option, and require that CLOCK_MONOTONIC be supported by all standard interfaces that accept a clock id except clock_settime().
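
To show the kind of checking a portable application currently has to carry, here is a rough sketch covering clock_gettime() only. The function name monotonic_clock_usable() is hypothetical. A complete check would have to repeat the exercise for every other interface that takes a clock id, such as clock_nanosleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <unistd.h>

/*
 * Return 1 if clock_gettime() accepts CLOCK_MONOTONIC on this system,
 * 0 otherwise. Covers the three cases the standard allows for an option:
 * not supported, supported, or to be determined at run time.
 */
int monotonic_clock_usable(void)
{
#if !defined(CLOCK_MONOTONIC) || !defined(_POSIX_MONOTONIC_CLOCK) || \
    _POSIX_MONOTONIC_CLOCK < 0
    return 0; /* option not supported at compile time */
#else
    struct timespec ts;

#if _POSIX_MONOTONIC_CLOCK == 0
    /* Support may vary at run time, so ask sysconf(). */
    if (sysconf(_SC_MONOTONIC_CLOCK) <= 0)
        return 0;
#endif
    return clock_gettime(CLOCK_MONOTONIC, &ts) == 0;
#endif
}
```

If the Monotonic Clock option were mandatory, all of the preprocessor logic above would reduce to a plain clock_gettime(CLOCK_MONOTONIC, &ts) call.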
There are no notes attached to this issue.




Viewing Issue Advanced Details
1345 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2020-05-24 22:48 2020-07-02 11:02
j_willie
 
normal  
New  
Open  
   
J William Piggott
date
163, 2634-2638
5224-5228, 85673-85823
---
date(1) default format
PREFACE
For simplicity I will state XPG3 positions as fact, although I do not actually
have access to the XPG3 standard. The XPG3 references are not imperative to
addressing this issue, they merely give plausible explanation for the current
situation. Anyone with access to the XPG3 standard is welcome to confirm or
deny. Finally, I'm new here and will likely do something wrong; I plead for
patience and guidance, please.

Terms:
 default format: when calling date(1) without arguments.

There are two closely related issues for the date(1) default format:

a) the default format is ambiguous:
   Stated: shall be the locale specific 'date and time'.
   Not Stated: what locale element represents it.

   Line 85674:
    By default, the current date and time shall be written.
   Line 85805:
    ... the [default format] output in the POSIX locale ...
   Lines 85864-85892
    EXAMPLES [also illustrate locale specific default formats]

b) the STDOUT POSIX locale format string is unreachable
   Line 85807 date "+%a %b %e %H:%M:%S %Z %Y"

PLAUSIBLE EXPLANATION
_______________________________________
XPG3

XPG3 specified the date(1) default output format as %C [uppercase 'C'].
     It was bound to the date_fmt locale element:

date(1)
%C locale’s date and time representation as produced by date(1)

STDOUT
 When no formatting operand is specified, the output in the POSIX locale shall
 be equivalent to specifying: date "+%a %b %e %H:%M:%S %Z %Y"

POSIX Locale
# date and time representation as produced by date(1) (%C)
# "%a %b %e %H:%M:%S %Z %Y"
date_fmt " ...

NOTE: the format in date(1) STDOUT was reachable via LC_TIME date_fmt; %C
_______________
XPG4

In XPG4 %C was redefined to Century number and date_fmt was dropped from the
POSIX locale. However, the change seemed to overlook establishing a new date(1)
default format specifier and did not fix the dangling reference to the deleted
date_fmt in the date(1) STDOUT section.
_______________________________________
IMPACT
The required default date(1) POSIX locale format being unavailable has led to
implementation creativity. Glibc and coreutils have partially reimplemented the
XPG3 date_fmt as the default date(1) format without binding it to a format
specifier. There have been several posts to the glibc mailing list asking what
are the expected values for d_t_fmt and date_fmt. Nobody knows. I think the
reason date_fmt was reintroduced is to resolve the unreachable format string
the standard requires in the date(1) STDOUT section.
_______________________________________
SOLUTION

a) The de facto date(1) default format must be %c (LC_TIME d_t_fmt).
   It is the only locale keyword that satisfies the required 'date and time'.
   The several date(1) EXAMPLES seem to support this position.

b) date(1) STDOUT should not be restating the POSIX locale. It should
   simply state %c, and let the POSIX locale speak for itself.

   That leaves the conflict between the current format string listed in STDOUT
   and the value for the POSIX locale d_t_fmt. I propose, that because common
   date(1) implementations include %Z in the default output, (as the current
   standard requires); it should be added to the POSIX locale's LC_TIME d_t_fmt
   value.
XBD Page 163
Section 7.3.5.3 LC_TIME Category in the POSIX Locale
Line 5224 replace with:
# "%a %b %e %H:%M:%S %Z %Y"

Line 5228 replace with:
<space><percent-sign><Z><space><percent-sign><Y>"

XCU Page 2634
Section date DESCRIPTION
Lines 85673-85676 replace with:
The date utility shall write the date and time to standard output <XSI>or
attempt to set the system date and time.</XSI> By default, the current date and
time shall be written in the locale dependent %c format. If an operand
beginning with '+' is specified, the output format of date shall be controlled
by the conversion specifications and other text in the operand.

XCU Page 2637
Section date STDOUT
Lines 85805-85807 replace with:
The default standard output shall be in the locale dependent %c format (see
[xref to XSH 3 strftime()] and [xref to XBD 7.3.5.3 LC_TIME Category in the
POSIX Locale]).

XCU Page 2638
Section date APPLICATION USAGE
Lines 85821-85823 replace with:
The default standard output may be difficult to parse, as it is locale
specific. It may contain <newline> characters and be vastly different from what
is defined for the POSIX locale.
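
The gap between the two formats can be seen with strftime() directly. The sketch below is illustrative only: the macro and function names are made up, and it compares the format string currently listed in date STDOUT against %c as the current locale defines it.

```c
#include <stddef.h>
#include <time.h>

/* Format string currently listed for the POSIX locale in date STDOUT. */
#define POSIX_DATE_DEFAULT_FMT "%a %b %e %H:%M:%S %Z %Y"

/* Format tm the way date is currently required to in the POSIX locale. */
size_t format_date_posix_default(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, POSIX_DATE_DEFAULT_FMT, tm);
}

/* Format tm with the locale's d_t_fmt, i.e. the %c conversion. */
size_t format_date_locale_c(char *buf, size_t size, const struct tm *tm)
{
    return strftime(buf, size, "%c", tm);
}
```

In the POSIX locale as currently specified, the two results differ only by the time zone field that %Z contributes, which is the discrepancy the proposed change to d_t_fmt is meant to remove.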

Notes
(0004886)
shware_systems   
2020-06-08 14:49   
(edited on: 2020-06-08 16:06)
On systems that do not support the POSIX2_LOCALEDEF option, it is platform specific what the default formats are for any locales in addition to POSIX the platform makes available. As unspecified platform elections not modifiable by a user there would be no documentation requirement either. I suspect date_fmt was rejected because it constrained how a date utility for such a fixed set of locales might be written. Some implementations might just compile a table of format strings directly into the utility, for example, and not make any references to data setlocale() makes available to a process.

Keeping this flexibility when POSIX2_LOCALEDEF is supported means a platform has to document a) whether date only supports locales provided by the platform, and if so which format is used when a locale created by localedef is referenced; and if not b) the implementation-defined means of mapping a locale name that represents the output of localedef to the platform specific means of providing additional format strings. A platform may elect to extend localedef to accomplish this; others may leverage catgets() or use a simple text file format.

Because of these variations I think the Desired Action is too limiting. Adding an LD sensitive documentation requirement is more backwards compatible, I feel.

(0004893)
geoffclare   
2020-07-02 10:57   
(edited on: 2020-07-02 12:02)
The POSIX locale default date utility format:
%a %b %e %H:%M:%S %Z %Y
contains two pieces of information beyond the minimum of date and time required for other locales: the day name (%a) and the time zone (%Z). Most implementations include these in the default date output for their implementation-provided locales.

Since many users would expect to see day name and time zone information in the default date output (particularly if they are used to the traditional behaviour that was standardised for the POSIX locale), as a minimum we should add something to APPLICATION USAGE about this. We should also consider recommending that implementations include them (either via "should" in normative text or a statement in RATIONALE). Here is a suggested set of changes that does the latter...

On page 2638 line 85823 section date, add a paragraph to APPLICATION USAGE:
Since the default date utility format for locales other than the POSIX or C locale is not required to include anything beyond the date and time, whereas for the POSIX or C locale it also includes the day name and time zone, it may be necessary to specify a format (or override the locale-selection environment variables) to ensure this information is included when desired.

On page 2640 line 85914 section date, add these paragraphs to RATIONALE:
Although this standard only requires the default date utility format, for locales other than the POSIX or C locale, to include the date and time, it is common for implementations to include day name and time zone information as well. (For the POSIX locale this is required, with the day name in %a format at the beginning and the time zone in %Z format before the year.) Implementations are encouraged to include the day name (in %a or %A format) and the time zone (in %Z or %z format) in the default date utility format for all of the locales they provide.

Some implementations have a date_fmt locale keyword (see [xref to XBD 7.3.5]) as an extension, to specify the default date utility format for each locale. On such implementations, if the localedef utility is used to create a locale that does not have this information, the date utility must by default still produce output for that locale that includes both the time and the date.


(0004894)
geoffclare   
2020-07-02 11:02   
For the record, most of the historical information given in the Description of this bug is factually incorrect. See the mailing list for details.




Viewing Issue Advanced Details
1344 [1003.1(2008)/Issue 7] System Interfaces Editorial Enhancement Request 2020-05-20 10:34 2020-05-21 13:09
mkerrisk
ajosey  
normal  
Under Review  
Open  
   
Michael Kerrisk
man7.org
XSH
n/a
n/a
---
Addition of setresuid()/setresgid()/getresuid()/getresgid()
setresuid()/setresgid()/getresuid()/getresgid() are implemented on a number of systems including at least Linux, FreeBSD, OpenBSD, and HP-UX. (Notably, they are not present on Solaris, so far as I know.)

Adding these interfaces to POSIX would be valuable for a number of reasons:

* The semantics of the existing APIs for modifying credentials are
problematic. The semantics of setuid()/setgid() depend on whether the
process is privileged, so that the API either changes just the
effective ID, or all of real/effective and saved set IDs. The only
POSIX-specified way to change saved set IDs is to use
setreuid()/setregid(). But those APIs depend on a bizarre rule to
determine whether or not the saved set ID is modified. These sorts of
funny behaviors are invitations for programmers to make mistakes, and
in this case such mistakes have obvious security implications.

* By contrast with the former point, the semantics of the changes made
by setresuid() and setresgid() are simple and transparent: one
argument per credential, with "-1" being used to signify "no change".
No semantics that vary according to whether the process is privileged
and no funny rules.

* getresuid()/getresgid() provide the only means of explicitly
retrieving the saved set-UID/GID.
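
As a small illustration of the last point, the sketch below reads all three user IDs in one call. The struct uid_triple type and the read_uid_triple() name are hypothetical, and the _GNU_SOURCE definition is an assumption about glibc, where these functions are currently exposed as extensions.

```c
#define _GNU_SOURCE /* assumption: glibc declares getresuid() as an extension */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the three user IDs of a process. */
struct uid_triple {
    uid_t real;
    uid_t effective;
    uid_t saved;
};

/* Fill *out with the caller's user IDs. Returns 0 on success, -1 on error. */
int read_uid_triple(struct uid_triple *out)
{
    return getresuid(&out->real, &out->effective, &out->saved);
}
```

The real and effective fields duplicate what getuid() and geteuid() report. The saved field has no counterpart among the currently standardized interfaces.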
1. Add specifications of setresuid()/setresgid()/getresuid()/getresgid()
2. Add SEE ALSO entries in relevant other pages (getuid(), geteuid(), setuid(), setreuid(), getgid(), getegid(), setgid(), setregid(), <unistd.h>)
3. Add prototypes to <unistd.h>
4. Add to "XSI_USER_GROUPS" in "E.1 Subprofiling Option Groups"(?)

I will attempt 1; presumably 2, 3, 4 can be written up as boilerplate editing directions (which I can attempt, but may need some assistance).
Notes
(0004879)
mkerrisk   
2020-05-21 13:09   
On page 448 (<unistd.h> Declarations), after line 15419, insert

int getresgid(gid_t *rgid, gid_t *egid, gid_t *sgid);
int getresuid(uid_t *ruid, uid_t *euid, uid_t *suid);


On page 448 (<unistd.h> Declarations), after line 15443, insert

int setresgid(gid_t rgid, gid_t egid, gid_t sgid);
int setresuid(uid_t ruid, uid_t euid, uid_t suid);


On page 451 (<unistd.h> SEE ALSO) at lines 15579-15581, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1028 (getegid() SEE ALSO) at line 35033, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1032 (geteuid() SEE ALSO) at line 35171, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1033 (getgid() SEE ALSO) at line 35210, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1104 (getuid() SEE ALSO) at line 37410, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1890 (setegid() SEE ALSO) at line 61214, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1893 (seteuid() SEE ALSO) at line 61308, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1894 (setgid() SEE ALSO) at line 61345, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1917 (setregid() SEE ALSO) at line 61847, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1919 (setreuid() SEE ALSO) at line 61908, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()

On page 1929 (setuid() SEE ALSO) at line 62155, insert the following
entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()


(Depending on whether these APIs should be part of XSI_USER_GROUPS)
On page 3794 (Subprofiling Option Groups) at lines 130234-130234,
insert the following entries into the list in sorted order:

getresgid(), getresuid(), setresgid(), setresuid()


At page 1086, insert the specifications for getresuid() and getresgid():

NAME

getresgid - get real group ID, effective group ID, and saved set-group-ID

SYNOPSIS
#include <unistd.h>

int getresgid(gid_t *rgid, gid_t *egid, gid_t *sgid);

DESCRIPTION

The getresgid() function shall return the calling process's real
group ID, effective group ID, and saved set-group-ID, storing them
in the locations pointed to by, respectively, the arguments rgid,
egid, and sgid.

RETURN VALUE

Upon successful completion, 0 shall be returned. Otherwise, -1
shall be returned and errno set to indicate the error.

ERRORS

No errors are defined.

SEE ALSO

getegid(), geteuid(), getgid(), getresuid(), getuid(), setegid(),
setgid(), seteuid(), setregid(), setresgid(), setresuid(),
setreuid(), setuid()

XBD <sys/types.h>, <unistd.h>


NAME

getresuid - get real user ID, effective user ID, and saved set-user-ID

SYNOPSIS
#include <unistd.h>

int getresuid(uid_t *ruid, uid_t *euid, uid_t *suid);

DESCRIPTION

The getresuid() function shall return the calling process's real
user ID, effective user ID, and saved set-user-ID, storing them in
the locations pointed to by, respectively, the arguments ruid,
euid, and suid.

RETURN VALUE

Upon successful completion, 0 shall be returned. Otherwise, -1
shall be returned and errno set to indicate the error.

ERRORS

No errors are defined.

SEE ALSO

getegid(), geteuid(), getgid(), getresgid(), getuid(), setegid(),
setgid(), seteuid(), setregid(), setresgid(), setresuid(),
setreuid(), setuid()

XBD <sys/types.h>, <unistd.h>



At page 1918, insert the specifications for setresuid() and setresgid():

NAME

setresgid - set real group ID, effective group ID, and saved set-group-ID

SYNOPSIS
#include <unistd.h>

int setresgid(gid_t rgid, gid_t egid, gid_t sgid);

DESCRIPTION

The setresgid() function shall change the calling process's real
group ID, effective group ID, and saved set-group-ID, respectively,
to the values specified by rgid, egid, and sgid.

If one of the arguments is -1, the corresponding group ID shall
not be changed.

Only a process with appropriate privileges can set the real group
ID, effective group ID, and saved set-group-ID to any valid value.


A non-privileged process may set its real group ID, effective
group ID, and saved set-group-ID, each to one of the values that
it currently holds in its real group ID, effective group ID, or
saved set-group-ID.

The real group ID, effective group ID, and saved set-group-ID may
be set to different values in the same call.

RETURN VALUE

Upon successful completion, 0 shall be returned. Otherwise, -1
shall be returned and errno set to indicate the error, and none of
the group IDs shall be changed.

ERRORS

The setresgid() function shall fail if:

[EINVAL]

The value of the rgid, egid, or sgid argument is invalid or out-of-range.

[EPERM]

The calling process does not have appropriate privileges,
and an attempt was made to change the real group ID,
effective group ID, or saved set-group-ID to a value that
is not currently present in one of those IDs.

SEE ALSO

getegid(), geteuid(), getgid(), getresgid(), getresuid(),
getuid(), setegid(), seteuid(), setgid(), setregid(), setresuid(),
setreuid(), setuid()

XBD <sys/types.h>, <unistd.h>

NAME

setresuid - set real user ID, effective user ID, and saved set-user-ID

SYNOPSIS
#include <unistd.h>

int setresuid(uid_t ruid, uid_t euid, uid_t suid);

DESCRIPTION

The setresuid() function shall change the calling process's real
user ID, effective user ID, and saved set-user-ID, respectively,
to the values specified by ruid, euid, and suid.

If one of the arguments is -1, the corresponding user ID shall not
be changed.

Only a process with appropriate privileges can set the real user
ID, effective user ID, and saved set-user-ID to any valid value.

A non-privileged process may set its real user ID, effective user
ID, and saved set-user-ID, each to one of the values that it
currently holds in its real user ID, effective user ID, or saved
set-user-ID.

The real user ID, effective user ID, and saved set-user-ID may be
set to different values in the same call.

RETURN VALUE

Upon successful completion, 0 shall be returned. Otherwise, -1
shall be returned and errno set to indicate the error, and none of
the user IDs shall be changed.

ERRORS

The setresuid() function shall fail if:

[EINVAL]

The value of the ruid, euid, or suid argument is invalid or out-of-range.

[EPERM]

The calling process does not have appropriate privileges,
and an attempt was made to change the real user ID,
effective user ID, or saved set-user-ID to a value that is
not currently present in one of those IDs.

SEE ALSO

getegid(), geteuid(), getgid(), getresgid(), getresuid(),
getuid(), setegid(), seteuid(), setgid(), setregid(), setresgid(),
setreuid(), setuid()

XBD <sys/types.h>, <unistd.h>
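
A typical use of the interfaces specified above is dropping privileges permanently, sketched here under the assumption that the proposed semantics match existing Linux and BSD behaviour. The function name drop_to_real_uid() is made up for this example, and _GNU_SOURCE is again an assumption about glibc.

```c
#define _GNU_SOURCE /* assumption: glibc declares these as extensions */
#include <sys/types.h>
#include <unistd.h>

/*
 * Set the effective and saved set-user-ID to the real user ID, leaving
 * the real user ID untouched by passing -1. Returns 0 if all three IDs
 * match afterwards, -1 otherwise.
 */
int drop_to_real_uid(void)
{
    uid_t ruid = getuid();
    uid_t r, e, s;

    if (setresuid((uid_t)-1, ruid, ruid) == -1)
        return -1;
    /* Read the IDs back instead of trusting the return value alone. */
    if (getresuid(&r, &e, &s) == -1)
        return -1;
    return (r == ruid && e == ruid && s == ruid) ? 0 : -1;
}
```

An unprivileged process is allowed to make this call because both new values are already held in one of its three IDs. The same effect with setuid() or setreuid() depends on whether the process is privileged, which is the problem described in the bug description.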




Viewing Issue Advanced Details
1343 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Clarification Requested 2020-05-18 10:53 2020-05-18 10:53
joerg
 
normal  
New  
Open  
   
Jörg Schilling
sh
3229
108436
---
The sh environment variable overview is missing OLDPWD
The environment variable OLDPWD is only mentioned with the cd builtin, but missing in the environment variable overview.
Before line 108436 insert:

OLDPWD
    A pathname of the previous working directory, used by cd -.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1342 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2020-05-05 08:50 2020-05-05 17:43
kre
 
normal  
New  
Open  
   
Robert Elz
XCU 2.2.3
2346
74721
---
Aliases in command substitutions are handled differently when the command subst is quoted vs not quoted
A 1997 interpretation resulted in the addition of the words:

   not including the alias substitutions in Section 2.3.1,

to the rules for processing a command substitution embedded within double
quotes (XCU 2.2.3).

No similar change was made to section 2.3 (bullet point 5) which handles
the general rules for token recognition when a command substitution is
detected, unquoted.

This means that different rules apply to recognising a command substitution
that happens in a double-quoted string than one that appears unquoted.

This is neither rational, nor what shells implement.

Apart from the unfortunate difference made to the two cases, the 1997
interpretation was almost certainly correct for its time - the relevant shells
of the mid 1990's did not expand aliases while searching for the terminating ')'
of a $() command substitution (backquoted command substitutions are much
simpler and have none of these issues). Shells of the time didn't parse
the command substitution at all while searching for the ')' - they relied
upon '(' and ')' counting.

In the intervening 20-something years that technique has been shown to be
fatally flawed, not only do (some - unlikely but possible) aliases potentially
break this, but case statements (not using the optional '(' preceding the
pattern) and here documents (which can contain almost anything - but can
only be recognised as being a here document by parsing the accompanying
command) break this simple scheme - shells must fully parse a $() command
substitution to have any hope of correctly finding the terminating ')' in
all cases.

Note that whether the shell retains the results of this parse, or simply
copies the command substitution string literally, saving it to be parsed
again later, is immaterial to this issue - the correct ')' must be located
to properly terminate the command substitution, whatever is done with the
text of the command substitution, or the results of parsing it, during that
process.
In line 74721, on page 2346 of XCU 2.2.3 (2016 edition), delete the words:

    not including the alias substitutions in Section 2.3.1,

This is needed regardless of any other changes that might be made, as even
if it is (was) the correct behaviour, it was applied in the wrong place.

That is all that is needed to handle this issue.

However, as an option, in XCU 2.3, bullet point 5, on page 2348, after
the sentence (lines 74774-6):

    While processing the characters, if instances of expansions or quoting are
    found nested within the substitution, the shell shall recursively process
    them in the manner specified for the construct that is found.

add a new sentence:

    It is unspecified whether aliases [xref XCU 2.3.1] shall be expanded
    while performing this process.

It needs to be unspecified, rather than "shall not process", as most shells
do process aliases at this point, just as they do when the command
substitution is eventually parsed and executed; if different rules apply,
strange, hard-to-fathom errors can occur.

However, not all shells do, so making it unspecified might be the wise choice
for now - even if such shells tend to be broken in this regard. Since alias
processing when the command substitution was in a double quoted string was
(or is) prohibited by the standard, suddenly making it mandated when there are,
or might be, some shells which actually implemented this prohibition would
also seem harsh.

If this option is adopted, two further changes should be made.

In the application notes (wherever they are for this) there should be an
admonition

     Applications shall not include aliases which contain unbalanced
     syntax components in any $() command substitution.

What that means can be expanded if deemed appropriate. The idea is that
trivial aliases "alias l=ls" (etc) are harmless, and not a problem, it is
only ones like "alias switch=case", "alias subshell='('", or
"alias forever='while true; do'" that cause problems. The "unbalanced"
is because an alias like "alias set13='(exit 13)'" is not a problem,
despite having parentheses in its value.

And second, in the Future Directions (wherever that is for sh) it should
say

     A future version of this standard may require alias expansion
     while scanning for the terminating ')' in a $() command substitution.

Last, some kind of explanation of all that happened here should be added
in the rationale (somewhere in XRAT I assume).


An alternative option open to the group who decides these things would be
to simply delete aliases from the standard completely. Since I doubt this
one will happen, I won't bother providing the changes needed to implement it.
Notes
(0004856)
geoffclare   
2020-05-05 10:57   
We also need to do something about the statement for $() command substitution that "Any valid shell script can be used for command". This is not true because a shell script which defines an alias which changes the syntax, and then uses it, might not be parsed correctly without the alias command being executed.
For example a script containing:
alias subshell='('
subshell echo foo )
subshell echo bar )
works fine (note you need POSIXLY_CORRECT set for bash), but:
echo $(
alias subshell='('
subshell echo foo )
subshell echo bar )
)
does not.
(0004859)
kre   
2020-05-05 15:36   
Re Note: 0004856

I suspect that's already handled by the (new, I forget which bug it is)
wording that specifies when alias commands take effect.

This is just the same issue as prevents

if [ $SHELL = myshell ]; then
    alias xx=mycommand
    xx foo bar
fi

from working. Or from defining, and then using, an alias in a function.

Aliases are evil - let's delete them!!!
(0004860)
geoffclare   
2020-05-05 15:52   
Re Note: 0004859

Bug 0000953 does indeed change that text, but the new text still has the same problem. It now says:
It is unspecified whether command is parsed and executed as a program (as for a shell script) or is parsed as a single compound_list that is executed after the entire command has been parsed. With the $(command) form any valid program can be used for command, except a program consisting solely of redirections which produces unspecified results.
(0004863)
kre   
2020-05-05 17:43   
OK, we can add text that makes it clear (just in case it wasn't
already obvious) that the entire command in a command substitution is
parsed before any of it is executed. And given that, alias commands
in the command substitution cannot work - the command substitution is
a subshell environment, so the parent shell cannot be affected, and
any aliases defined there get defined too late to be used within the
command substitution.

Aliases in a sub-shell block '(' code ')' have the same uselessness.

Aliases in general are largely useless, let's delete them!!!




Viewing Issue Advanced Details
1340 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Clarification Requested 2020-05-04 16:28 2020-06-17 14:02
shware_systems
 
normal  
Applied  
Accepted As Marked  
   
Mark Ziegast
SHware Systems Dev.
XBD 8.3
178
5854-7
Approved
Note: 0004871
PATH specification has an ambiguity.
PATH currently has:
If the pathname being sought contains a <slash>, the search through the path prefixes shall not be performed. If the pathname begins with a <slash>, the specified path is resolved (see Section 4.13, on page 111).

In discussing bug 1208 it was noted the first sentence is in error, as all pathnames being checked will have a <slash> between prefix and filename, and a prerequisite for PATH being referenced is that a filename string has no <slash> characters. Therefore this sentence is superfluous.

The second sentence does not include the case where a pathname does not begin with a <slash> and is a relative reference. This leaves ambiguous what root directory as an additional prefix that begins with a <slash> is presumed: the current directory '.', the root directory '/', or a user's home directory, ala '~', as example choices. Actual practice is these are also resolved in accordance with XBD 4.13, historically.
Replace the two sentences with:

Each pathname to be sought shall be resolved in accordance with <xref>XBD 4.13 Pathname Resolution</>, on page 111.
Notes
(0004846)
kre   
2020-05-04 19:31   
While I would not object to the definition of PATH being rewritten in
a slightly more logical form, nothing in the Description here makes
any sense.

I have no idea what discussions related to 0001208 happened to be, or
why an issue relating to chdir in posix_spawn should care in the least
about PATH, but the "pathname being sought" is typically "ls" or "cat" or "sh"
none of which contains a '/' - so the first sentence is not "in error",
there is no prefix, just the command name (you do understand how PATH works,
right?)

When the pathname begins with a slash there is nothing to do, no search is
made (that's for when someone says /bin/sh) - when there is some other slash
in the pathname, resolution happens the same as for any other filename
reference - since there is no leading '/' (we know that, that case was
already handled) the path lookup starts at the current working directory.

There is no ambiguity here, and while, once again, the paragraph could be
more clearly written (and should xref XCU 2.9.1.1) it doesn't contain any
actual errors.
(0004847)
kre   
2020-05-04 19:33   
Actually, are you perhaps under the impression that the paragraph
in question is about pathnames in general ?

It is about the environment variable PATH - a very specific thing.
(0004848)
shware_systems   
2020-05-04 20:05   
The Desired Action has already been discussed and reached consensus, maybe with a couple wording tweaks. This report is simply the formal record the change is considered required, for the reasons given in the Description.

This came up because sh does do PATH searches and may use *spawn() to exec utilities. The new Final Text for Bug 1208 is related mostly to how PATH is applicable.
(0004849)
kre   
2020-05-05 03:00   
Then the resolution is wrong, and wasn't properly considered.

You could *add* a sentence like that which is proposed, perhaps even
replacing the final sentence, but the earlier text (while it could be
improved) is not incorrect and should not be deleted.
(0004850)
shware_systems   
2020-05-05 03:16   
In that context filename refers to the desired utility, pathname is prefix (possibly with appended slash) concatenated with filename. It was argued filename might replace pathname and keep the basic statement, but that just made it obvious it was redundant so is considered superfluous.
(0004851)
kre   
2020-05-05 04:39   
(edited on: 2020-05-05 05:13)
All filenames are pathnames. But yes, I agree the switch in terminology
is one of the things I would fix if rewording this is considered worthwhile.

But it is clear that the "pathname" in the later sections is intended to
mean the same thing as the "filename" in the initial part.

The text is redundant (to some extent) if and only if an xref to XCU 2.9.1.1
is added to replace it... but I am not sure that that section (which explains
how sh uses PATH for lookups) necessarily applies precisely to all other uses,
as it includes references to built in utilities, functions, etc, so I think
it would be better to retain the XBD definition for those other purposes,
and simply include an xref to add clarity - perhaps something like

    See [xref XCU 2.9.1.1] for a description of how sh [xref XCU 2] handles
    PATH searches.

(wordsmithed as appropriate).

ps: Not relevant directly here, but to respond to a comment made in Note: 0004848
it is hard to see how a sh could ever use posix_spawn as Note: 0004830 implies
while retaining the view that some people have of how and when builtin
utilities should be executed. It would be possible if the shell simply
looks for a builtin before starting the PATH search, but if it needs to
combine the PATH search with builtin recognition, if it was relying upon
posix_spawn to do the PATH search, I fail to see how a builtin would ever
be able to be executed - the only way would be to do the PATH search twice,
and I don't think that works either (the only truly reliable way to test
whether a PATH element + filename combination works (from userland)
is to try to exec it, and there's no returning from that if it does work).

It also isn't clear how a posix_spawn PATH scanning implementation would ever
maintain the hash table properly (unless of course, in both cases, the PATH
resolution is done as a separate step, and posix_spawn is only ever called
(from sh) with a fully qualified pathname (ie: just to replace the fork/exec
sequence, and fd/dir manipulations that fit between those) but in that case,
there would be no motivation for fiddling the definition of PATH based upon
what posix_spawn sees, as it would only ever hit the "contains a slash" and
"begins with a slash" sections (ie: PATH is irrelevant, and what's more,
Note: 0004830 is incorrect, see: Note: 0004830 is incorrect">0001341) - so if you were to
believe that this definition is solely for the purposes of posix_spawn then
I guess the change would make sense. But it isn't, it is also used by
popen() system() find env ...

[Aside: mantis seems to have issues formatting this note correctly, I have
tried several different ways to write the text, yet it always seems to mangle
that final "see Note: 0004830 is incorrect">0001341" reference. Oh, I see the trigger now, it is
because the referenced bug (1341) contains bugid and bugnote references in its
Summary line; is that not supported - it works in the issue itself?]

(0004854)
geoffclare   
2020-05-05 08:40   
The description in this bug is reasonably close to what was discussed in the teleconference. However, the desired action is most definitely not what we thought Mark was going to put in this bug.

The statement "If the pathname being sought contains a <slash>, the search through the path prefixes shall not be performed" is both erroneous and redundant. It is erroneous because the thing that is sought is a filename, not a pathname, and therefore by definition cannot contain a slash. It is redundant because all of the places in the standard that require a PATH search to be done already state that no PATH search is done if there is a slash in the string that would be the subject of the search. I.e the description of PATH never needs to be read in the context of something with a slash being sought.

The statement "If the pathname begins with a <slash>, the specified path is resolved" is problematic because it only requires pathname resolution to be done for absolute pathnames, not relative ones. The meeting agreed that this needs to be changed in some way so that it applies to relative pathnames as well.

Having thought about the issue a little more, I believe just changing that sentence is not the right solution to the second problem. The time when pathname resolution is done is during the search, not after the pathname has been found. (I.e. it can only be "found" by performing pathname resolution on each constructed pathname.)

I will add another note with a new proposed resolution.
(0004855)
geoffclare   
2020-05-05 08:46   
(edited on: 2020-05-05 08:47)
New proposed resolution...

On line 5851 change:
The list shall be searched from beginning to end, applying the filename to each prefix, until an executable file with the specified name and appropriate execution permissions is found. If the pathname being sought contains a <slash>, the search through the path prefixes shall not be performed. If the pathname begins with a <slash>, the specified path is resolved (see Section 4.13, on page 111).
to:
The list shall be searched from beginning to end, applying the filename to each prefix and attempting to resolve the resulting pathname (see Section 4.13, on page 111), until an executable file with appropriate execution permissions is found.


(0004861)
shware_systems   
2020-05-05 16:19   
I've no objection to the new wording; it was my understanding all the text preceding the two sentences quoted was considered adequate so I left it alone for original proposal.
(0004862)
kre   
2020-05-05 16:21   
Re Note: 0004854

To take the first couple of issues in reverse order, as it makes
more sense (to me) this way...

    It is redundant because all of the places in the standard that require
    a PATH search to be done already state that no PATH search is done if
    there is a slash in the string that would be the subject of the search.

To me that has produced exactly the inverse to the correct response. If
everything that uses PATH needs to specify that it is only used when there
is no slash in the name, then let's simplify all of those, and codify the rules
for PATH processing in a single place, so everyone gets a *known* consistent
interpretation and we don't have people quibbling about why the wording for
the use in context X is subtly different from the wording for context Y, and
deducing from that, that the behaviour is supposed to be different. And
if we're certain that the identical wording is used in every case, then that's
an even better reason for consolidating it and removing the duplication.

Further, this is the safe way to make the change - if we have missed one of
the uses (and so don't delete its redundant wording) then we're left with
some redundant text, but that's generally harmless. On the other hand if
we have missed a use which turns out not to specify that no path search is
done when the pathname being sought contains a slash, and we delete the
text in the definition of PATH which deals with that, then we've just caused
the standard to be broken. Since the standard is very large, and I'm sure
the word PATH appears in it very many times, I'd hate to guarantee (as in,
my life depends upon it) that we know every place where a path search is
specified, and if we are not 100% certain that we have found every single one,
then neither can we be certain that every single one has the "no slash"
qualification.

And last, it means that someone who simply wants to know what this environment
variable PATH is all about, and (reasonably) turns to XBD 8.3 to get the
answer, will actually discover the whole picture, and not end up missing
the crucial element that a search for a/b only looks for "a/b" and doesn't
try PATH[1]/a/b PATH[2]/a/b (etc) (using invented notation for the ':'
delimited sub-strings of PATH).

Next:
    It is erroneous because the thing that is sought is a filename,
    not a pathname, and therefore by definition cannot contain a slash.

This is part of the "poor wording" I remarked on in earlier notes. It isn't
really as bad (and isn't actually incorrect) it is just hard to read
correctly easily (and currently depends upon the reader correctly
understanding the difference between pathname and filename). What we
do about it depends upon the resolution of the previous point. If we
agree to consolidate the definition of how PATH works (all of it) in XBD 8.3,
which is what I'd suggest should be done, then the thing that is sought is
a pathname, not a filename (as they're defined in XBD 3.170 and 3.271), and
the correct response to this problem is to fix that, and use "filename" only
after we have excepted the case where there's a slash in the pathname - and
I'd make the wording be clear about the change of terminology, and why
something perhaps like:

PATH This variable shall represent the sequence of path prefixes that
       certain functions and utilities apply in searching for an executable
       file. The prefixes shall be separated by a <colon> (':'). If the
       pathname being sought contains no slash ('/') characters, and hence
       is a filename, the list of prefixes shall be searched from beginning
       to end, applying the filename to each prefix, until an executable file
       with the specified name and appropriate execution permissions is found.
       When a non-zero-length prefix is applied to the filename, a <slash>
       shall be inserted between the prefix and the filename if the prefix
       did not end in <slash>. A zero-length prefix is a legacy feature that
       indicates the current working directory. It appears as two adjacent
       <colon> characters ("::"), as an initial <colon> preceding the rest
       of the list, or as a trailing <colon> following the rest of the list.
       A strictly conforming application shall use an actual pathname
       (such as .) to represent the current working directory in PATH.
       If the pathname being sought contains a <slash>, and so is not a
       filename, the search through the path prefixes shall not be performed.

and then continue as it is now, as amended by the fix for ...

   The statement "If the pathname begins with a <slash>, the specified
   path is resolved" is problematic because it only requires pathname
   resolution to be done for absolute pathnames, not relative ones.

More poor wording, and I have no problem with fixing that.

However, I don't think we should discard mention of what happens with a
PATH search when the pathname being sought contains a '/' and rely upon
the reader just understanding that since it says "filename" that must be
true, while being left in the dark (until they find a use of PATH search)
about what happens if there is a '/'. And then wonder why it is divided
up that way, and if perhaps different applications of PATH searching might
have different rules for what to do in that case.

That is, I don't believe the proposed resolution in Note: 0004855 is good enough.
(0004864)
geoffclare   
2020-05-06 08:18   
Re Note: 0004862

Seems I didn't look hard enough for the places that require a PATH search. I have now found a place that doesn't say PATH is not searched if there is a slash in the search string: find -exec (and -ok), where the requirement that a PATH search is done is only made via the PATH entry in ENVIRONMENT VARIABLES - it's not in the description of -exec. There is also a slight problem with fc -e and xargs because they both state that the utility to be invoked is specified by its name, implying that portable applications can't use fc -e or xargs to invoke a utility by means of a pathname that includes a slash. Once those are fixed to allow a pathname, they would also need to specify the no-slash rule for PATH searching, unless it is covered by the description of PATH.

So I am now in agreement that the description of PATH should handle this.

Your suggestion of "If the pathname being sought contains no slash ('/') characters, and hence is a filename, the list of prefixes shall be searched ..." is okay with me as a way to do that. (I didn't compare the rest of your new PATH description with the current one to see what else you changed - it would be helpful if you could propose the changes you want in smaller chunks so there is less unchanged text to compare.)

The problems with fc -e and xargs could either be addressed here or in a separate bug.
(0004866)
kre   
2020-05-06 14:39   
Here are the editing instructions to get to text shown in Note: 0004862
as requested in Note: 0004864

All of this applies to the description of the environment variable PATH
in XBD 8.3, on page 178 (2016 edition).

In lines 5843-4 delete the words "known only by a filename".

In line 5844, after the sentence that ends "...separated by a <colon> (':')."
insert the words (beginning a new sentence)
     If the pathname being sought contains no slash ('/') characters,
     and hence is a filename,
and then follow that by moving the sentence that starts on line 5851
     The list shall be searched...
(up to the end of that sentence, on line 5854)
     ...permissions is found.
While moving that sentence, change the capital 'T' in the leading "The"
to lower case, as it is no longer the beginning of a sentence, just a clause.

That's it, the rest of the text is unchanged (since in Note: 0004862 I did not
attempt to fix the "resolve the path" issue).

What I might do for that is to take the sentence that runs from line
5855-7, viz:

    If the pathname begins with a <slash>, the specified path is
    resolved (see Section 4.13, on page 111).

and replace it with

     In each case, either the result of appending the filename to each prefix
     from PATH when the pathname (filename) contains no slash characters,
     or the pathname when it does contain one or more slash characters,
     is resolved (see Section 4.13, on page 111).

but I am less happy with that wording than the earlier part.
Someone please suggest something better.
(0004868)
geoffclare   
2020-05-07 08:50   
(edited on: 2020-05-07 08:51)
Here's a suggestion that combines kre's solution from Note: 0004866 for the filename/pathname problem with my suggestion from Note: 0004855 for the pathname resolution problem...

All of this applies to the description of the environment variable PATH in XBD 8.3, on page 178 (2016 edition).

In lines 5843-4 delete the words "known only by a filename".

In line 5844, after the sentence that ends "...separated by a <colon> (':')."
insert the new sentence
If the pathname being sought contains no slash ('/') characters, and hence is a filename, the list shall be searched from beginning to end, applying the filename to each prefix and attempting to resolve the resulting pathname (see Section 4.13, on page 111), until an executable file with appropriate execution permissions is found.

Delete the sentence that starts on line 5851
     The list shall be searched...
(up to the end of that sentence, on line 5854)
     ...permissions is found.

Replace the two sentences that run from line 5854-7, viz:
If the pathname being sought contains a <slash>, the search through the path prefixes shall not be performed. If the pathname begins with a <slash>, the specified path is resolved (see Section 4.13, on page 111).
with:
If the pathname being sought contains any <slash> characters, the search through the path prefixes shall not be performed and the pathname shall be resolved as described in Section 4.13, on page 111.

The end result of these edits is the following as the first paragraph of the PATH description:
This variable shall represent the sequence of path prefixes that certain functions and utilities apply in searching for an executable file. The prefixes shall be separated by a <colon> (':'). If the pathname being sought contains no slash ('/') characters, and hence is a filename, the list shall be searched from beginning to end, applying the filename to each prefix and attempting to resolve the resulting pathname (see Section 4.13, on page 111), until an executable file with appropriate execution permissions is found. When a non-zero-length prefix is applied to this filename, a <slash> shall be inserted between the prefix and the filename if the prefix did not end in <slash>. A zero-length prefix is a legacy feature that indicates the current working directory. It appears as two adjacent <colon> characters ("::"), as an initial <colon> preceding the rest of the list, or as a trailing <colon> following the rest of the list. A strictly conforming application shall use an actual pathname (such as .) to represent the current working directory in PATH. If the pathname being sought contains any <slash> characters, the search through the path prefixes shall not be performed and the pathname shall be resolved as described in Section 4.13, on page 111. If PATH is unset or is set to null, the path search is implementation-defined.
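
The search described in this paragraph can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not text from the standard: the names path_search() and is_executable_file() are hypothetical, the executability check uses stat() plus access() (which tests against the real user ID), and the implementation-defined case of an unset or null PATH is simply reported as "not found".

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: nonzero if `p` resolves to an executable regular file. */
static int is_executable_file(const char *p)
{
    struct stat st;
    return stat(p, &st) == 0 && S_ISREG(st.st_mode) && access(p, X_OK) == 0;
}

/*
 * Hypothetical helper: look for `name` using the colon-separated list `path`.
 * On success the pathname found is stored in `out` and 0 is returned;
 * otherwise -1 is returned.
 */
int path_search(const char *path, const char *name, char *out, size_t outsz)
{
    const char *p = path;
    int n;

    if (strchr(name, '/') != NULL) {
        /* Contains a <slash>: no prefix search, resolve the pathname as given. */
        n = snprintf(out, outsz, "%s", name);
        if (n < 0 || (size_t)n >= outsz)
            return -1;
        return is_executable_file(out) ? 0 : -1;
    }
    if (path == NULL || *path == '\0')
        return -1;      /* unset or null PATH: implementation-defined, not modeled */

    for (;;) {
        size_t len = strcspn(p, ":");       /* length of the current prefix */

        if (len == 0)                       /* zero-length prefix: current directory */
            n = snprintf(out, outsz, "%s", name);
        else if (p[len - 1] == '/')         /* prefix already ends in <slash> */
            n = snprintf(out, outsz, "%.*s%s", (int)len, p, name);
        else                                /* insert a <slash> after the prefix */
            n = snprintf(out, outsz, "%.*s/%s", (int)len, p, name);

        if (n >= 0 && (size_t)n < outsz && is_executable_file(out))
            return 0;
        if (p[len] == '\0')
            return -1;                      /* list exhausted */
        p += len + 1;                       /* step past the <colon> */
    }
}
```

Note that each constructed pathname is resolved during the search (here via stat() and access()), which matches the point that pathname resolution happens while searching rather than after a match is found.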


(0004869)
kre   
2020-05-07 10:40   
That (Note: 0004868) is fine with me.
(0004871)
geoffclare   
2020-05-11 15:19   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
The current text does not require constructed relative pathnames to be resolved, only absolute ones.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in Note: 0004868
(0004872)
ajosey   
2020-05-11 15:27   
Interpretation proposed: 11 May 2020
(0004888)
ajosey   
2020-06-12 09:09   
Interpretation approved: 12 June 2020




Viewing Issue Advanced Details
1339 [Online Pubs] Main Index Editorial Error 2020-05-01 18:31 2020-05-04 15:09
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
bzip2 download has not been updated
I normally use the .tar.bz2 download here:
https://pubs.opengroup.org/onlinepubs/9699919799/download/index.html
I thought "huh, that's weird, some bugs aren't fixed in it that are fixed on the site. Was it not regenerated?"
I checked all three other downloads, and those three all match each other as being up to date. But the .tar.bz2 download alone has not been regenerated for some time: at least half a year.
Place updated .tar.bz2 file on the website.
Notes
(0004844)
ajosey   
2020-05-04 15:09   
It seems bzip2 does not like to overwrite existing files. I have fixed the file generator to remove the files before it creates them.




Viewing Issue Advanced Details
1338 [Online Pubs] System Interfaces Editorial Error 2020-05-01 18:13 2020-05-04 17:11
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
creat, open
creat and open have similar missing BR element between includes
I was researching the OH (optional header) tag, since I was curious, sometime after this
https://austingroupbugs.net/view.php?id=1335
and noticed that both
https://pubs.opengroup.org/onlinepubs/9699919799/functions/creat.html
https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html
have the same small error.
They lack a BR (or newline) between the two includes.

Also, out of curiosity, why don't optional headers get a hyperlink to their XBD page like other headers do? I don't see why not.
Add BR element between the two includes.
Notes
(0004845)
ajosey   
2020-05-04 17:11   
These corrections have been made and the download bundles also updated.

The hyperlinking can always be improved.




Viewing Issue Advanced Details
1337 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Omission 2020-04-29 17:41 2020-05-18 15:23
eblake
 
normal  
New  
Open  
   
Eric Blake
Red Hat
accept vs setsockopt
XSH 2.10.16
528
18538
---
Clarify socket option values after accept()
While the standard is clear that all sockets have default settings for socket options, and that some defaults are implementation-defined (for example, "The default value for the SO_RCVBUF option value is implementation-defined, and may vary by protocol." line 18628), it is silent on whether a non-default socket option set on a listening socket is required/permitted to be inherited into the new socket created by accept(). As there are existing implementations which inherit some but not all socket options (see https://stackoverflow.com/questions/5968132/are-socket-options-inherited-across-accept-from-the-listening-socket), the best course of action is to just clarify that portable applications cannot rely on option inheritance.

Also, the text states "All of the options have defaults" (line 18538) and then contradicts itself with "SO_TYPE has no default value" (line 18672).
[note: this change assumes that 0000411 adding accept4() is applied]

Change page 528 line 18538 (XSH 2.10.16 Use of Options) from:
All of the options have defaults.
to:
All of the options usable with setsockopt( ) have defaults. For each option where a default value is listed as implementation-defined, the implementation also controls whether a socket created by accept( ) or accept4( ) starts with the option reset to the original default value, or inherited as the value previously customized on the original listening socket.


At page 565 line 19882 (XSH accept() DESCRIPTION), add a paragraph:
It shall be implementation-defined which socket options, if any, on the accepted socket will have a default value determined by a value previously customized by setsockopt( ) on socket, rather than the default value used for other new sockets.


At page 569 line 19912 (XSH accept() APPLICATION USAGE), add a new paragraph:
Many socket options are described as having implementation-defined default values, which may differ according to the protocol in use by the socket. Existing practice differs on whether socket options such as SO_SNDBUF that were customized on the original listening socket will impact the corresponding option on the newly returned socket. Implementations are permitted to allow inheritance of customized settings where it makes sense, although the most portable approach for applications is to limit setsockopt( ) customizations to only the accepted socket.


At page 1924 line 62032 (XSH setsockopt() APPLICATION USAGE), add a new paragraph:
It is implementation-defined which socket options, if any, are inherited from a listening socket to an accepted socket by accept( ) or accept4( ).
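
A minimal sketch of the portable approach described in the proposed APPLICATION USAGE text: apply setsockopt() customizations to the accepted socket instead of relying on inheritance from the listening socket. Python's socket module is used here only as a thin wrapper over the POSIX calls. The helper names accept_and_configure and demo, the loopback address, and the choice of SO_KEEPALIVE are assumptions made for illustration and are not part of the proposal.

```python
import socket


def accept_and_configure(listener, options):
    """Accept one connection, then set each (level, name, value) option
    on the accepted socket itself (hypothetical helper, not a POSIX API)."""
    conn, peer = listener.accept()
    for level, name, value in options:
        conn.setsockopt(level, name, value)
    return conn, peer


def demo():
    """Connect to a loopback listener and read back an option that was
    set only on the accepted socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the system pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    conn, _ = accept_and_configure(
        listener, [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    )
    value = conn.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    for s in (conn, client, listener):
        s.close()
    return value
```

Because the option is set after accept() returns, the result does not depend on whether the implementation inherits options from the listening socket. A nonzero return from demo() means the option is enabled on the accepted socket.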

There are no notes attached to this issue.




Viewing Issue Advanced Details
1336 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-04-23 13:09 2020-04-23 13:09
dennisw
 
normal  
New  
Open  
   
Dennis Wölfing
getrusage
1089
36960-36962
---
getrusage should recursively include information about children of children
The specification for getrusage says that for RUSAGE_CHILDREN information about waited-for
children of the current process is returned. However, existing implementations also recursively
include children of children.

In XBD 3.93 Child Process is defined as
A new process created (by fork( ), posix_spawn( ), or posix_spawnp( )) by a given process. A child
process remains the child of the creating process as long as both processes continue to exist.
which makes clear that children of children are not considered children of the current process.
On page 1089 lines 36960-36962 section getrusage, change
If the
value of the who argument is RUSAGE_CHILDREN, information shall be returned about
resources used by the terminated and waited-for children of the current process.
to
If the
value of the who argument is RUSAGE_CHILDREN, information shall be returned about
resources used by the terminated and waited-for children of the current process and
recursively the terminated and waited-for children thereof.
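
A rough probe, assuming a system with fork(), for checking what a given implementation reports: the child does no work of its own and only creates and waits for a grandchild that burns CPU time. If the value returned is close to the requested CPU time, the implementation is including waited-for children of children. The function names and the 0.2 second default are hypothetical choices, and Python's resource module is used only as a wrapper around getrusage().

```python
import os
import resource
import time


def burn_cpu(seconds):
    """Spin until this process has consumed roughly `seconds` of CPU time."""
    deadline = time.process_time() + seconds
    while time.process_time() < deadline:
        pass


def children_cpu():
    """Total user + system CPU time reported for RUSAGE_CHILDREN."""
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_utime + usage.ru_stime


def grandchild_cpu_charged_to_parent(seconds=0.2):
    """Return the increase in RUSAGE_CHILDREN CPU time after a child
    that only creates and reaps a CPU-burning grandchild has been waited for."""
    before = children_cpu()
    pid = os.fork()
    if pid == 0:
        # Child: no real work here, only create and wait for a grandchild.
        try:
            gpid = os.fork()
            if gpid == 0:
                try:
                    burn_cpu(seconds)
                finally:
                    os._exit(0)
            os.waitpid(gpid, 0)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    return children_cpu() - before
```

No expected value is given here, since the result is exactly what differs between the current wording and existing practice.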
There are no notes attached to this issue.




Viewing Issue Advanced Details
1335 [Online Pubs] System Interfaces Editorial Error 2020-04-20 01:29 2020-04-30 16:27
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
fstatat
Missing newline or BR element on fstatat() page
On the fstatat() page:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/fstatat.html
The first line after SYNOPSIS looks like this:
[OH] #include <fcntl.h> #include <sys/stat.h>
It should look like this:
[OH] #include <fcntl.h>
#include <sys/stat.h>
Replace the space character between the two includes with an HTML BR element.
Notes
(0004826)
Don Cragun   
2020-04-20 01:51   
This appears correctly in the PDF on P965, L32786-32787.

I'll move this to the Online Pubs project.
(0004827)
andras_farkas   
2020-04-20 04:02   
Thanks, Don!
Where do I find the PDF, by the way? It's not available here:
https://pubs.opengroup.org/onlinepubs/9699919799/download/index.html
(0004828)
Don Cragun   
2020-04-20 05:36   
(edited on: 2020-04-20 05:37)
Re: Note: 0004827:
You can buy the PDF for the standard from IEEE or ISO. (I don't remember what they charge, but it isn't cheap.)

If you have a login for opengroup.org, you can sign in, click on "Standards", click on "UNIX Standards", and click on "Learn More" under the heading UNIX BASE SPECIFICATIONS ISSUE 7 2018 EDITION to find pointers to the PDF and to the online HTML that you have already found.

(0004837)
ajosey   
2020-04-30 16:11   
Change applied to the html edition
(0004842)
ajosey   
2020-04-30 16:27   
The html download bundles have also been updated




Viewing Issue Advanced Details
1334 [Online Pubs] Rationale Editorial Error 2020-04-07 19:28 2020-04-08 08:29
andras_farkas
 
normal  
Applied  
Accepted As Marked  
   
Andras Farkas
Portability Considerations
Possible bad troff to HTML conversion in Portability Considerations section of XRAT
In D.2.10 Command Language:
https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_port.html#tag_24_02_10
In the list of commands, we have:

sleep, tee, test, CONVERSION ERROR (.Cm) time *,1 true, wait,

The conversion error text seems erroneous. time is also not a hyperlink to time.html, unlike the other commands in the list.
Get rid of
"CONVERSION ERROR (.Cm) "
text and make time a hyperlink to the intended time utility.
Notes
(0004815)
Don Cragun   
2020-04-07 23:35   
In the PDF (P3771, L129440-129443) the only hyperlink is to Section 2.14. It would be nice if the utilities in that list were hyperlinks as in the on-line version.

The conversion error does not appear in the PDF. Note that time utility has a footnote in both the HTML and the PDF, but there is an extra space between "time" and "*" in the HTML that isn't in the PDF. I would guess that there is a bug in the .Cm macro processing when producing the HTML.
(0004819)
geoffclare   
2020-04-08 08:04   
(edited on: 2020-04-08 08:05)
It looks like the HTML translation tool doesn't correctly handle a footnote marker in a macro argument. The source here has:
.Cm time *,\*F
I looked for other places where "\*F" appears on a line beginning with "." and found there is only one other such place, and it also has a translation problem. It's in XRAT Section E Subprofiling Considerations under POSIX_SYMBOLIC_LINKS, where the troff source:
.Fn lchown ,\*F
is rendered in the HTML version as:

,,()href="#tag_foot_2">2

(0004820)
agadmin   
2020-04-08 08:12   
This conversion error and two other conversion errors in XRAT have been corrected.
(0004821)
agadmin   
2020-04-08 08:28   
The lchown conversion has also been fixed.

Files updated in XRAT:
V4_subprofiles.html
V4_xbd_chap12.html
V4_xbd_chap03.html
V4_port.html

Download bundles have been updated.




Viewing Issue Advanced Details
1333 [Online Pubs] Shell and Utilities Editorial Error 2020-04-07 19:00 2020-04-08 08:30
andras_farkas
 
normal  
Applied  
Accepted As Marked  
   
Andras Farkas
talk
talk.html page seems to have a missing newline in example text
In
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/talk.html
the example text below

Message from <unspecified string>
talk: connection requested by your_addresstalk: respond with: talk your_address

seems to be missing a newline in the second of the two lines.

(Do forgive me for the noise if I'm wrong! I've never actually gotten talk to work on any machine I've used. And thanks so much for helping me out with my previous bug reports.)
Change the example text to

Message from <unspecified string>
talk: connection requested by your_address
talk: respond with: talk your_address

by adding a single newline.
Notes
(0004812)
shware_systems   
2020-04-07 19:08   
I get similar. At the least there should be a SPC between your_address in italics and second talk, if message can all fit on one line.
(0004813)
kre   
2020-04-07 19:29   
This looks to be an issue with the HTML translation, the PDF version is correct.
(0004814)
Don Cragun   
2020-04-07 23:11   
As kre said in Note: 0004813, this text appears correctly in the PDF on P3281, L110387-110389. Therefore this bug has been moved from the 1003.1(2016)/Issue7+TC2 Project to the Online Pubs Project.
(0004816)
Konrad_Schwarz   
2020-04-08 07:28   
The mailx page has similar issues:
in the "Commands in mailx" section, the variant command forms are shown
all in the same line without intervening space,
they should probably each be on a separate line.
(0004817)
agadmin   
2020-04-08 07:44   
The talk html page has been updated
(0004818)
agadmin   
2020-04-08 07:58   
The mailx page has been updated
(0004822)
agadmin   
2020-04-08 08:29   
The talk and mailx pages were updated in the utilities directory
The Download bundles have been updated.




Viewing Issue Advanced Details
1332 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2020-04-07 10:51 2020-06-04 08:43
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
waitid
2236
71333
---
WEXITED should not be limited to processes that have exited
The description of WEXITED on the waitid() page (and <sys/wait.h>) is:

    Wait for processes that have exited.

The use of "exited" here means that waitid() calls with WEXITED are
not required to return status information for processes that have
been terminated by a signal.
At the specified location, and also at
page 409 line 13906 section <sys/wait.h>, change:

    Wait for processes that have exited.

to:

    Wait for processes that have terminated.
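
A small sketch of the case the change is meant to cover, assuming an implementation that already behaves as the revised wording requires: a child terminated by a signal is collected with waitid() using only WEXITED. The function name reap_signaled_child is hypothetical, and Python's os.waitid() (not available on every platform) is used as a wrapper around the POSIX call.

```python
import os
import signal


def reap_signaled_child():
    """Fork a child, kill it with SIGKILL, and collect it using WEXITED only.
    Returns the (si_code, si_status) pair reported by waitid()."""
    pid = os.fork()
    if pid == 0:
        # Child: sleep until a signal arrives; never return to the caller.
        try:
            signal.pause()
        finally:
            os._exit(0)
    os.kill(pid, signal.SIGKILL)
    info = os.waitid(os.P_PID, pid, os.WEXITED)
    return info.si_code, info.si_status
```

On such an implementation the si_code value is CLD_KILLED and si_status holds the signal number, which shows that WEXITED is being treated as "terminated" rather than "exited normally".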
Notes
(0004811)
shware_systems   
2020-04-07 16:01   
I agree that could be misconstrued, but so could the desired action. I'd prefer:
Wait for processes that have exited normally, called abort(), or have been terminated due to a signal.




Viewing Issue Advanced Details
1331 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2020-04-01 09:14 2020-06-04 08:41
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
pax
3082, 3084
102809, 102881
---
pax atime and mtime keyword descriptions should refer to st_atim and st_mtim
When the standard was updated to support fine-grained timestamps, a change to pax was missed where it refers to the st_atime and st_mtime members of the stat structure but should now refer to st_atim and st_mtim.

Since the text uses the word "member" and there are no stat structure members called st_atime and st_mtime, it is not possible to misinterpret the text as referring somehow to the st_atime and st_mtime macros, and so this should be treated as a minor editorial error.

(There is also a reference to an "st_mtime field" in the ar RATIONALE section, but I believe that one should not be changed as it is describing the historical 4.4BSD ar format, and the 4.4BSD stat structure had an st_mtime member, not an st_mtim member.)
On page 3082 line 102809 section pax, change:
the st_atime member
to:
the st_atim member

On page 3084 line 102881 section pax, change:
the st_mtime member
to:
the st_mtim member
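
For readers unfamiliar with the fine-grained members, the sketch below shows the seconds-plus-nanoseconds shape of st_mtim. It assumes that Python's st_mtime_ns field reflects the st_mtim member where the platform provides it; the helper names as_timespec and read_back_mtime are hypothetical, and a file system with coarser timestamp granularity may not preserve the nanosecond part.

```python
import os
import tempfile

NS_PER_SEC = 1_000_000_000


def as_timespec(total_ns):
    """Split a nanosecond count into (tv_sec, tv_nsec), like struct timespec."""
    return divmod(total_ns, NS_PER_SEC)


def read_back_mtime(mtime_ns):
    """Set a file's access and modification times in nanoseconds, then
    read the modification time back as a (tv_sec, tv_nsec) pair."""
    with tempfile.NamedTemporaryFile() as f:
        os.utime(f.name, ns=(mtime_ns, mtime_ns))
        return as_timespec(os.stat(f.name).st_mtime_ns)
```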
There are no notes attached to this issue.




Viewing Issue Advanced Details
1330 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Enhancement Request 2020-03-31 15:53 2020-05-29 15:35
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
many
many
many
---
Remove some of the OB shaded text
As part of the preparations for Issue 8 Draft 1 we need to decide which obsolescent features to remove and which to keep. The proposal in the desired action assumes that we want to remove everything that isn't in the current C2x draft standard. Note that asctime_r() and ctime_r() are not in C17 but they are in the current C2x draft. If we can persuade the C committee not to add them, then we can remove them from Issue 8.

We don't need to spell out the changes in detail with page and line numbers; the high level instructions in the desired action are sufficient.
Keep the following functions:

asctime()
asctime_r()
ctime()
ctime_r()
tmpnam()

and in their FUTURE DIRECTIONS sections after the statement that they may be removed in a future version, add "but not until after they have been removed from the ISO C standard".

Remove everything that is part of the STREAMS, Tracing, Batch Environment Services and Utilities, and FORTRAN Development options, including XSH 2.6 STREAMS, XSH 2.11 Tracing, XCU 3 Batch Environment Services, and XRAT B.2.6, B.2.11, and C.3.

Remove the following functions:

_longjmp()
_setjmp()
_tolower()
_toupper()
ftw()
getitimer()
gets()
gettimeofday()
isascii()
pthread_getconcurrency()
pthread_setconcurrency()
rand_r()
setitimer()
setpgrp()
sighold()
sigignore()
siginterrupt()
sigpause()
sigrelse()
sigset()
tempnam()
toascii()
ulimit()
utime()

Remove the following headers entirely:

<ulimit.h>
<utime.h>

Remove the following from the indicated headers:

*_V6_* from <unistd.h> [also make *_V7_* obsolescent and add *_V8_*]
MAXFLOAT from <math.h>
P_tmpdir from <stdio.h>
SIG_HOLD and SIGPROF from <signal.h>
isw*(), towlower(), towupper(), and wctype() from <wchar.h>
itimerval, ITIMER_REAL, ITIMER_VIRTUAL, ITIMER_PROF from <sys/time.h>
pthread_atfork() and ctermid() ("may includes") from <unistd.h>
wctype_t from <wchar.h> [move the description to <wctype.h>]

Remove the following from the indicated function descriptions:

*_V6_* from confstr() [also make *_V7_* obsolescent and add *_V8_*]
*_V6_* from sysconf() [also make *_V7_* obsolescent and add *_V8_*]

Remove the following from the indicated utility descriptions:

-l trace and libtrace.a from c99 [before converting to a c17 or c2x page]
maketemp from m4
"expression1 -a expression2", "expression1 -o expression2", and "( expression )" from test

For each removed interface, remove all references to it, e.g. from notes in FUTURE DIRECTIONS section saying they may be removed, from the async-signal-safe list, the cancellation point lists, subprofiling groups, SEE ALSO entries, and pointer pages.
Notes
(0004809)
shware_systems   
2020-04-01 14:09   
Where interfaces were required for XSI conformance, but were optional groups
for POSIX conformance, I think more would be happier if they stay in the standard as those groups, as a backwards compatibility consideration. What would be obsoleted is them being any requirement. It seems the only changes to source would be tests only on XOPEN_VERSION, to establish availability, would have to switch to POSIX_VERSION and individual group testing.
(0004835)
Don Cragun   
2020-04-30 15:46   
If the C committee decides to remove the functions that are being kept, a new bug will need to be filed to remove them from a future draft.




Viewing Issue Advanced Details
1329 [1003.1(2013)/Issue7+TC1] Base Definitions and Headers Objection Clarification Requested 2020-03-29 20:54 2020-03-30 10:27
rhialto
 
normal  
New  
Open  
   
Olaf 'Rhialto' Seibert
Vol 1. 9.4.6, Vol. 1. 13. regex.h, Vol 2. regcomp(), Vol. 4. A.9.2
190
6195
---
Problem in resolution of 0000793: "Regular Expressions: add REG_MINIMAL and a minimum repitition modifier"
Issue 793 (which has status Applied) contains this text to apply:

     - Vol. 1: Base Definitions, Chapter 9, «Regular Expressions».

  9.4.6 EREs Matching Multiple Characters, p. 190, line 6195:
  insert after

        6. Each of the duplication symbols ('+', '*', '?', and intervals) may
           be suffixed by the minimal repitition modifier '?' <question-mark>,
           in which case matching behaviour is changed from the «leftmost
           longest possible match» to the «leftmost shortest possible match»,
           including the null match
           (see [reference to A.9, p. 3500 ff.]). For example, the ERE ".*c"
           matches the last character ('c') in the string "abc abc", whereas
           the ERE ".*?c" matches the first character 'c', the third character
           in the string.

There seems to be a mistake in this text:
                                                                                
     For example, the ERE ".*c" matches the last character ('c') in the
     string "abc abc",
                                                                                
which should likely be
                                                          
     For example, the ERE ".*c" matches all characters up to the last
     character ('c') in the string "abc abc",
                               
As an existing implementation, NetBSD's egrep matches the whole string:

    $ echo abc-abc | egrep --color ".*c"
    abc-abc
                                              
with the whole of abc-abc coloured red.

           
                                                                     
The text that follows seems incorrect for a similar but potentially more serious reason:
                                                                                
    whereas the ERE ".*?c" matches the first character 'c', the third
    character in the string. [[ "abc abc" ]]
                                                                                
Possibly it should match "abc", because that is leftmost; just matching
"c" would be shortest but I don't see it as being leftmost. Which of these two is meant?



Furthermore, "repitition" (used several times) seems to contain a typo.
Correct the examples, and/or indicate whether "leftmost" or "shortest" is more important in a match with a minimal repetition modifier.
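
As a data point only: Python's re module is not a POSIX ERE implementation (it is a backtracking matcher), but it does provide a '?' minimal repetition modifier, so it shows how one existing engine resolves the two examples. The helper name matched_span is hypothetical.

```python
import re

TEXT = "abc abc"


def matched_span(pattern, text=TEXT):
    """Return the (start, end) offsets of the first match, or None."""
    m = re.search(pattern, text)
    return m.span() if m else None


greedy_span = matched_span(r".*c")     # covers the whole string: (0, 7)
minimal_span = matched_span(r".*?c")   # covers only the first "abc": (0, 3)
```

In this engine "leftmost" takes priority over "shortest": the minimal form matches "abc" starting at offset 0, not the lone "c" at offset 2.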
Notes
(0004804)
kre   
2020-03-30 02:57   
The major issue here is whether in "leftmost shortest possible match" which
takes priority, "leftmost" or "shortest possible" when the results would be
different (as in the example given) - the text of the example suggests that
the shortest match should be preferred over leftmost, but I suspect that's
probably just an example error similar to the obvious earlier one.

In case of "leftmost longest possible match" this issue cannot arise, as
moving the match further to the left necessarily makes it longer.

It is likely that the text from the definition of "matched" in 9.1:

    The search for a matching sequence starts at the beginning of a string
    and stops when the first sequence matching the expression is found,
    where ``first'' is defined to mean ``begins earliest in the string''.

will cover this case, but only if the implementations
that currently implement minimal matches behave that way, and if the
example is corrected to make it clear that is what happens. Additional
words to make it clear that when there is an apparent conflict, the leftmost
possible match is chosen rather than a shorter one beginning further to the
right might be useful.
(0004805)
kre   
2020-03-30 03:14   
(edited on: 2020-03-30 05:59)
0000793 should probably also have amended the text in 9.1 that immediately
follows the quote given in Note: 0004804 ...

     If the pattern permits a variable number of matching characters and
     thus there is more than one such sequence starting at that point, the
     longest such sequence is matched.

Since that will no longer be universally true.

Further, when dealing with sub-matches, we need to know how the "minimal
match" operator applies to submatches, both when those sub-matches do, and
do not, also contain the minimal match operator.

That is, in each of

     (([abc]*)([abc]*)([abc]*))+?x
     (([abc]*?)([abc]*)([abc]*))+?x
     (([abc]*)([abc]*?)([abc]*))+?x
     (([abc]*?)([abc]*?)([abc]*))+?x

when applied to the string

     abcaabbccaaabbbcccx

what are \1, \2, \3 and \4 in each case?

That will partly turn upon whether the added text

        If the REG_MINIMAL flag, as defined in the <regex.h>[REF] header,
           is used when compiling an ERE via regcomp(3)[REF], the «leftmost
           shortest possible match» is the default, and the minimal repitition
           modifier ’?’ can be used to select the «leftmost longest possible
           match».

(added in the new s.6 on p 190 of 7-TC1) is intended to apply to the
? operator as well as the REG_MINIMAL flag. That is, when evaluating
an ERE for which the "minimal match" operator applies, do sub-expressions
inherit the minimal match property, or not? And if they do, does the
minimal match operator do what it does when REG_MINIMAL is used, and
reverse its effect, to become a longest match operator?

I don't know the answers to any of this, but I do know it all needs to be
made clear. There might be more; REs in general tend to be things I simply
use, I generally prefer not to try and bend my mind around their definitions.
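
One way to gather evidence on the sub-match question is to run the four expressions through an engine that already supports a minimal modifier. The sketch below uses Python's re module, which is an assumption of convenience: it is not a POSIX ERE implementation and has no REG_MINIMAL equivalent, so its answers show one engine's behaviour rather than what the standard requires. The names SUBJECT, PATTERNS, submatches and report are hypothetical.

```python
import re

SUBJECT = "abcaabbccaaabbbcccx"

PATTERNS = [
    r"(([abc]*)([abc]*)([abc]*))+?x",
    r"(([abc]*?)([abc]*)([abc]*))+?x",
    r"(([abc]*)([abc]*?)([abc]*))+?x",
    r"(([abc]*?)([abc]*?)([abc]*))+?x",
]


def submatches(pattern, subject=SUBJECT):
    """Return the tuple (\\1, \\2, \\3, \\4) for a match anchored at the
    start of the subject, or None if there is no match."""
    m = re.match(pattern, subject)
    return m.groups() if m else None


def report():
    """Map each pattern to the sub-matches this engine reports."""
    return {pattern: submatches(pattern) for pattern in PATTERNS}
```

Comparing report() with the output of other engines that offer minimal matching would show whether existing practice agrees on these cases.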

(0004806)
geoffclare   
2020-03-30 08:36   
When I applied bug 0000793 I spotted that "up to" was missing and inserted it. I should have added a note to the bug that I had done this - sorry.

The source currently has:
For example, the ERE
.sG ".*c"
matches up to the last character (\c
.cH c )
in the string
.sG "abc abc" ,
whereas the ERE
.sG ".*?c"
matches up to the first character
.cH c ,
the third character in the string.

I also fixed the spelling of "repetition" (and would not have added a note for that - minor editorial corrections like that are often needed when applying bugs).

Re: Note: 0004805 I agree that a change to the definition of "matched" in 9.1 is needed, and we should address that problem here.
(0004808)
kre   
2020-03-30 10:27   
I'd also like to see a better definition of what all of this really
means, particularly REG_MINIMAL and the (new, as opposed to old) '?'
operator when that is in use, and how all of this applies in some of
the more difficult cases. I know I couldn't implement this in a
way I would expect to be compatible with other implementations based
only on what has been added in 0000793 (even assuming that we fix
the definition of "matched").




Viewing Issue Advanced Details
1328 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Omission 2020-02-24 19:13 2020-06-11 11:19
eblake
 
normal  
Applied  
Accepted  
   
Eric Blake
Red Hat
posix_spawn.restrict
XSH posix_spawn
1452
48197
Approved
See Note: 0004836
posix_spawn lacks use of restrict
It looks like POSIX tries to uniformly apply restrict to any pointer (or potential pointer) argument of a function that takes multiple pointers that involve char*, void*, or identical types, where overlap between those pointers is not expected. However, on posix_spawn, this approach was not completely applied, and the file_actions parameter lacks a useful restrict designation.
At page 341 line 11585, and page 342 line 11620 (<spawn.h> posix_spawn and posix_spawnp), change:
const posix_spawn_file_actions_t *,
to
const posix_spawn_file_actions_t *restrict,


At page 1451 lines 48197 and 48201, and page 1489 line 49169 (posix_spawnp), change:
const posix_spawn_file_actions_t *file_actions,
to
const posix_spawn_file_actions_t *restrict file_actions,

Notes
(0004786)
eblake   
2020-02-24 19:17   
Original report by Bruno Haible here:
https://lists.gnu.org/archive/html/bug-gnulib/2020-02/msg00121.html
(0004836)
nick   
2020-04-30 16:02   
Interpretation response
------------------------
The standard states the requirements, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
None

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
See Desired Action
(0004843)
ajosey   
2020-04-30 16:39   
Interpretation proposed: April 30 2020
(0004883)
ajosey   
2020-06-02 09:37   
Approved: 2nd June 2020




Viewing Issue Advanced Details
1327 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Error 2020-02-24 04:42 2020-06-04 08:39
dannyniu
 
normal  
Applied  
Accepted  
   
DannyNiu/NJF
<netdb.h>
303
10273-10274
---
flags should be ai_flags.
The sentence immediately following the list of members to the addrinfo structure mistakenly says:

for use in *flags* field of the addrinfo structure,

However, there's no *flags* member, but instead a *ai_flags* member.
Change:

The <netdb.h> header shall define the following symbolic constants that evaluate to bitwise-distinct integer constants for use in the *flags* field of the **addrinfo** structure

to

The <netdb.h> header shall define the following symbolic constants that evaluate to bitwise-distinct integer constants for use in the *ai_flags* field of the **addrinfo** structure
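
To illustrate how the bitwise-distinct constants are combined for ai_flags, here is a short sketch using Python's socket.getaddrinfo(), whose flags argument is passed through as the ai_flags hint. The helper name passive_ipv4_address and the choice of AI_PASSIVE with AI_NUMERICSERV are assumptions made for the example.

```python
import socket


def passive_ipv4_address(port):
    """Ask getaddrinfo() for an IPv4 stream address suitable for bind(),
    combining two ai_flags constants with bitwise OR."""
    flags = socket.AI_PASSIVE | socket.AI_NUMERICSERV
    results = socket.getaddrinfo(
        None, port, family=socket.AF_INET, type=socket.SOCK_STREAM, flags=flags
    )
    family, socktype, proto, canonname, sockaddr = results[0]
    return sockaddr
```

With a null node name and AI_PASSIVE set, the returned address is the wildcard address, so no name resolution is involved.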
There are no notes attached to this issue.




Viewing Issue Advanced Details
1326 [Online Pubs] Base Definitions Editorial Error 2020-02-24 04:10 2020-04-30 16:26
dannyniu
 
normal  
Applied  
Accepted  
   
DannyNiu/NJF
basedefs/sys_msg.h.html
<sys/msg.h>
Superfluous punctuations
In the descriptions for the msqid_ds structure, there are several instances of "()." occurring for each member of the structure. These are probably errors in Troff source code or macro packages.
Change


pid_t msg_lspid Process ID of last msgsnd
 ().
pid_t msg_lrpid Process ID of last msgrcv
 ().
time_t msg_stime Time of last msgsnd
 ().
time_t msg_rtime Time of last msgrcv
 ().


to


pid_t msg_lspid Process ID of last msgsnd.
pid_t msg_lrpid Process ID of last msgrcv.
time_t msg_stime Time of last msgsnd.
time_t msg_rtime Time of last msgrcv.
Notes
(0004787)
eblake   
2020-02-24 23:00   
The pdf is formatted correctly, where it shows:

<tt>pid_t msg_lspid</tt> Process ID of last msgsnd( ).

Somehow, the html rendition is placing the '().' as a new row in column 1 instead of keeping everything in column 3 on a single row.
(0004838)
ajosey   
2020-04-30 16:15   
The html edition has been updated.
(0004841)
ajosey   
2020-04-30 16:26   
The html download bundles have also been updated




Viewing Issue Advanced Details
1325 [Online Pubs] Shell and Utilities Editorial Clarification Requested 2020-02-09 17:17 2020-02-10 13:27
dmitry_goncharov
 
normal  
New  
Open  
   
Dmitry Goncharov
(section number or name, can be interface name)
Allow make to remake an included file
https://austingroupbugs.net/view.php?id=333

proposes the following wording
 
"If the file cannot be opened, and if the word include was prefixed with a
<hyphen> character, the file shall be ignored. Otherwise, if the file cannot be
opened an error occurs."

If this wording is added to the standard it'll outlaw a widely used scenario.
Specifically a makefile has a rule to build a dependency file and an include
directive without hyphen to include this freshly built dependency file.
If the file is missing the rule builds it and include includes it.
include rather than -include is more useful for this scenario because if the
file cannot be remade the user will know.
This mechanism is supported by GNU make and is used by automake and thus by a
myriad of projects that use autotools.
After
"If the word include"
append
", optionally prefixed with a <hyphen> character,".

At the end, append:
"If the file cannot be opened and the makefile has a rule to build the file
make may remake the file.
If the file cannot be opened and cannot be remade, and if the word include was
prefixed with a <hyphen> character, this -include directive shall be ignored.
If the file cannot be opened and cannot be remade, and if the word include was
not prefixed with a <hyphen> character, an error occurs."
Notes
(0004780)
shware_systems   
2020-02-09 18:29   
As a potential breaking change to some implementations I'm not sure this is desirable. If such a workflow is needed the same effect is produced by having the file as a target of its own before the include line, with the current wording. Having include fail indicates the commands associated with an explicit target were not successful in creating it. I can see an App Usage note explaining this, not modifying the include argument to be treated fully as a prerequisite file; matching targets and inference rules, especially if the file suffix matches additional inference rules built-in to a particular implementation. The GNU make behavior would still be allowed for makefiles that don't have the .POSIX special target.
(0004781)
joerg   
2020-02-10 13:27   
(edited on: 2020-02-10 13:28)
Re: Note: 0004780 Nobody will implement a behavior since SunPro Make introduced the behavior of first trying to evaluate an existing rule before the include statement is handled.

Prepending "include" by a minus thus just can be seen as "do not cause an error if the file cannot be included after a potential rule to create it has been run".

A big problem in this area however is that GNU make did misbehave for this statement for a really long time. The new gmake version that appeared recently fixed this in serial mode, but fails even more miserably in parallel mode.

But it may be of interest that SunPro Make (see schilytools: http://sourceforge.net/projects/schilytools/files/) has supported "include" since SunOS-3.5 (January 1986) and my enhanced version has supported "-include" since January 2018, the way I documented above.

My smake (first written by me in 1985) has supported include and -include for 24 years.

I cannot speak for GNU make, but SunPro Make passed the POSIX compliance tests together with the Solaris certification. It would be a really bad idea to make the standard in conflict with the current behavior of the most important make implementations.

BTW: I was under the assumption that the new POSIX text is compatible with the SunPro Make implementation.





Viewing Issue Advanced Details
1324 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Clarification Requested 2020-02-07 15:57 2020-02-07 15:57
elahav
 
normal  
New  
Open  
   
Elad Lahav
QNX Software Systems
sem_open()
(page or range of pages)
(Line or range of lines)
---
sem_open() should not require the same name to map to the same virtual address
The current specification mandates that two calls to sem_open() with the same name made by the same process return the same virtual address, so long as no process called sem_unlink() in between the two calls.
I believe that this is an unreasonable requirement, for the following reasons:
1. There is no dependency by any other sem_*() function on this requirement. So long as the two sem_t pointers returned by the calls refer to the same underlying semaphore all sem_*() functions will behave correctly when passed these pointers.
2. It puts an unnecessary burden on the system to track virtual address usage by the calling process. The system should only need to track the association of any given sem_t pointer to the underlying object. If, for example, the sem_t pointer holds a file descriptor to an open semaphore, then the system only needs to track the file descriptor.
3. Since sem_close() is documented as releasing all resources for the semaphore and making the pointer invalid for future use, the requirement promotes an unsafe "open twice, close once" paradigm.
4. The requirement deviates from the standard approach to resource allocation, where multiple calls provide different handles, even if those handles refer to the same object (e.g., open(), shm_open(), mmap() with the same file descriptor and offset).
5. The requirement may conflict with the following future direction: "A future version might require the sem_open() and sem_unlink() functions to have semantics similar to normal file system operations."
Make the requirement optional
There are no notes attached to this issue.




Viewing Issue Advanced Details
1323 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Clarification Requested 2020-02-07 10:10 2020-02-07 18:22
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
<sys/stat.h>
392
13324
---
st_nlink is the number of hard links in the file system
The description of st_nlink is:
Number of hard links to the file.
The standard should make clear that st_nlink is the number of hard links in the file system, which can differ from the number in the entire file hierarchy on systems that support mounting the same file system multiple times at different mount points.
On page 392 line 13342 section <sys/stat.h> add a new paragraph:
The st_nlink value shall be the number of hard links to the file within the file system in which the file resides.
Note: On some implementations a file system can be made to appear multiple times at different mount points in the file hierarchy, in which case the number of hard links to the file throughout the file hierarchy is st_nlink times the number of mount points for that file system.

Notes
(0004774)
dannyniu   
2020-02-07 11:55   
(edited on: 2020-02-07 11:56)
I suggest making the note say that st_nlink is undefined, as $(st_nlink * count(mount points)) is not the behavior I've observed on FreeBSD with nullfs:

$ mount -l
/dev/ada0s1a on / (ufs, local, journaled soft-updates)
devfs on /dev (devfs, local, multilabel)
/usr/home/dannyniu/sandbox on /usr/home/dannyniu/glassbox (nullfs, local)

$ ls -l sandbox glassbox
glassbox:
total 4
-rwxr-xr-x 1 dannyniu dannyniu 246 Dec 10 10:24 data2url.php

sandbox:
total 4
-rwxr-xr-x 1 dannyniu dannyniu 246 Dec 10 10:24 data2url.php
$

(0004775)
dannyniu   
2020-02-07 12:27   
Okay, I might've misinterpreted the "note", but nonetheless I suggest making it clearer by saying:

Note: ... in which case the actual number of hard links to the file throughout the file hierarchy is st_nlink times the number of mount points for that file system (which is greater than st_nlink).
(0004776)
stephane   
2020-02-07 13:24   
I'd rather say "in which case the number of hard links to the file throughout the file hierarchy could be greater than st_nlink" as systems allow mounting *parts* of a FS (like in bind-mounts) and part of a FS can be masked by another FS (like FS2 is mounted on a non-empty directory of FS1), in which case, it could even be less than st_nlink.
(0004777)
geoffclare   
2020-02-07 14:27   
Seems I was fixated on the multiple-mount-points case and hadn't thought about other situations affecting the number of links that can be "found". The less-than-st_nlink case could happen even on ancient systems that don't have modern functionality like bind mounts, etc. You just need to mount another file system over the top of a non-empty directory. I'll add a note with a revised proposal.
(0004778)
geoffclare   
2020-02-07 14:41   
On page 392 line 13342 section <sys/stat.h> add a new paragraph:
The st_nlink value shall be the number of hard links to the file within the file system in which the file resides.
Note: The number of links to the file that can be found by traversing the file hierarchy can differ from st_nlink. For example, it can be less than st_nlink if a link to the file cannot be reached because it is below a directory that has been overlaid with a mount point for a different file system, and it can be greater than st_nlink on implementations that allow a file system (or part of one) to be duplicated at additional mount points.
(0004779)
shware_systems   
2020-02-07 18:22   
I agree with the clarification here, just feel what is in the Note is better as an App Usage paragraph of its own, rather than in the normative part.
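
The proposed wording ties st_nlink to the links that exist inside one file system. A minimal sketch of how that count can be observed from the shell, assuming a writable temporary directory on a file system that supports hard links; link_count and the file names are hypothetical names chosen for this illustration:

```shell
# Hypothetical helper: print the link count (the st_nlink value) that ls reports.
link_count() {
    # The second field of "ls -ld" output is the number of links.
    ls -ld -- "$1" | awk '{ print $2 }'
}

# Assumption: TMPDIR (or /tmp) is writable and supports hard links.
demo_dir=${TMPDIR:-/tmp}/nlink_demo.$$
mkdir "$demo_dir" || exit 1

: > "$demo_dir/original"
before=$(link_count "$demo_dir/original")

# A second directory entry in the same file system raises the count.
ln "$demo_dir/original" "$demo_dir/alias"
after=$(link_count "$demo_dir/original")

echo "links before ln: $before, after ln: $after"
rm -rf "$demo_dir"
```

Both names live in the same file system, so the count rises from 1 to 2. The sketch does not attempt to show the mount-related cases from the notes, where a traversal of the file hierarchy can find more or fewer names than st_nlink reports.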




Viewing Issue Advanced Details
1322 [Online Pubs] Shell and Utilities Editorial Error 2020-02-05 05:25 2020-04-30 16:26
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
ls
Bizarre HTML rendering for an -o option on ls page
On
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html
beneath the list of all options, we have:
If an option that enables long format output ( [XSI] [Option Start] -g, [Option End] -l (ell), -n, and [XSI] \*(z!-o\*(z?) is given with an option that disables long format output (-C, -m, and -x), this shall not be considered an error. The last of these options specified shall determine whether long format output is written.
The following text seems wrong, as if something went wrong when the source of the standard was converted to HTML:
[XSI] \*(z!-o\*(z?
Change
[XSI] \*(z!-o\*(z?
to
[XSI] -o
with appropriate opt-start.gif and opt-end.gif markers.
Notes
(0004770)
geoffclare   
2020-02-05 09:28   
(edited on: 2020-02-05 09:32)
I have moved this to the Online Pubs project since there is no problem with the PDF here; it is an HTML translation issue.

The \*(z! and \*(z? around the -o in the troff source should have been converted to the option start and end markers, like they were in the SYNOPSIS. (Alternatively the troff could perhaps be changed to add shading to -o here the same way as done for -g, but there may be a reason it is done differently.)

(0004839)
ajosey   
2020-04-30 16:19   
The html edition is updated
(0004840)
ajosey   
2020-04-30 16:26   
The html download bundles have also been updated




Viewing Issue Advanced Details
1321 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2020-01-29 15:36 2020-05-19 11:05
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
false
2778
91221
---
exit status for false should be 1-125
The exit status of the false utility is simply required to be non-zero.
This means applications have no way of distinguishing between a normal
exit of this utility and the special exit statuses, greater than or
equal to 126, used for "not found", "can't be executed" or "terminated
by a signal" conditions.
Change:
The false utility shall always exit with a value other than zero.
to:
The false utility shall always exit with a value between 1 and 125, inclusive.
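
The ranges in the description above can be sketched as a small classifier. This is only an illustration of how a script might interpret $?; classify_status is a hypothetical name, and treating every status of 128 or more as a possible signal death is an assumption taken from the discussion rather than a requirement of the standard:

```shell
# Hypothetical helper: describe a shell exit status using the ranges
# discussed in this issue.
classify_status() {
    if [ "$1" -eq 0 ]; then
        echo "success"
    elif [ "$1" -le 125 ]; then
        echo "normal failure"
    elif [ "$1" -eq 126 ]; then
        echo "found but could not be executed"
    elif [ "$1" -eq 127 ]; then
        echo "not found"
    else
        # Assumption: 128 and above is treated as a possible signal death.
        echo "possibly terminated by a signal"
    fi
}

# With the proposed change, false would always land in "normal failure".
false || classify_status "$?"

# A child that exits with 127 is indistinguishable from "command not found".
sh -c 'exit 127' || classify_status "$?"
```

A false implementation that exits with 255, which the current wording permits, would be reported by this classifier as a possible signal death.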
Notes
(0004745)
shware_systems   
2020-01-29 16:17   
To go with this, should there be a similar limitation on EXIT_FAILURE, as a CX extension, in <stdlib.h>? Then it could be required false return EXIT_FAILURE, as a symbolic rather than arbitrary value.
(0004746)
geoffclare   
2020-01-29 16:40   
The corresponding EXIT_FAILURE change is in bug 0001229.
(0004747)
shware_systems   
2020-01-29 17:06   
(edited on: 2020-01-29 17:35)
So it is, my bad for forgetting. I'd then prefer the change to be:
  The false utility shall exit with the return value equivalent to EXIT_FAILURE in <stdlib.h>.

(0004748)
kre   
2020-01-29 18:58   
I think this is perhaps not really needed. All that matters for false is
that its exit status not be 0, that's why it is used. Why it isn't 0 is
immaterial, as is which value it uses.

If we happen to get a "command not found" from some shell where false isn't
built in, and the PATH doesn't include it, so what? Whatever script is being
run still runs the same, just perhaps with an extra "command not found"
diagnostic thrown in. If false does exit(127) which looks just the same,
but without the diagnostic, does it matter?

Is there any known script that actually checks the exit status after invoking
false, to determine if there was some exec failure of one kind or another?

If not, perhaps this is simply make work?
(0004749)
stephane   
2020-01-29 19:50   
Re: Note: 0004748

> Is there any known script that actually checks the exit status after invoking
> false, to determine if there was some exec failure of one kind of another?

We sometimes see wrapper scripts like:

#! /bin/sh -

# some setup code
# ...

# run command passed as arguments:
"$@"; ret=$?

# some cleanup code

if [ "$ret" -gt 128 ] # only way to tell the command may have been killed
                      # with the current API
then
  sig=$(kill -l "$ret")

  trap : "$sig"   # in case the shell is zsh

  trap - "$sig"   # restore default disposition, unfortunately doesn't work
                  # if the signal was ignored on start with POSIX
                  # compliant shells. In zsh, can be worked around by setting
                  # a handler first, hence the previous line.

  kill -s "$sig" "$$" # report the same death by signal to caller if possible
fi
exit "$ret" # also fallback if kill failed to kill us above


More generally, one should avoid exiting with codes above 127 except to report some death by signal.
(0004750)
kre   
2020-01-29 20:27   
I agree in general with what is in Note: 0004749 but do remember that the
subject here is false, not commands in general.

When was the last time you wrote code to test if false was killed by a
signal? When was the last time you ever saw false killed by a signal?

We all know that what false actually does is (nominally) exit(1), and in
reality simply sets the internal variable that becomes $? to 1 - but I'm
not sure there is any particularly good reason for the standard to require that.
(0004751)
stephane   
2020-01-29 21:12   
Re: Note: 0004750

> When was the last time you wrote code to test if false was killed by a
> signal? When was the last time you ever saw false killed by a signal?

You'll note in my example, the command was "$@", so any arbitrary command as invoked by the user.

I often use true or false at the end of my scripts or functions for them to report a true or false exit status. I wouldn't want that false invocation to report "command not found" or "killed by SIGINT".
(0004752)
stephane   
2020-01-29 21:17   
And no, I've never come across a "false" implementation that reported anything other than "1" when terminating normally and I wouldn't object to POSIX requiring it. Something like:

Exits with code 1, and >1 when an error occurred (bad option, stack overflow detection, killed by signal...)
(0004753)
shware_systems   
2020-01-29 22:23   
(edited on: 2020-01-29 22:32)
Re: 4748

To me it does matter as a portability consideration. As written a simplistic implementation of false could be:
#include <stdlib.h>
main() { return !EXIT_SUCCESS; }

as meeting the requirements, but when masked to 8 bits is the value 255. Requiring { return EXIT_FAILURE; } homogenizes $? with what the standard already provides to indicate non-true as an exit code.

(0004754)
Don Cragun   
2020-01-29 22:39   
Re Note: 0004753:

Sorry Mark, but no. The ! operator in C is a logical operator returning 0 or 1. It is not a bitwise operator. According to the C Standard:
The result of the logical negation operator ! is 0 if the value of its operand compares unequal to 0, 1 if the value of its operand compares equal to 0. The result has type int. The expression !E is equivalent to (0==E).
(0004755)
shware_systems   
2020-01-29 22:46   
(edited on: 2020-01-29 22:47)
It was off top of head, sorry. Make it ~EXIT_SUCCESS for the bitwise one, as intended, and I believe argument holds.

(0004756)
stephane   
2020-01-30 08:17   
Re: Note: 0004755

That's not what exit statuses are. They're not just booleans. Values have or can have meanings. Again, values above 128 should be avoided as they can be interpreted as death by signal by shell scripts; 255 is a special case for xargs (it causes xargs cmd to abort when one run of cmd exits with that code).

Also, exiting with values above 255 yields undefined/unspecified results in many things. For instance, I'd expect most xargs implementations still use waitpid(), so that exit(-1) would be taken as exit(255) by them, but that would no longer be the case when they switch to waitid() on those systems that preserve the full integer passed to exit() (as POSIX now requires, but not all systems do yet).

expr, test, grep are examples of utilities that return boolean meanings (false/true). They're all required to report "false" as 1 (and 2 or above for errors). It would make a lot of sense if that was the case for the "false" utility as well.
(0004757)
shware_systems   
2020-01-30 08:51   
I agree, but with the current wording ~0 as masked by exit() is a non-zero value that would have to be considered as conforming, even though it could also be a signo 127 too. The point was that some change is warranted, as the de facto intent is "any non-zero unsigned char value not reserved for other purposes by the shell", and EXIT_FAILURE is required to be such a value so may as well be reused.
(0004758)
joerg   
2020-01-30 10:27   
(edited on: 2020-01-30 10:29)
I am not sure whether this is known, but /usr/bin/false from Solaris uses the exit code 255.

BTW: exit() does not mask the exit code. The only place that masks the exit code is wait(), when you decide to use the outdated wait() interface. But then the exit code is masked in wait() and not in exit(), since waitid() returns the full int.

call this:

#include <stdlib.h>

int main(void)
{
   exit(~0);
}

and with "echo $/ $?", you get:

-1 255

if you use a shell that already supports the $/ proposal from Don Cragun, that can be tested on bosh.

exit(256) is a problem, as "echo $/ $?" prints:

256 0

(0004759)
joerg   
2020-01-30 10:37   
(edited on: 2020-01-30 10:46)
Re: Note: 0004749

sleep 10
^C
# echo $/ $?
INT 130

There is a way to distinguish if you use a shell that listens to what we discuss here ;-)


joerg> bla
bla: nicht gefunden
joerg> echo $/ $?
NOTFOUND 127

joerg> ./
./: Ausführen nicht möglich
joerg> echo $/ $?
NOEXEC 126


See Bugs: 0001026 and 0000947

(0004760)
kre   
2020-01-30 11:45   
One last note:

If there was some plan to prescribe that all portable applications must
exit with a status between 0 (indicating success) and 125 (those other 125
values indicating various possible failure modes, as defined for each
application) I would understand the motivation behind the change suggested
here.

I also expect that there would be extreme pushback against any such plan.

Without that, I cannot see how false should be held to a standard that is
not elsewhere required - the arguments for it all relate much more to other
applications than false, which is, in practice, never not found (or not
executable), never interrupted by a signal, and has zero error conditions.

That is the standard false from XCU section 4. (Things like /bin/false
or whatever, might be different, but those are not what this would apply to).

In this way false is different from egrep/test/expr/... in that those have
a need to report more than one "failed" condition (for egrep, pattern not
found, or syntax error in pattern, or failed to open a named file). False
doesn't need any of that.

A non-zero (or if you insist, a non-zero when only the low 8 bits are
considered) exit code is all that matters for it.

But my original point was that this looked like make-work ... there's nothing
inherently wrong (whatever Solaris implementations do) with requiring false
to exit with status 1. There's just no need for it. In the past day or
so we have already expended far more time and effort on this than can
possibly be justified by any result that could be achieved.
(0004761)
stephane   
2020-01-30 15:35   
Re: Note: 0004758
> I am not sure whether this is known, but /usr/bin/false from Solaris uses the exit code 255

Indeed, I hadn't checked standalone false utilities except GNU false (which processes options as allowed by POSIX, to check what happened if you passed an unknown option).

And indeed on Solaris:

$ echo a b | xargs -n1 csh -c 'echo $1:q; false'
a
xargs: Command could not continue processing data


While on all other OSes I've tried (GNU, NetBSD, FreeBSD):

$ echo a b | xargs -n1 csh -c 'echo $1:q; false'
a
b


I guess I should stop using "false" in portable scripts to report a "false" status, or redefine it as:

false() { return 1; }


To avoid that potential issue if we can't guarantee false will return with a non-problematic exit code.
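
The xargs behavior described in Note 0004756 can be reproduced without relying on any particular false implementation. The sketch below is an assumption-laden illustration: items_processed is a hypothetical helper, and the counts in the comments are what implementations that follow the xargs rule for status 255 are expected to produce:

```shell
# Hypothetical helper: count how many input items xargs hands to a command
# that echoes its item and then exits with the status given as $1.
items_processed() {
    printf 'a\nb\n' |
        xargs -n 1 sh -c 'echo "$2"; exit "$1"' sh "$1" 2>/dev/null |
        wc -l | tr -d ' '
}

# An ordinary failure status lets xargs continue with the next item.
echo "exit 1:   $(items_processed 1) item(s) processed"

# Status 255 makes xargs stop after the first invocation.
echo "exit 255: $(items_processed 255) item(s) processed"
```

With a false that exits 255, the csh example above behaves like the second case, which is why the redefinition as a function returning 1 avoids the problem.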




Viewing Issue Advanced Details
1320 [1003.1(2013)/Issue7+TC1] Shell and Utilities Editorial Error 2020-01-26 07:50 2020-01-26 07:50
stephane
 
normal  
New  
Open  
   
Stephane Chazelas
awk utility
---
/\n/ can match newline
There is a very bizarre/confused text in the awk specification:

> Except for the '~' and "!~" operators, and in the gsub,
> match, split, and sub built-in functions, ERE matching
> shall be based on input records; that is, record separator
> characters (the first character of the value of the
> variable RS, <newline> by default) cannot be embedded in
> the expression, and no expression shall match the record
> separator character. If the record separator is not
> <newline>, <newline> characters embedded in the expression
> can be matched. For the '~' and "!~" operators, and in
> those four built-in functions, ERE matching shall be based
> on text strings; that is, any character (including
> <newline> and the record separator) can be embedded in the
> pattern, and an appropriate pattern shall match any
> character.

It kind of implies that:

echo x | awk -F'\n' '{$0 = "a\nb"; print /\n/; print $1}'

should print

0
a
b

or possibly

0
x

because /ERE/ or FS cannot match on the record separator and should match on the input record.

That's not what awk implementations do.


RE matching in those cases is not done on input records but on $0. The fact that $0 (in statements other than BEGIN) is initialised from the value of the current input record (which *at that point* didn't contain the then current value of RS) is irrelevant to describe how RE matching is done. RE matching behaviour is totally independent of the value of RS. RS is only used at the time a record is read.
Replace that whole section with something along the lines of:

If the subject is not specified (like in ~, !~, match()...), regexps are matched against the current value of $0.


Also, whether awk can deal with non-text data (NUL, byte values that don't form valid characters, strings longer than LINE_MAX) should probably be moved to some more generic section not specific to RE matching.
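
A short check of the claim that matching is done against the current value of $0 regardless of RS. This is a sketch of what common awk implementations do, not a statement of what the current text requires; the variable names are hypothetical:

```shell
# $0 is reassigned inside the program, so it contains an embedded <newline>
# even though <newline> is the record separator. /\n/ is then matched
# against that value of $0.
newline_matches=$(echo x | awk '{ $0 = "a\nb"; print /\n/ }')

# Same idea with a non-default record separator: RS is ";" but a ";" placed
# into $0 by assignment can still be matched.
rs_char_matches=$(echo x | awk 'BEGIN { RS = ";" } { $0 = "a;b"; print /;/ }')

echo "newline matched: $newline_matches, RS character matched: $rs_char_matches"
```

On implementations that behave as the report describes, both values are 1, which the quoted paragraph of the standard does not appear to allow.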
There are no notes attached to this issue.




Viewing Issue Advanced Details
1319 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Clarification Requested 2020-01-17 16:42 2020-01-20 10:13
quinq
 
normal  
New  
Open  
   
Quentin Rameau
sed
3219
107979
---
Specify when to print text for a and r commands after d and D commands
According to the current specification, an a or r command should never print its content when a d or D command is executed in the script, as it should normally be printed “when reaching the end of the script”, and the d and D commands always “start the next cycle” without reaching the end of the script.

This does not look like what is actually intended, and doesn't reflect what some other implementations do.

The a and r text should be printed before beginning the next cycle.
Modify text starting at line 107977 from:

The text specified for the a command, and the contents of the file specified for the r command, shall be written to standard output just before the next attempt to fetch a line of input when executing the N or n commands, or when reaching the end of the script.

to:

The text specified for the a command, and the contents of the file specified for the r command, shall be written to standard output just before the next attempt to fetch a line of input when executing the N or n or D or d commands, or when reaching the end of the script.
Notes
(0004740)
geoffclare   
2020-01-20 10:13   
(edited on: 2020-03-19 16:11)
The original POSIX.2-1992 text for the sed "a" command was:
Write text to standard output just before each attempt to fetch a line of input, whether by executing the N command or by beginning a new cycle.


This changed to the current text in the .2b amendment. It appears that in fixing whatever problem .2b fixed, it also broke things by dropping the reference to beginning a new cycle.
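
The case in question can be tried with a two-command script. This is a minimal sketch; the output noted afterwards is what implementations that flush the appended text before starting the next cycle are expected to produce, and is an assumption about the sed in use:

```shell
# Queue some text with "a", then delete the pattern space with "d".
# "d" starts the next cycle without reaching the end of the script.
appended=$(printf 'line\n' | sed 'a\
appended text
d')

echo "output: $appended"
```

A sed that writes the queued text when d begins a new cycle prints "appended text" and nothing else; a literal reading of the current wording would produce no output at all.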





Viewing Issue Advanced Details
1318 [1003.1(2016)/Issue7+TC2] System Interfaces Comment Enhancement Request 2020-01-12 10:50 2020-05-13 15:31
nate_karstens
 
normal  
Applied  
Accepted As Marked  
   
Nate Karstens
Garmin
fcntl, open, socket
Unknown
Unknown
---
Note: 0004797
Define close-on-fork flag
Certain interfaces (like system(), popen(), etc.) are non-atomic in that their implementation first calls a fork() and then an exec(). This creates a race condition in certain scenarios. Please see https://www.mail-archive.com/austin-group-l@opengroup.org/msg05324.html and resulting discussion for a description of one such condition.

Issue 1317 already requests enhancements to these interfaces, but this particular issue would also be solvable if there was a close-on-fork flag (similar to close-on-exec, but the file descriptor is closed in the child process after a fork).
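
The inheritance that a close-on-fork flag would suppress can be observed from the shell. The sketch below only illustrates default descriptor inheritance and a manual per-child close; it does not use FD_CLOFORK (the shell has no way to set it), and the descriptor number, file name, and variable names are assumptions made for the example:

```shell
# Assumption: descriptor 3 is free and TMPDIR (or /tmp) is writable.
leak_file=${TMPDIR:-/tmp}/fd_leak_demo.$$
exec 3>"$leak_file"

# A child started without any precaution inherits descriptor 3.
sh -c 'echo "written by child" >&3'

# Closing the descriptor for one child only (3>&-) makes the same write fail.
if sh -c 'echo "second write" >&3' 3>&- 2>/dev/null; then
    closed_child_wrote=yes
else
    closed_child_wrote=no
fi

exec 3>&-
leaked_text=$(cat "$leak_file")
rm -f "$leak_file"

echo "leaked: $leaked_text; child with closed descriptor wrote: $closed_child_wrote"
```

In a multi-threaded C program the equivalent manual close cannot be done safely between another thread's fork() and exec(), which is the race this issue is about.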
Add the following to fcntl()/F_DUPFD:

The FD_CLOFORK flag associated with the new file descriptor shall be cleared to keep the file open in the child process after a fork.

Add the following to fcntl()/F_SETFD

If the FD_CLOFORK flag in the third argument is 0, the file descriptor shall remain open in the child process after a fork(). Otherwise, the file descriptor shall be closed in the child process after a fork().

Add the following to fcntl():

F_DUPFD_CLOFORK
    Like F_DUPFD, but the FD_CLOFORK flag associated with the new file descriptor shall be set.

Additional changes to the RETURN VALUE and ERRORS sections may be necessary as well.

Add the following to open():

O_CLOFORK
    If set, the FD_CLOFORK flag for the new file descriptor shall be set.

POSIX does not currently specify SOCK_CLOEXEC, but this would be a useful addition. Add the following to socket():

SOCK_CLOEXEC
    If set, the close-on-exec (FD_CLOEXEC) flag for the new file descriptor shall be set.
SOCK_CLOFORK
    If set, the close-on-fork (FD_CLOFORK) flag for the new file descriptor shall be set.

In hindsight, it seems like it would have been preferable to have the default behavior be to close all file descriptors when the process forks, and have flags to override that behavior on an individual basis. Submitter cannot think of a way to do that and maintain backwards-compatibility, short of defining new system calls, but the idea seems like it would be worth considering.
Notes
(0004725)
kre   
2020-01-13 07:28   
The wording:

   If the FD_CLOFORK flag in the third argument is 0, the file descriptor
   shall remain open in the child process after a fork(). Otherwise, the
   file descriptor shall be closed in the child process after a fork().

is bizarre, and (I suspect) is influenced by the odd view of how fork()
and file descriptors should have been handled. The more common way to
write this would be

   If the FD_CLOFORK flag in the third argument is set, the file descriptor
   will be closed in the child process after a successful fork() operation,
   otherwise the file descriptor shall remain open in the child after a fork()

with appropriate xref tags added.

Then wrt:

    Add the following to fcntl():

    F_DUPFD_CLOFORK

please, no, we already have F_DUPFD_CLOEXEC which is bad enough, if
we add a new one like this, we'd also need

    F_DUPFD_CLOEXEC_CLOFORK

so that both flags can be set. F_SETFD is enough for all of this, but
to add a new flag to that is problematic, lots of (older) applications
assume that close-on-exec is its only possibility, as for decades that
has been true (and there was no constant defined, so the result from
F_GETFD is often simply tested against 0, if not 0, close on exec is
assumed, and 1 is used for F_SETFD to set close-on-exec).

If something is needed for atomic operation, add a F_DUPFD variant where
the arg is the flags, rather than the lowest desired fd number, or simply
standardise dup3() which has a flags arg, and could easily be made to have
a "next available bigger than" flag as well as the close-on-exec (and
presumably a new close-on-fork) flag it already accepts (to make dup3()
be able to act as an alternative to fcntl(F_DUPFD) in a reasonable way,
without adding more args to fcntl()).

Lastly (for now):

    In hindsight, it seems like it would have been preferable to have the
    default behavior be to close all file descriptors when the process forks

Nonsense. All of this is brought about by the horrid threading design that's
been thrust upon us (and yes, I know it came about based upon implementations
that were entirely in-process hacks initially, with no kernel support at all).

Without threads there's no reason for any of this. With a sane thread
design (which would largely be something like a sfork() call - fork but
share the text (as more or less always these days anyway) and the data
segments - but producing separate processes. All the synchronisation
mechanisms would still be needed, but none of this fd nonsense, as fd's
would only be shared by design, not by accident.

However we have what we have, so I am not objecting to the solution proposed
in principle (just some details) - also not promising that any of it will
ever get implemented in NetBSD (none of this is there now). (This is not
my area there, so someone else, with community input, would make that call.)
(0004728)
eblake   
2020-01-13 16:38   
Standardization of dup3() and SOCK_CLOEXEC is already the subject of 0000411
(0004797)
geoffclare   
2020-03-16 16:24   
(edited on: 2020-03-26 15:30)
On page 238 line 8018 section <fcntl.h>, change:
The <fcntl.h> header shall define the following symbolic constant used for the fcntl() file descriptor flags, which shall be suitable for use in #if preprocessing directives.

FD_CLOEXEC
Close the file descriptor upon execution of an exec family function.
to:
The <fcntl.h> header shall define the following symbolic constants used for the fcntl() file descriptor flags. The values shall be bitwise-distinct and shall be suitable for use in #if preprocessing directives.

FD_CLOEXEC
Close the file descriptor upon successful execution of an exec family function [SPN]and in the new process image created by posix_spawn() or posix_spawnp()[/SPN].


FD_CLOFORK
Close the file descriptor in any child process created from a process that has the file descriptor open; that is, the child shall not inherit the file descriptor.

On page 238 line 8032 section <fcntl.h>, change:
O_CLOEXEC
The FD_CLOEXEC flag associated with the new descriptor shall be set to close the file descriptor upon execution of an exec family function.
to:
O_CLOEXEC
Atomically set the FD_CLOEXEC flag on the new file descriptor.


O_CLOFORK
Atomically set the FD_CLOFORK flag on the new file descriptor.

On page 387 line 13167 section <sys/socket.h>, after the bug 411 text:
SOCK_CLOEXEC
Create a socket file descriptor with the FD_CLOEXEC flag atomically set on that file descriptor.
add:
SOCK_CLOFORK
Create a socket file descriptor with the FD_CLOFORK flag atomically set on that file descriptor.

On page 388 line 13195 section <sys/socket.h>, after the bug 411 text:
MSG_CMSG_CLOEXEC
Atomically set the FD_CLOEXEC flag on any file descriptors created via SCM_RIGHTS during recvmsg().
add:
MSG_CMSG_CLOFORK
Atomically set the FD_CLOFORK flag on any file descriptors created via SCM_RIGHTS during recvmsg().

On page 497 line 17263 section 2.5.1, change:
A file descriptor is closed by close(), _exit(), or the exec functions when FD_CLOEXEC is set on that file descriptor.
to:
Several functions close file descriptors, including close(), dup2(), _exit(), the exec functions when FD_CLOEXEC is set on a file descriptor, fork() when FD_CLOFORK is set on a file descriptor, and posix_spawn() when either FD_CLOEXEC or FD_CLOFORK is set.

On page 568 line 19882 section accept(), after applying bug 411 change:
If O_NONBLOCK is set on the file description for socket, it is unspecified whether O_NONBLOCK will be set on the file description created by accept().

The accept4() function shall be equivalent to the accept() function, except that the O_NONBLOCK flag shall not be set on the new file description if the flag argument is 0. Additionally, the flag argument can be constructed from a bitwise-inclusive OR of flags from the following list:

SOCK_CLOEXEC
Atomically set the FD_CLOEXEC flag on the new file descriptor.

SOCK_NONBLOCK
Set the O_NONBLOCK file status flag on the new file description.
to:
If O_NONBLOCK is set on the file description for socket, it is implementation-defined whether O_NONBLOCK will be set on the file description created by accept(). FD_CLOEXEC and FD_CLOFORK for the new file descriptor shall be clear, regardless of how they are currently set for socket.

The accept4() function shall be equivalent to the accept() function, except that the state of O_NONBLOCK on the new file description, and FD_CLOEXEC and FD_CLOFORK on the returned file descriptor shall be determined solely by the flag argument, which can be constructed from a bitwise-inclusive OR of flags from the following list:

SOCK_CLOEXEC
Atomically set the FD_CLOEXEC flag on the new file descriptor.

SOCK_CLOFORK
Atomically set the FD_CLOFORK flag on the new file descriptor.

SOCK_NONBLOCK
Set the O_NONBLOCK file status flag on the new file description.

On page 569 line 19914 section accept(), after applying bug 411 change:
The SOCK_CLOEXEC flag of accept4() is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with accept() and then using fcntl() to set the FD_CLOEXEC flag.
to:
The SOCK_CLOEXEC and SOCK_CLOFORK flags of accept4() are necessary to avoid a data race in multi-threaded applications. Without SOCK_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with accept() and then using fcntl() to set the FD_CLOFORK flag. Without SOCK_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

Two designs often used for network servers are multi-threaded servers with a pre-created pool of worker threads, where the thread that accepts the connection request hands over the new file descriptor to a worker thread for servicing, and pre-fork servers with a pre-created pool of worker processes, where the process that accepts the connection request passes the new file descriptor (for example via sendmsg()) to a worker process. In both of these designs, accept4() should be used with the SOCK_CLOFORK flag set. Simpler designs are also sometimes used that do not pre-create a pool. For a multi-threaded server that creates a thread to handle each request, SOCK_CLOFORK should still be used. For a forking server that creates a child to service each request, clearly SOCK_CLOFORK cannot be used if the child is to inherit the file descriptor to be serviced, and therefore this type of server needs to use an alternative method of indicating the end of communications, for example using shutdown(), to ensure the client sees end-of-file, rather than just closing the socket. Such child processes should set FD_CLOFORK on the inherited file descriptor before they attempt to start any additional child processes to avoid leakage into those children.

On page 714 line 24432 section creat(), after applying bug 411 change:
In multi-threaded applications, the creat() function can leak file descriptors into child processes. Applications should instead use open() with the O_CLOEXEC flag to avoid the leak.
to:
In multi-threaded applications, the creat() function can leak file descriptors into child processes. Applications should instead use open() with the O_CLOEXEC and O_CLOFORK flags to avoid the leak.

On page 752 line 25609 section dup(), change:
Upon successful completion, if fildes is not equal to fildes2, the FD_CLOEXEC flag associated with fildes2 shall be cleared. If fildes is equal to fildes2, the FD_CLOEXEC flag associated with fildes2 shall not be changed.
to:
Upon successful completion, if fildes is not equal to fildes2, the FD_CLOEXEC and FD_CLOFORK flags associated with fildes2 shall be cleared. If fildes is equal to fildes2, the FD_CLOEXEC and FD_CLOFORK flags associated with fildes2 shall not be changed.

On page 752 line 25612 section dup(), after applying bug 411 change:
Additionally, the flag parameter can be set to O_CLOEXEC (from <fcntl.h>) to cause FD_CLOEXEC flag to be set on the new file descriptor.
to:
Additionally, the flag argument can be constructed from a bitwise-inclusive OR of flags (defined in <fcntl.h>) from the following list:

O_CLOEXEC
Atomically set the FD_CLOEXEC flag on fildes2.


O_CLOFORK
Atomically set the FD_CLOFORK flag on fildes2.

On page 753 line 25650 section dup(), change:
In order to avoid a race condition of leaking an unintended file descriptor into a child process, an application should consider opening all file descriptors with the FD_CLOEXEC bit set unless the file descriptor is intended to be inherited across exec.
to:
In order to avoid a race condition of leaking an unintended file descriptor into a child process or executed program, an application should consider opening all file descriptors with the FD_CLOFORK or FD_CLOEXEC flag, or both flags, set unless the file descriptor is intended to be inherited by child processes or executed programs, respectively.

On page 753 line 25664 section dup(), after applying bug 411 change:
The dup3() function with the O_CLOEXEC flag is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with dup2() and then using fcntl() to set the FD_CLOEXEC flag. The safe counterpart for avoiding the same race in dup() is the use of the F_DUP_CLOEXEC action of the fcntl() function.
to:
The dup3() function with the O_CLOEXEC and O_CLOFORK flags is necessary to avoid a data race in multi-threaded applications. Without O_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with dup2() and then using fcntl() to set the FD_CLOFORK flag. Without O_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically. The safe counterpart for avoiding the same race with dup() is the use of the F_DUPFD_CLOFORK or F_DUPFD_CLOEXEC action of the fcntl() function.
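
Illustration (not part of the proposed changes): a minimal sketch of the atomic duplication described above. The helper name dup_onto_private() is hypothetical, and O_CLOFORK is only requested on systems whose <fcntl.h> already defines it.

```c
#define _GNU_SOURCE /* assumption: needed on glibc to declare dup3() */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: duplicate oldfd onto newfd with FD_CLOEXEC (and
 * FD_CLOFORK, where the platform defines O_CLOFORK) set atomically.
 * Returns newfd on success, or -1 with errno set by dup3(). */
int dup_onto_private(int oldfd, int newfd)
{
    int flags = O_CLOEXEC;
#ifdef O_CLOFORK
    flags |= O_CLOFORK; /* only on systems that already provide it */
#endif
    return dup3(oldfd, newfd, flags);
}
```

Because the flags are applied by dup3() itself, there is no window in which another thread can fork or exec while the new descriptor is still unprotected.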

On page 784 line 26576 section exec, change:
For those file descriptors that remain open, all attributes of the open file description remain unchanged.
to:
For those file descriptors that remain open, all attributes of the open file description shall remain unchanged and the FD_CLOFORK file descriptor flag, if set, shall remain set.

On page 820 line 27760 section fcntl(), change:
The FD_CLOEXEC flag associated with the new file descriptor shall be cleared to keep the file open across calls to one of the exec functions.
to:
The FD_CLOEXEC and FD_CLOFORK flags associated with the new file descriptor shall be cleared.

On page 820 line 27765 section fcntl(), add:
F_DUPFD_CLOFORK
Like F_DUPFD, but the FD_CLOFORK flag associated with the new file descriptor shall be set.

On page 820 line 27771 section fcntl(), change:
If the FD_CLOEXEC flag in the third argument is 0, the file descriptor shall remain open across the exec functions; otherwise, the file descriptor shall be closed upon successful execution of one of the exec functions.
to:
If the FD_CLOEXEC flag in the third argument is set, the file descriptor shall be closed upon successful execution of an exec family function [SPN]and in the new process image created by posix_spawn() or posix_spawnp()[/SPN]; otherwise, the file descriptor shall remain open. If the FD_CLOFORK flag in the third argument is set, the file descriptor shall not be inherited by any child process created from a process that has the file descriptor open; otherwise, the file descriptor shall be inherited.

On page 823 line 27898 section fcntl(), add to RETURN VALUE:
F_DUPFD_CLOFORK
A new file descriptor.

On page 823 lines 27923 and 27928 section fcntl(), change:
F_DUPFD or F_DUPFD_CLOEXEC
to:
F_DUPFD, F_DUPFD_CLOEXEC, or F_DUPFD_CLOFORK

On page 825 line 28010 section fcntl(), add to APPLICATION USAGE:
In order to set both FD_CLOEXEC and FD_CLOFORK when duplicating a file descriptor, applications should use F_DUPFD_CLOFORK to obtain the new file descriptor with FD_CLOFORK already set, and then use F_SETFD to set the FD_CLOEXEC flag on the new descriptor. (The alternative of first using F_DUPFD_CLOEXEC and then setting FD_CLOFORK with F_SETFD has a timing window where another thread could create a child process which inherits the new descriptor because FD_CLOFORK has not yet been set.)

The FD_CLOFORK flag takes effect for all child processes, not just those created using fork() or _Fork().

On page 897 line 30290 section fork(), change:
The child process shall have its own copy of the parent’s file descriptors.
to:
The child process shall have its own copy of the parent’s file descriptors, except for those whose FD_CLOFORK flag is set (see fcntl()).

On page 1319 line 43930 section mkdtemp(), after the bug 411 text:
O_CLOEXEC Set the FD_CLOEXEC file descriptor flag.
add:
O_CLOFORK Set the FD_CLOFORK file descriptor flag.

On page 1320 line 43980 section mkdtemp(), after applying bug 411 change:
The function mkostemp() with the O_CLOEXEC flag is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a temporary file descriptor with mkstemp() and then using fcntl() to set the FD_CLOEXEC flag.
to:
The O_CLOEXEC and O_CLOFORK flags of mkostemp() are necessary to avoid a data race in multi-threaded applications. Without O_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a temporary file descriptor with mkstemp() and then using fcntl() to set the FD_CLOFORK flag. Without O_CLOEXEC, a temporary file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

On page 1408 line 46762 section open(), add:
The FD_CLOFORK file descriptor flag associated with the new file descriptor shall be cleared unless the O_CLOFORK flag is set in oflag.

On page 1408 line 46780 section open(), add:
O_CLOFORK
If set, the FD_CLOFORK flag for the new file descriptor shall be set.

On page 1408 line 47033 section open(), add:
The O_CLOEXEC and O_CLOFORK flags of open() are necessary to avoid a data race in multi-threaded applications. Without O_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with open() and then using fcntl() to set the FD_CLOFORK flag. Without O_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

On page 1430 line 47470 section pipe(), change:
The O_NONBLOCK and FD_CLOEXEC flags shall be clear on both file descriptors. (The fcntl() function can be used to set both these flags.)
to:
The FD_CLOEXEC and FD_CLOFORK flags shall be clear on both file descriptors. The O_NONBLOCK flag shall be clear on both open file descriptions. (The fcntl() function can be used to set this flag.)

On page 1430 line 47481 section pipe(), after the bug 411 text:
O_CLOEXEC
Atomically set the FD_CLOEXEC flag on both new file descriptors.
add:
O_CLOFORK
Atomically set the FD_CLOFORK flag on both new file descriptors.

On page 1431 line 47530 section pipe(), after applying bug 411 change:
The O_CLOEXEC flag of pipe2() is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with pipe() and then using fcntl() to set the FD_CLOEXEC flag. The O_NONBLOCK flag is for convenience in avoiding additional fcntl() calls.
to:
The O_CLOEXEC and O_CLOFORK flags of pipe2() are necessary to avoid a data race in multi-threaded applications. Without O_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with pipe() and then using fcntl() to set the FD_CLOFORK flag. Without O_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

Since pipes are often used for communication between a parent and child process, O_CLOFORK has to be used with care in order for the pipe to be usable. If the parent will be writing and the child will be reading, O_CLOFORK should be used when creating the pipe, and then fcntl() should be used to clear FD_CLOFORK for the read side of the pipe. This prevents the write side from leaking into other children, ensuring the child will get end-of-file when the parent closes the write side (although the read side can still be leaked). If the parent will be reading and the child will be writing, there is no way to prevent the write side being leaked (short of preventing other threads from creating child processes) in order to ensure the parent gets end-of-file when the child closes the write side, and so the two processes should use an alternative method of indicating the end of communications.

Arranging for FD_CLOEXEC to be set appropriately is more straightforward. The parent should use O_CLOEXEC when creating the pipe and the child should clear FD_CLOEXEC on the side to be passed to the new program before calling an exec family function to execute it.

The O_NONBLOCK flag is for convenience in avoiding additional fcntl() calls.
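
Illustration (not part of the proposed changes): a sketch of the "parent writes, child reads" arrangement described above. The helper name make_parent_to_child_pipe() is hypothetical. The O_CLOFORK and FD_CLOFORK steps are compiled only where those macros exist, so on other systems the sketch reduces to pipe2() with O_CLOEXEC.

```c
#define _GNU_SOURCE /* assumption: needed on glibc to declare pipe2() */
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper. On success returns 0 with fds[0] as the read side
 * (inheritable by fork) and fds[1] as the write side (close-on-fork where
 * supported). Both sides are close-on-exec. Returns -1 on error. */
int make_parent_to_child_pipe(int fds[2])
{
    int flags = O_CLOEXEC;
#ifdef O_CLOFORK
    flags |= O_CLOFORK;
#endif
    if (pipe2(fds, flags) == -1)
        return -1;
#ifdef FD_CLOFORK
    /* Clear FD_CLOFORK on the read side only, so the intended child
     * inherits it while the write side stays private to the parent. */
    int fdflags = fcntl(fds[0], F_GETFD);
    if (fdflags == -1 ||
        fcntl(fds[0], F_SETFD, fdflags & ~FD_CLOFORK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
#endif
    return 0;
}
```

A child that wants to pass the read side to a new program would still need to clear FD_CLOEXEC on fds[0] itself before calling an exec family function.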

On page 1437 line 47733 section popen(), after applying bug 411 change:
The popen() function shall ensure that any streams from previous popen() calls that remain open in the parent process are closed in the new child process, regardless of the FD_CLOEXEC status of the file descriptor underlying those streams.
to:
The popen() function shall ensure that any streams from previous popen() calls that remain open in the parent process are closed in the new child process, regardless of the FD_CLOEXEC or FD_CLOFORK status of the file descriptor underlying those streams.

On page 1437 line 47738 section popen(), after:
... shall be the readable end of the pipe.
add:
The FD_CLOFORK flag shall be cleared on both the STDOUT_FILENO file descriptor passed to the child process and the file descriptor underlying the returned stream.

On page 1437 line 47742 section popen(), after:
... shall be the writable end of the pipe.
add:
The FD_CLOFORK flag shall be cleared on both the STDIN_FILENO file descriptor passed to the child process and the file descriptor underlying the returned stream.

On page 1439 line 47807 section popen(), after the bug 411 text:
... any application worried about the potential file descriptor leak will already be using the e modifier.
add a new paragraph:
Implementations are encouraged to add support for a "wf" mode which creates the pipe as if by calling pipe2() with the O_CLOFORK flag and then clearing FD_CLOFORK for the read side of the pipe. This prevents the write side from leaking into child processes created by other threads, ensuring the child created by popen() will get end-of-file when the parent closes the write side (although the read side can still be leaked). Unfortunately there is no way (short of temporarily preventing other threads from creating child processes, or implementing an atomic create-pipe-and-fork system call) to implement an "rf" mode with the equivalent guarantee that the child created by popen() will be the only writer. Therefore multi-threaded applications that do not have complete control over process creation cannot rely on getting end-of-file on the stream and need to use an alternative method of indicating the end of communications.

On page 1450 line 48133 section posix_openpt(), after the bug 411 text:
O_CLOEXEC
Atomically set the FD_CLOEXEC flag on the file descriptor.
add:
O_CLOFORK
Atomically set the FD_CLOFORK flag on the file descriptor.

On page 1451 line 48179 section posix_openpt(), after applying bug 411 change:
The function posix_openpt() with the O_CLOEXEC flag is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with posix_openpt() and then using fcntl() to set the FD_CLOEXEC flag.
to:
The O_CLOEXEC and O_CLOFORK flags are necessary to avoid a data race in multi-threaded applications. Without O_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread creating a file descriptor with posix_openpt() and then using fcntl() to set the FD_CLOFORK flag. Without O_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

On page 1452 line 48235 section posix_spawn(), change:
If file_actions is a null pointer, then file descriptors open in the calling process shall remain open in the child process, except for those whose close-on-exec flag FD_CLOEXEC is set (see fcntl()).
to:
If file_actions is a null pointer, then file descriptors open in the calling process shall remain open in the child process, except for those whose FD_CLOEXEC or FD_CLOFORK flag is set (see fcntl()), and except for file descriptors that are closed by a fork handler (if fork handlers are called).

On page 1452 line 48240 section posix_spawn(), change:
If file_actions is not NULL, then the file descriptors open in the child process shall be those open in the calling process as modified by the spawn file actions object pointed to by file_actions and the FD_CLOEXEC flag of each remaining open file descriptor after the spawn file actions have been processed. The effective order of processing the spawn file actions shall be:
  1. The set of open file descriptors for the child process shall initially be the same set as is open for the calling process. The child process shall not inherit any file locks, but all remaining attributes of the corresponding open file descriptions (see fcntl()), shall remain unchanged.

  2. The signal mask, signal default actions, and the effective user and group IDs for the child process shall be changed as specified in the attributes object referenced by attrp.

  3. The file actions specified by the spawn file actions object shall be performed in the order in which they were added to the spawn file actions object.

  4. Any file descriptor that has its FD_CLOEXEC flag set (see fcntl()) shall be closed.

to:
If file_actions is not a null pointer, then the file descriptors open in the child process shall be those open in the calling process as modified by FD_CLOFORK file descriptor flags, fork handlers (if they are called), the spawn file actions object pointed to by file_actions, and the FD_CLOEXEC flag of each remaining open file descriptor after the spawn file actions have been processed. The effective order of processing the spawn file actions shall be:
  1. The set of open file descriptors for the child process shall initially be the same set as is open for the calling process, except for those that have the FD_CLOFORK flag set and any that are closed by fork handlers (if they are called).

  2. The child process shall not inherit any file locks, but all remaining attributes of the corresponding file descriptions (see fcntl()) still open, shall remain unchanged.

  3. The signal mask, signal default actions, and the effective user and group IDs for the child process shall be changed as specified in the attributes object referenced by attrp.

  4. The file actions specified by the spawn file actions object shall be performed in the order in which they were added to the spawn file actions object.

  5. Any file descriptor that has its FD_CLOEXEC flag set shall be closed.


On page 1456 line 48397 section posix_spawn(), change:
... expressed as the set of open file descriptors and their FD_CLOEXEC flags at the time of the call and the spawn file actions object specified in the call.
to:
... expressed as the set of open file descriptors and their FD_CLOEXEC and FD_CLOFORK flags at the time of the call, the actions of fork handlers (if they are called), and the spawn file actions object specified in the call.
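
Illustration (not part of the proposed changes): a sketch of how file actions and FD_CLOEXEC interact in posix_spawn(). Both pipe ends are close-on-exec in the parent; the dup2 file action gives the child a copy on standard output with FD_CLOEXEC clear, and the original pipe descriptors are then closed in the child by the final FD_CLOEXEC step. The helper name spawn_capture() and the /bin/sh path are assumptions.

```c
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run "sh -c cmd" and copy up to bufsize-1 bytes of
 * its standard output into buf (always null-terminated).
 * Returns the child's wait status, or -1 on error. */
int spawn_capture(const char *cmd, char *buf, size_t bufsize)
{
    int p[2];
    posix_spawn_file_actions_t fa;
    char *argv[] = { "sh", "-c", (char *)cmd, NULL };
    pid_t pid;
    int err, status;
    size_t used = 0;
    ssize_t n;

    if (bufsize == 0 || pipe(p) == -1)
        return -1;
    /* Single-threaded sketch: setting FD_CLOEXEC after pipe() is not
     * atomic; multi-threaded code would create the pipe with pipe2(). */
    if (fcntl(p[0], F_SETFD, FD_CLOEXEC) == -1 ||
        fcntl(p[1], F_SETFD, FD_CLOEXEC) == -1 ||
        posix_spawn_file_actions_init(&fa) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    /* File action: the copy made on STDOUT_FILENO has FD_CLOEXEC clear. */
    err = posix_spawn_file_actions_adddup2(&fa, p[1], STDOUT_FILENO);
    if (err == 0)
        err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(p[1]); /* parent keeps only the read side */
    if (err != 0) {
        close(p[0]);
        return -1;
    }
    while (used < bufsize - 1 &&
           (n = read(p[0], buf + used, bufsize - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(p[0]);

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```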

On page 1461 line 48592 section posix_spawn_file_actions_addclose(), and
On page 1463 line 48702 section posix_spawn_file_actions_adddup2(), change:
In order to avoid a race condition of leaking an unintended file descriptor into a child process, an application should consider opening all file descriptors with the FD_CLOEXEC bit set unless the file descriptor is intended to be inherited across exec.
to:
In order to avoid a race condition of leaking an unintended file descriptor into a child process or executed program, an application should consider opening all file descriptors with the FD_CLOFORK or FD_CLOEXEC flag, or both flags, set unless the file descriptor is intended to be inherited by child processes or executed programs, respectively.

On page 1546 line 50637 section posix_typed_mem_open(), after the bug 411 text:
The FD_CLOEXEC file descriptor flag associated with the new file descriptor shall be cleared unless oflag includes O_CLOEXEC.
add:
The FD_CLOFORK file descriptor flag associated with the new file descriptor shall be cleared unless oflag includes O_CLOFORK.

On page 1546 line 50647 section posix_typed_mem_open(), after applying bug 411 change:
Additionally, the value of oflag may include the following flag:

O_CLOEXEC Set the FD_CLOEXEC file descriptor flag.
to:
Additionally, the value of oflag may include the following flags:

O_CLOEXEC Set the FD_CLOEXEC file descriptor flag.

O_CLOFORK Set the FD_CLOFORK file descriptor flag.

On page 1547 line 50678 section posix_typed_mem_open(), after applying bug 411 change:
The use of the O_CLOEXEC flag to posix_typed_mem_open() is necessary to avoid leaking typed memory file descriptors to child processes, since fcntl() has unspecified results on typed memory objects and therefore cannot be used to set FD_CLOEXEC after the fact.
to:
The use of the O_CLOEXEC and O_CLOFORK flags to posix_typed_mem_open() is necessary to avoid leaking typed memory file descriptors to child processes, since fcntl() has unspecified results on typed memory objects and therefore cannot be used to set FD_CLOEXEC or FD_CLOFORK after the file descriptor has been opened.

On page 1799 line 58230 section recvmsg(), after the bug 411 text:
MSG_CMSG_CLOEXEC
On sockets that permit a cmsg_type of SCM_RIGHTS in the msg_control ancillary data as a means of copying file descriptors into the process, the file descriptors shall be created with the FD_CLOEXEC flag atomically set.
add:
MSG_CMSG_CLOFORK
On sockets that permit a cmsg_type of SCM_RIGHTS in the msg_control ancillary data as a means of copying file descriptors into the process, the file descriptors shall be created with the FD_CLOFORK flag atomically set.

On page 1801 line 58306 section recvmsg(), after applying bug 411 change:
The use of the MSG_CMSG_CLOEXEC flag to recvmsg() when using SCM_RIGHTS to receive file descriptors via ancillary data is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread calling recvmsg() and using fcntl() to set the FD_CLOEXEC flag.
to:
The use of the MSG_CMSG_CLOEXEC and MSG_CMSG_CLOFORK flags to recvmsg() when using SCM_RIGHTS to receive file descriptors via ancillary data is necessary to avoid a data race in multi-threaded applications. Without MSG_CMSG_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread calling recvmsg() and using fcntl() to set the FD_CLOFORK flag. Without MSG_CMSG_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.
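
Illustration (not part of the proposed changes): a sketch of passing one descriptor over a UNIX-domain socket and receiving it with the flags set atomically. The helper names send_fd() and recv_fd_private() are hypothetical, and MSG_CMSG_CLOFORK is only used where the system headers define it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send fd over sock using SCM_RIGHTS. 0 or -1. */
int send_fd(int sock, int fd)
{
    char data = 'F'; /* at least one byte of ordinary data is sent */
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof fd);
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, created close-on-exec
 * (and close-on-fork where supported). Returns the descriptor or -1. */
int recv_fd_private(int sock)
{
    char data;
    struct iovec iov = { .iov_base = &data, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    int flags = MSG_CMSG_CLOEXEC;
    int fd;

#ifdef MSG_CMSG_CLOFORK
    flags |= MSG_CMSG_CLOFORK; /* only on systems that already provide it */
#endif
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;
    if (recvmsg(sock, &msg, flags) <= 0)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET ||
        c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof fd);
    return fd;
}
```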

On page 2004 line 64479 section socket(), after the bug 411 text:
SOCK_CLOEXEC
Atomically set the FD_CLOEXEC flag on the new file descriptor.
add:
SOCK_CLOFORK
Atomically set the FD_CLOFORK flag on the new file descriptor.

On page 2005 line 64511 section socket(), after applying bug 411 change:
The use of the SOCK_CLOEXEC flag in the type argument of socket() is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread calling socket() and using fcntl() to set the FD_CLOEXEC flag.
to:
The use of the SOCK_CLOEXEC and SOCK_CLOFORK flags in the type argument of socket() is necessary to avoid a data race in multi-threaded applications. Without SOCK_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread calling socket() and using fcntl() to set the FD_CLOFORK flag. Without SOCK_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.
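
Illustration (not part of the proposed changes): a sketch of creating a socket with the flags applied through the type argument. The helper name socket_private() is hypothetical, and SOCK_CLOFORK is only used where <sys/socket.h> defines it.

```c
#include <sys/socket.h>

/* Hypothetical helper: create a socket whose descriptor is close-on-exec
 * and, where SOCK_CLOFORK is defined, close-on-fork from creation.
 * Returns the descriptor or -1. */
int socket_private(int domain, int type, int protocol)
{
    type |= SOCK_CLOEXEC;
#ifdef SOCK_CLOFORK
    type |= SOCK_CLOFORK; /* only on systems that already provide it */
#endif
    return socket(domain, type, protocol);
}
```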

On page 2006 line 64553 section socketpair(), after the bug 411 text:
SOCK_CLOEXEC
Atomically set the FD_CLOEXEC flag on the new file descriptors.
add:
SOCK_CLOFORK
Atomically set the FD_CLOFORK flag on the new file descriptors.

On page 2007 line 64588 section socketpair(), after applying bug 411 change:
The use of the SOCK_CLOEXEC flag in the type argument of socketpair() is necessary to avoid a data race in multi-threaded applications. Without it, a file descriptor is leaked into a child process created by one thread in the window between another thread using socketpair() and using fcntl() to set the FD_CLOEXEC flag. The SOCK_NONBLOCK flag is for convenience in avoiding additional fcntl() calls.
to:
The use of the SOCK_CLOEXEC and SOCK_CLOFORK flags in the type argument of socketpair() is necessary to avoid a data race in multi-threaded applications. Without SOCK_CLOFORK, a file descriptor is leaked into a child process created by one thread in the window between another thread using socketpair() and using fcntl() to set the FD_CLOFORK flag. Without SOCK_CLOEXEC, a file descriptor intentionally inherited by child processes is similarly leaked into an executed program if FD_CLOEXEC is not set atomically.

Since socket pairs are often used for communication between a parent and child process, SOCK_CLOFORK has to be used with care in order for the pair to be usable. If the parent will be writing and the child will be reading, SOCK_CLOFORK should be used when creating the pair, and then fcntl() should be used to clear FD_CLOFORK for the read side of the pair. This prevents the write side from leaking into other children, ensuring the child will get end-of-file when the parent closes the write side (although the read side can still be leaked). If the parent will be reading and the child will be writing, or if the socket pair will be used bidirectionally, there is no way to prevent the write side(s) being leaked (short of preventing other threads from creating child processes) in order to ensure the parent gets end-of-file when the child closes its side, and so the two processes should use an alternative method of indicating the end of communications, for example using shutdown().

Arranging for FD_CLOEXEC to be set appropriately is more straightforward. The parent should use SOCK_CLOEXEC when creating the socket pair and the child should clear FD_CLOEXEC on the side to be passed to the new program before calling an exec family function to execute it.

The SOCK_NONBLOCK flag is for convenience in avoiding additional fcntl() calls.
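
Illustration (not part of the proposed changes): a sketch of why shutdown() works as an alternative end-of-communications signal. A dup() of the writing side stands in for a descriptor that has leaked elsewhere; the reader still sees end-of-file once the writer calls shutdown(). The function name shutdown_eof_demo() is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the reader saw the data followed by
 * end-of-file even though a second descriptor for the writing socket
 * was still open, and -1 otherwise. */
int shutdown_eof_demo(void)
{
    int sv[2];
    int type = SOCK_STREAM | SOCK_CLOEXEC;
    int result = -1;
    char c;

#ifdef SOCK_CLOFORK
    type |= SOCK_CLOFORK; /* only on systems that already provide it */
#endif
    if (socketpair(AF_UNIX, type, 0, sv) == -1)
        return -1;

    /* Stand-in for a copy of the descriptor held by an unrelated process. */
    int leaked = dup(sv[1]);

    if (leaked != -1 &&
        write(sv[1], "x", 1) == 1 &&
        shutdown(sv[1], SHUT_WR) == 0 && /* signals EOF to the peer */
        read(sv[0], &c, 1) == 1 &&       /* the data byte */
        read(sv[0], &c, 1) == 0)         /* EOF despite the leaked copy */
        result = 0;

    if (leaked != -1)
        close(leaked);
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Closing sv[1] alone would not have produced end-of-file here, because the duplicate keeps the writing socket open.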

On page 2108 line 67621 section system(), change:
For example, file descriptors that have the FD_CLOEXEC flag set are closed, and ...
to:
For example, file descriptors that have the FD_CLOEXEC or FD_CLOFORK flag set are closed, and ...

On page 2163 line 69329 section tmpfile(), after applying bug 411 change:
Applications should instead use mkostemp() with the O_CLOEXEC flag, followed by fdopen(), to avoid the leak.
to:
Applications should instead use mkostemp() with the O_CLOEXEC or O_CLOFORK flag, or both, followed by fdopen(), to avoid the leak.


(0004802)
geoffclare   
2020-03-26 15:32   
In the March 26, 2020 teleconference Note: 0004797 was updated with further changes for accept().




Viewing Issue Advanced Details
1317 [1003.1(2016)/Issue7+TC2] System Interfaces Comment Enhancement Request 2020-01-11 12:40 2020-06-17 14:01
nate_karstens
 
normal  
Applied  
Accepted As Marked  
   
Nate Karstens
Garmin
system, popen, posix_spawn, etc.
Unknown
Unknown
Approved
See Note: 0004789
Require fork handlers to be called in certain conditions
Not defining whether fork handlers are called under certain scenarios can lead to undesired behavior and reduces the effectiveness of the pthread_atfork() interface.

Please see https://www.mail-archive.com/austin-group-l@opengroup.org/msg05324.html for a description of the issue and resulting discussion.
In the definition of system(), change this:

It is unspecified whether the handlers registered with pthread_atfork() are called as part of the creation of the child process.

to this:

If the implementation of system() is non-atomic, then handlers registered with pthread_atfork() shall be called as part of the creation of the child process. If the implementation of system() is atomic, then it is unspecified whether the handlers registered with pthread_atfork() are called.

Add similar text to the definition of popen(), posix_spawn(), and any other interfaces that can fork/exec a child process without requiring the operation to be atomic.
Notes
(0004789)
nick   
2020-02-27 17:27   
(edited on: 2020-02-27 17:37)
Interpretation response
------------------------
The standard states that popen() must call atfork handlers, and it is unspecified if system() calls atfork handlers, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Several existing implementations behave in different ways with respect to calling handlers, but this is important information for application developers.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

At page 2107, line 67569 - 67570 section system(), change:
It is unspecified whether the handlers registered with pthread_atfork( ) are called as part of the creation of the child process.
to:
It is implementation-defined whether the handlers registered with pthread_atfork( ) are called as part of the creation of the child process.

At page 1437 line 47731 section popen(), change:
where shell path is an unspecified pathname for the sh utility.
to:
where shell path is an unspecified pathname for the sh utility. It is implementation-defined whether the handlers registered with pthread_atfork( ) are called as part of the creation of the child process.
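
Illustration (not part of this interpretation): a sketch showing handlers registered with pthread_atfork() running around a plain fork(), which is the baseline behavior the text above compares system() and popen() against. The names atfork_demo(), on_prepare() and on_parent() are hypothetical. Replacing fork() with system() in such a test is one way an application could observe what a given implementation does.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int prepare_calls;
static int parent_calls;

static void on_prepare(void) { prepare_calls++; } /* before fork, in parent */
static void on_parent(void) { parent_calls++; }   /* after fork, in parent */

/* Hypothetical demo: returns 0 if the prepare and parent handlers each
 * ran once around a single fork(), 1 if not, -1 on error. Intended to
 * be called once per process, since handlers cannot be unregistered. */
int atfork_demo(void)
{
    int status;
    pid_t pid;

    prepare_calls = parent_calls = 0;
    if (pthread_atfork(on_prepare, on_parent, NULL) != 0)
        return -1;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0); /* child: nothing to do */
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return (prepare_calls == 1 && parent_calls == 1) ? 0 : 1;
}
```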


(0004790)
geoffclare   
2020-02-27 17:38   
Just to circumvent any replies of "that doesn't solve the problem" - those of us on the teleconference are aware of that. The change to system() is not being made to address the reported problem, it is just to make system() consistent with posix_spawn() in requiring implementations to document whether these functions call fork handlers. (The change to popen() is for consistency with the change to system() between Issue 6 and Issue 7.)

The original problem cannot be solved by any change related to whether fork handlers are called, because implementations have extensions which a third-party library could use to fork a process without fork handlers being called. Such a facility is also being standardised in Issue 8 (the _Fork() function).
(0004834)
ajosey   
2020-04-30 15:25   
Interpretation proposed: 30 April 2020
(0004882)
ajosey   
2020-06-02 09:34   
Approved: 2nd June 2020




Viewing Issue Advanced Details
1314 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Comment Clarification Requested 2020-01-05 11:19 2020-03-13 12:11
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
<sys/stat.h>
Rationale
Paragraphs 5 and 6.
---
Note: 0004772
Should file stat uniqueness proposition be moved to normative parts?
The current text in the Rationale section of the <sys/stat.h> header has the following 2 paragraphs that appear to specify normative requirements but are placed in an informative section.

Note that st_dev must be unique within a Local Area Network (LAN) in a ``system'' made up of multiple computers' file systems connected by a LAN.

Networked implementations of a POSIX-conforming system must guarantee that all files visible within the file tree (including parts of the tree that may be remotely mounted from other machines on the network) on each individual processor are uniquely identified by the combination of the st_ino and st_dev fields.

Should they be moved up to normative section of the specification for the header?
Decide whether to move the 2 paragraphs to normative section(s) or not.
Notes
(0004704)
shware_systems   
2020-01-06 16:24   
It's my understanding that these are in Rationale because details of such remote "mounts" are considered out of scope of the standard. It is more guidance to implementors about what may be needed when applications need to pay attention to those fields. Applications based on URLs, or other access varieties over a network, may not care at all about what values are in dev or ino fields.
(0004708)
kre   
2020-01-07 00:12   
Perhaps, in light of Note: 0004704 which seems reasonable, a solution might be
to expand the text in the rationale a little (or a bit more than a little)
to be something like

    This standard requires that any (st_dev, st_ino) tuple visible on a
    system uniquely identifies a particular file (or file like object)
    at the time the reference is produced. Implementations with the
    ability to combine file systems from multiple systems, each with their
    own uncoordinated (st_dev, st_ino) namespaces, such as network accessed
    file systems, must ensure that any such tuple received from some other
    system does not conflict with any such tuple from the local system or
    any other accessible remote system. The mechanism by which this is
    achieved is unspecified.
(0004718)
shware_systems   
2020-01-10 18:11   
While I agree with the thought of Note: 0004708, it does not address the problem that also exists with removable media. Two separate CD-ROMs, for example, may use the same ino values, which are not modifiable, to reference data. If the platform is caching stat data for multiple media, an additional media id value needs to be part of that tuple to uniquely identify a file, as far as I can see.
(0004719)
kre   
2020-01-10 23:43   
Re Note: 0004718

First, we are not going to invent new mechanisms here, even if one were
wanted, so "an additional media id value" is not going to happen.

But you seem to have missed the "at the time the reference is produced"
part of the suggestion - nothing has ever expected dev/ino tuples to be
unique over all time, while the ino numbers on a filesystem tend to be
stable, even those can refer to different files over time (rm file1; > file2)
[nb: this is not expected to have file1 & file2 use the same ino number, but
it might].

But the dev part relates entirely to the way the device is connected to
the system, and can vary depending upon issues like the order in which
devices are attached, or which port they are connected to.

Whenever the filesystem attachments alter, the dev_t can alter as well,
systems which cache inodes need to flush that cache whenever the data
in it would (or might) become invalid - such as when ejecting a CD, or
unplugging a USB stick. As long as that is done correctly, a particular
dev/ino tuple will uniquely identify a file (or not exist at all) at one
instant of time - and if we're lucky (nothing changes in the filesystems)
for some time after.
(0004722)
dannyniu   
2020-01-12 03:07   
(edited on: 2020-01-14 03:20)
Kre is quite right: the uniqueness can only be guaranteed at an instant of time.

Can we put something like the following into the normative parts? I've thought about the wording for some time:

> At any given time in a system, files of distinct identities shall have (st_dev,st_ino) tuples of distinct values; the same file or hard links to the same file shall have the same (st_dev,st_ino) value.

I'm quite satisfied with the part before the semicolon, the 2nd part might need some work I think.

(0004772)
geoffclare   
2020-02-06 17:12   
On page 392 line 13342 section <sys/stat.h>, change:
The st_ino and st_dev fields taken together uniquely identify the file within the system.
to:
A file identity is uniquely determined by the combination of st_dev and st_ino. At any given time in a system, distinct files shall have distinct file identities; hard links to the same file shall have the same file identity. Over time, these file identities can be reused for different files. For example, the st_ino value can be reused after the last link to a file is unlinked and the space occupied by the file has been freed and the st_dev value associated with a file system can be reused if that file system is detached ("unmounted") and another is attached ("mounted").


On page 2197 line 70223 section unlink(), add the following:
When the space occupied by the file has been freed, the file's serial number (st_ino), and therefore the file identity (see [xref to <sys/stat.h>]), shall become available for reuse.
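
The hard-link part of the proposed wording can be observed from the shell. The following is only a sketch: the scratch directory and file names are hypothetical, and it compares st_ino alone (via ls -i), assuming all three files sit on the same file system so that st_dev is equal.

```shell
# Hypothetical scratch directory; names are for illustration only.
tmpdir=${TMPDIR:-/tmp}/ino_demo.$$
mkdir "$tmpdir" || exit 1

: > "$tmpdir/original"                     # create an empty file
ln "$tmpdir/original" "$tmpdir/hardlink"   # second link to the same file
: > "$tmpdir/other"                        # a distinct file in the same directory

# ls -i prints the file serial number (st_ino) before the name.
ino_original=$(ls -i "$tmpdir/original" | awk '{print $1}')
ino_hardlink=$(ls -i "$tmpdir/hardlink" | awk '{print $1}')
ino_other=$(ls -i "$tmpdir/other" | awk '{print $1}')

if [ "$ino_original" = "$ino_hardlink" ]; then same_file=yes; else same_file=no; fi
if [ "$ino_original" != "$ino_other" ]; then distinct=yes; else distinct=no; fi
echo "hard links share st_ino: $same_file"
echo "distinct files differ:   $distinct"

rm -rf "$tmpdir"
```

Nothing here says anything about reuse over time: once "other" is removed and its space freed, a later file may be given the same serial number.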




Viewing Issue Advanced Details
1313 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Error 2020-01-02 13:57 2020-02-26 12:04
dennisw
 
normal  
Applied  
Accepted As Marked  
   
Dennis Wölfing
strftime
2049
65729
---
See Note: 0004765.
Underline tags in strftime Application Usage
The APPLICATION USAGE section of strftime contains <Underline> tags around Y instead of underlining it.
This was perhaps originally a formatting hint to the editor that was applied literally.
On page 2049 line 65729 section strftime, change
(<+/-><Underline>Y</Underline>YYYY-MM-DD)
to
(<+/->YYYYY-MM-DD)
Notes
(0004765)
Don Cragun   
2020-02-03 16:30   
(edited on: 2020-02-03 16:33)
On page 2049 line 65729 section strftime, change:
    (<+/-><Underline>Y</Underline>YYYY-MM-DD)
to:
    (<+/->YYYYY-MM-DD, i.e. with a 5 or more digit year)





Viewing Issue Advanced Details
1312 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-12-20 10:49 2020-02-26 12:00
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
ctags
2625
85386
---
ctags -v example in ctags's rationale section missing a newline
In the rationale section of ctags ( https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ctags.html ) there's an example about ctags -v and vgrind:

ctags -v files | sort -f > index vgrind -x index

It's missing a newline, which is required for the example to run (alternatively, a ; or && could be used).
Simply change it to the following, by adding a newline or BR HTML tag:

ctags -v files | sort -f > index
vgrind -x index
Notes
(0004721)
andras_farkas   
2020-01-12 01:46   
Ping.
:)




Viewing Issue Advanced Details
1311 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-12-20 10:09 2020-02-26 11:56
andras_farkas
 
normal  
Applied  
Accepted  
   
Andras Farkas
ed
2689
87741
---
j command incorrectly referred to in ed's rationale section
In the rationale section on ed's page ( https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html ) we see:
On BSD, the join command with only a single address changes the current line to that address.

In the HTML:
On BSD, the [a href="../utilities/join.html"]join[/a] command with only a single address changes the current line to that address.

This link to the join utility is incorrect. ed's j command is intended.

(unrelatedly, it's odd that [p] is being used within [li] when LI elements already accept text content in HTML)

(Use of '[' instead of '<' in the above is to stop Mantis interpreting as HTML)
The text should be changed as follows:
On BSD, the j command with only a single address changes the current line to that address.

The HTML should be changed as follows:
On BSD, the [b]j[/b] command with only a single address changes the current line to that address.

(Use of '[' instead of '<' in the above is to stop Mantis interpreting as HTML)
Notes
(0004694)
andras_farkas   
2019-12-20 10:11   
The defect tracker swallowed up the HTML (which I didn't intend to have rendered) and I can't figure out how to edit the issue.
(0004695)
shware_systems   
2019-12-20 10:27   
Mantis understands a subset of HTML tags, so when you want them to show as plain text you need to add a SPC or '_' after the '<' opening each tag. Except for the last part what's there now is understandable anyways.
(0004696)
geoffclare   
2019-12-20 10:29   
(edited on: 2019-12-20 10:30)
I have fixed the HTML problem in description and desired action by changing '<' to '['.

(0004697)
andras_farkas   
2019-12-20 10:33   
I see. Thanks! :D
(0004720)
andras_farkas   
2020-01-12 01:46   
Ping.
(0004788)
geoffclare   
2020-02-26 11:56   
Applied to the troff source. The change will come through into the HTML version next time a new PDF is published. (For now, the hyperlink has been removed so it just has the word join to match the current PDF.)




Viewing Issue Advanced Details
1310 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2019-12-19 17:49 2020-03-13 12:10
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
vi
3416
115215
---
CONSEQUENCES OF ERRORS for vi tries to define undefined behaviour
The CONSEQUENCES OF ERRORS section for vi has text beginning:
When any error is encountered and the standard input is not a terminal device file, ...

However, the STDIN section states "If standard input is not a terminal device, the results are undefined."
Change:
When any error is encountered and the standard input is not a terminal device file, vi shall not write the file or return to command or text input mode, and shall terminate with a non-zero exit status.

Otherwise, when ...
to:
When ...
Notes
(0004707)
kre   
2020-01-06 23:57   
An alternative resolution might be to change line 113704 (page 3375)
(this is the line quoted in the Description of this issue from the STDIN
section) so that it says something like:

    If standard input is not a terminal device, and no error is detected,
    the results are undefined. If an error occurs, see below in the
    CONSEQUENCES OF ERRORS for the appropriate result.

Note: do not read this as a strong preference for this version over the
Desired Action; I just have a general preference for fewer undefined cases
rather than more whenever it is possible. Here I am not sure what actual
vi implementations do when input is not a terminal (when some error occurs).

Also, does this really need (when there is no error, and perhaps when there
is) to be undefined, rather than unspecified?




Viewing Issue Advanced Details
1309 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Enhancement Request 2019-12-19 02:26 2020-05-05 15:18
kre
 
normal  
Applied  
Accepted As Marked  
   
Robert Elz
2.9.4
2371-4
75726-31
Approved
Note: 0004763
Clarity needed for initial value of $? at start of compound-list compound statements
Currently, nothing says what the "previous command" is precisely when beginning execution of one of the lists in the body of a compound statement.

This is (fortunately) not controversial; as best I can tell, all shells do
it the same way, so all that is needed is to be explicit in the standard as
to what that way is.

That is, in

    (exit $B); if X=$?; echo X=$X; (exit $X); then Y=$?; echo Y=$Y; (exit $Y);
    else Z=$?; echo Z=$Z; (exit $Z); fi; A=$?; echo A=$A; (exit $A)

what values are assigned to X, Y, and Z (the value for A is already defined)?

The reason this might seem tricky, is that when assigning A, the value of
X is completely ignored. Some might reason from that, that the value of
X should be ignored for the purposes of Y and Z as well.

Similar examples can be designed for all of the compound statements.

[Aside: do not treat this example literally - it is obviously designed to
pass through the exit code without changing it ... for more realistic purposes
we should assume that the "echo $Q; (exit $Q)" part (where Q is X Y or Z)
can be any arbitrary sh statements, producing any arbitrary results.]

My intuition would say (and all the tests I have done confirm) that the
"previous" status for X is always the previous command exit status (from
before the compound statement, or 0 if there was none) and that for Y and Z
it is the result from the condition evaluation (if there was one) or for
the compounds that have no condition ('{' '(' 'for' 'case') the initial
exit status for the body is that from before the compound statement.

The two cases where it is possible to observe this in a meaningful way are
in the else clause list of an if statement, and the body of an until loop.

That is, if we have
    (exit 3) ; if (exit 7); then : ; else echo $?; fi
then what will be printed will be 7, and definitely not 3 or 0.

I believe that this should be made explicit in the standard.
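
A condensed form of the X/Y/Z example above can be run as follows. This is a sketch only: the values 3 and 7 are arbitrary, and the script is run through sh -c so that the result does not depend on the invoking shell's state.

```shell
# The values 3 and 7 are arbitrary illustration values.
result=$(sh -c '
    (exit 3)                     # status in effect before the if command
    if X=$?; (exit 7); then      # X sees 3; the condition list ends with status 7
        Y=$?
    else
        Z=$?                     # Z sees the status of the condition list
    fi
    A=$?                         # A sees the status of the if command itself
    echo "X=$X Y=${Y-unset} Z=${Z-unset} A=$A"
')
echo "$result"
```

With the behaviour described above this prints X=3 Y=unset Z=7 A=0: X takes the status from before the compound command, Z the status of the condition, and A the status of the if command (here that of the else list).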
After line 75731 (the end of the 2.9.4 intro text, just before 2.9.4.1)
add a new paragraph something like:

When commencing execution of a compound-list as part of a compound command,
the "last command executed" for the purposes of determining the value of the
special parameter '?' and the default value for the exit and return special
built-in commands, shall be the status of the last command executed before the
compound command in the case of the first compound-list executed, and the result
of the previous compound-list executed as part of executing the compound
command in the case where more than one compound-list is executed, or where
compound-lists are repeatedly executed. During execution of a compound-list
the exit status is updated to reflect the results of each command executed,
as is defined for each such command.

The exit status after the compound command has completed is as specified
below for each specific compound command.
Notes
(0004731)
geoffclare   
2020-01-16 17:42   
(edited on: 2020-01-16 17:43)
[These are notes I made as a result of discussions in the 2020-01-16 teleconference, but they aren't a formal record, just my personal take on what came out of the discussion]

Regarding:

| That is, if we have
| (exit 3) ; if (exit 7); then : ; else echo $?; fi
| then what will be printed will be 7, and definitely not 3 or 0.

This is clearly already required by the current text in the standard. When "echo $?" is executed, the last command that was executed is the "(exit 7)" subshell and its exit status was 7 so $? must be 7.

As far as I can see there is no ambiguity for until (or while) either. This covers the two cases where kre said it is possible to observe this, and since all shells behave as expected I don't see the need to change the standard.

However, there is one additional aspect that is of interest, and that's what happens on entry to a subshell.

2.12 says "A subshell environment shall be created as a duplicate of the shell environment, except [stuff to do with signal traps]" This means the value of $? must be inherited, and all shells seem to do that. However, as regards a "last command", it is not clear whether the last command in the shell that executed the subshell should be treated as "inherited". Most shells behave as if it is, but ksh does not; it behaves as if there were no last command:
$ for shell in bash dash ksh zsh; do printf "$shell: "; $shell -c 'false; (exit ); echo $?'; done
bash: 1
dash: 1
ksh: 0
zsh: 1


(0004732)
kre   
2020-01-16 20:35   
Wrt the last point of Note: 0004731 for me, ksh93 (Version AJM 93u+ 2012-08-01
anyway) gives '1' (as do bosh, fbsh, yash, mksh, pdksh, and nbsh).

But that suggests another question. In 2.5.2, line 74877:

   ? Expands to the decimal exit status of the most recent pipeline
     (see Section 2.9.2).

[2.9.2 defines a pipeline] doesn't say what $? should be if there has been
no "most recent pipeline", and a not-unreasonable interpretation might be
that $? should be unset in that case (as, for example, $! is when there has
been no "most recent background command"). Or at least most shells leave $!
unset in that case; yash and zsh produce 0 for $! in
    $SHELL -c 'printf "! %s ? %s\\n" ${!-unset} ${?-unset}'

where all shells I tested produce 0 for $?. This isn't stated in the
standard one way or the other either, but I have seen several scripts that
assume the unset value for $! in that case, and 0 for $? (the common values).

The reason I suggested some clarity for the if/while/until commands, is
because generally the exit status of the condition is not available other
than as it determines whether or not (or which in the case of if with an
else) list is executed next, and that could be mis-interpreted to mean
that inside the list, the "last pipeline executed" (for $?) or "last command
executed" for exit or return might mean the one executed prior to beginning
the if/while/until statement, rather than the temporally most recent one.

And with that, why is $? defined to be the last "pipeline" whereas exit &
return use the status of the last "command" (and explicitly say it is 0 if
there has been none)?

I still believe that we ought to do some kind of cleanup of all of this, and
make the language precise.
(0004733)
kre   
2020-01-16 21:36   
I perhaps should also add, that $! is one of those things that (as best I can
tell) all shells treat as part of the "duplicate of the shell environment"
when creating a sub-shell, even though the value of $! in the subshell is
useless, and it would have arguably been better to revert the special param !
to be unset when a sub-shell is started.

That is, in

     $SHELL -c 'sleep 5 & printf %d\\n $!; ( printf "%s\\n" ${!-unset} )'

all the shells I regularly test print the same value from both printf
statements (I use %d the first time, as there we know $! will have an
integer value, %s the second, in case we were to get "unset").
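
A variant of that test which compares the two values instead of printing them might look like this. It is a sketch: it uses sh rather than $SHELL, and reads $! in the subshell through a command substitution so the value can be captured.

```shell
# Compare $! in the parent with $! as seen inside a subshell.
result=$(sh -c '
    sleep 1 &
    parent=$!
    child=$(printf "%s\n" "${!-unset}")   # command substitution runs in a subshell
    wait                                  # reap the background job before finishing
    if [ "$parent" = "$child" ]; then echo inherited; else echo differs; fi
')
echo "$result"
```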
(0004734)
kre   
2020-01-17 04:17   
(edited on: 2020-01-17 04:19)
A variation on the original command (nb: this is all to hopefully get agreement
that we should improve the precision of the relevant wording, not to suggest any
different behaviour than what is commonly expected)

    (exit 3) ; if (exit 7); then : ; fi; echo $?

clearly is intended to produce 0 from the echo, not 3 or 7 (and does
everywhere), but which is the "most recent pipeline"? The "if" command
was started before the (exit 7) - it had to be, or the latter would not
have been run at all - and they both completed simultaneously (when the
(exit 7) finished, the "if" had nothing more to do, as there is no else
clause and the condition is false).

Or another weirder case

    (exit 5) ; $SHELL -c 'echo $?'

when the echo happens, the most recently completed pipeline is
the (exit 5) - so does the standard require '5' as the output?
Obviously not, but where exactly does it say that? And what
words are correct so this one produces 0 as it should, and yet

   $SHELL -c '(exit 5) ; (echo $?)'

still produces 5 - we cannot merely insert "current execution
environment" or even "current shell process" even if the XCU
permitted us to mention processes.
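
The three cases in this note can be run side by side. This is only a sketch using sh in place of $SHELL; the 0 in the second case is what shells are reported to do in practice, not something the standard text quoted in this thread requires.

```shell
# Case 1: the if command completes after its condition, with status 0,
# because no list other than the condition was executed.
r1=$(sh -c '(exit 3); if (exit 7); then :; fi; echo $?')

# Case 2: a newly invoked shell does not see the status from its invoker.
r2=$(sh -c '(exit 5); sh -c "echo \$?"')

# Case 3: a subshell of the same shell does see it.
r3=$(sh -c '(exit 5); (echo $?)')

echo "if without else: $r1"
echo "new shell:       $r2"
echo "subshell:        $r3"
```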

(0004735)
joerg   
2020-01-17 09:56   
Are you sure about ksh93u+? I get 0 even for ksh93v.
(0004736)
kre   
2020-01-17 10:31   
Joerg, you're right, my test setup was borked, somehow in the ksh93 setup
(and just that one) $SHELL is getting altered to /bin/sh - so when I
thought I was testing ksh93 I was actually testing a slightly old
version (well, several years old now) of the NetBSD sh. Sorry
about that (the window in which I ran the command was running ksh93,
so when I checked the version that was correct, just $SHELL was incorrect,
so the wrong shell ran the command).

With SHELL set correctly I get 0 from that test case too.

Now I need to work out what in the startup sequence is altering the $SHELL
that is in the environment (and is the shell started).
(0004737)
geoffclare   
2020-01-17 15:39   
(edited on: 2020-01-17 16:17)
Re: Note: 0004732 the wording difference "most recent pipeline" v. "last command" goes all the way back to POSIX.1992. (I was hoping one of them had changed and I'd be able to point to why it changed.) I agree it would be good to make these consistent.

Incidentally the latest text for $? (with bugs applied) is "... most recent pipeline (see ...) that was not within a command substitution (see ...)". Perhaps the part about command substitution should also apply to exit and return.

The value of $? if no pipeline has been executed is currently unspecified. If an application relies on it being 0, that's a bug in the application. Having said that, I would not object to changing the standard to require it to be 0 on entry to the shell (but not on entry to a subshell).

Re: Note: 0004733 The uselessness of inheriting $! is similar to the situation for $$. They are a consequence of a subshell originally being created simply by forking.

Re: Note: 0004734 It seems absurd to me to claim that the (exit 7) and the "if" complete simultaneously. The "if" command has to use the exit status of the (exit 7) in order to decide whether to execute the ":". Thus it must perform some processing after the (exit 7) has completed.

Regarding "(exit 5) ; $SHELL -c 'echo $?'" I really can't see any reader of the standard having a problem understanding that there is no relationship here between the (exit 5) and the $?. (And see above where I talk about the value of $? if no pipeline has been executed.)

(0004738)
joerg   
2020-01-17 15:53   
(edited on: 2020-01-17 16:04)
I am not aware of any shell that does not start with $? being 0.

The problem in ksh is that it inherits $? in a sub-shell from the creator of the sub-shell, but at the same time replaces an exit without parameter (if it is the first command in that sub-shell) by exit 0, even in case that $? is != 0.

I would call this at least unexpected.

(0004739)
kre   
2020-01-18 02:38   
Re Note: 0004737

What I'd suggest for exit/return definitions vs $? rather than attempting
to unify the language, is to first ensure that $? is always defined (no
unspecified cases, none is needed) and then simply make the default 'n'
for exit/return be $?

With that there is just one definition, and if it ever needs correction,
everything gets corrected together.

For $! in a subshell, I know nothing can be done, but I would not equate it
to $$, having $$ work the way it does is useful. The problem is that there
is no standard way to obtain the pid of the current executing (sub-)shell.
Many shells provide a mechanism for this (which is also useful) but there
is no common way yet, so nothing that could be standardised.

However $! could reasonably be specified to be unset before any async command
has been started ... when yash differs from every other shell (zsh excluded,
as it does not really try all that hard to be compatible) it is generally a
sign that the standard is lacking, as yash (as I understand it) was built
to implement the standard, whereas everyone else has attempted to build shells
compatible with the original (which is what the standard should be defining).
When yash does something different than everything else, the normal cause
will be that the standard forgot to specify something - which appears to be
the case here, and we should simply fix it.

It doesn't matter whether the "if" and "(exit 7)" complete at the same time or
not, what matters is whether "most recent" means most recently started, or
most recently completed, which isn't specified anywhere. Once again, what
the results should be isn't in question - it is simply a matter of specifying
it all correctly, and no, "it is obvious" is not good enough.

Nor is it good enough to "can't really see" - we need to be precise in all
of this, so mis-interpretation is not just unreasonable or absurd, but
impossible.
(0004741)
geoffclare   
2020-01-20 11:57   
Re: Note: 0004739 Good idea to put the detail in the $? description and have exit and return just refer to that.

Regarding the wording to use, I think using "pipeline" is correct. The use of "last command" is ambiguous because in the case of:

command1 | command2

if command1 completes after command2 this could be taken as requiring the exit status of command1 to be used.

I will try and come up with some proposed wording before today's teleconference.
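
The command1 | command2 ambiguity can be made concrete with a pipeline whose first command finishes last. A sketch, with arbitrary status values 3 and 5:

```shell
# command1 sleeps, so it completes after command2.
status=$(sh -c '(sleep 1; exit 3) | (exit 5); echo $?')
echo "pipeline status: $status"
```

The reported status is 5, that of the last command in the pipeline, even though the first command completed later. This is the reading that "pipeline" gives and that "last command" could obscure.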
(0004742)
geoffclare   
2020-01-20 14:54   
(edited on: 2020-01-20 14:55)
Proposed changes:

On page 2350 line 74877 section 2.5.2 Special Parameters, after applying bug 1150 change:
Expands to the decimal exit status of the most recent pipeline (see [xref to 2.9.2]) that was not within a command substitution (see [xref to 2.6.3]).
Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt> but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recent pipeline.
to:
Expands to the decimal exit status of the pipeline (see [xref to 2.9.2]) that most recently completed execution and was not executed in a subshell environment. The value shall be set to 0 during initialization of the shell. When a subshell environment is created, it is unspecified whether the value of the special parameter '?' from the invoking shell environment is preserved in the subshell or the value is reset to 0.
Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt>, which is executed in a subshell environment, but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recently completed pipeline.

On page 2399 line 76788 section 2.14 exit, change:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. Otherwise, the value shall be the exit value of the last command executed, or zero if no command was executed. When exit is executed in a trap action, the last command is considered to be the command that executed immediately preceding the trap action.
to:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. If n is not specified, the result shall be as if n were specified with the current value of the special parameter '?' (see [xref to 2.5.2]), except that when exit is executed in a trap action, the value for the special parameter '?' that is considered ``current'' shall be the value it had immediately preceding the trap action.

On page 2407 line 77039 section 2.14 return, change:
The value of the special parameter '?' shall be set to n, an unsigned decimal integer, or to the exit status of the last command executed if n is not specified. If n is not an unsigned decimal integer, or is greater than 255, the results are unspecified. When return is executed in a trap action, the last command is considered to be the command that executed immediately preceding the trap action.
to:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. If n is not specified, the result shall be as if n were specified with the current value of the special parameter '?' (see [xref to 2.5.2]), except that when return is executed in a trap action, the value for the special parameter '?' that is considered ``current'' shall be the value it had immediately preceding the trap action.
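
The behaviour these proposed changes describe can be checked with a few one-liners. This is a sketch with arbitrary status values; the function name f is hypothetical.

```shell
# A simple command with no command name takes the status of its last
# command substitution, and that becomes $?.
r_assign=$(sh -c 'var=$(exit 4); echo $?')

# exit with no operand behaves as if n were the current value of $?.
if sh -c '(exit 3); exit'; then r_exit=0; else r_exit=$?; fi

# return with no operand does likewise inside a function.
r_return=$(sh -c 'f() { (exit 6); return; }; f; echo $?')

echo "assignment: $r_assign  exit: $r_exit  return: $r_return"
```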


(0004743)
kre   
2020-01-20 18:37   
This (Note: 0004742) is mostly all good (the two minor issues I have with it are
just below) but note that this is really a side issue that developed from
this bug report, I still believe that it would be good to add (and certainly
harmless) some words to the effect of

    When the list begins execution, the value of the special parameter ?
    shall be that produced by evaluation of the condition-list that
    immediately preceded this execution of the list

applied to if/while/until statements (or somewhere common and stated to apply
to all such commands) - that is, to clarify that even though the exit status
($?) from the condition list is not available after one of these compound
commands completes (regardless of being the last command executed as part of
the compound command) its exit status is available inside the code that is the
body of the compound command.



My reservations with the wording in the note relate first to this part

    Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit
    status of <tt>some_command</tt>,

which is all fine, until

    its exit status becomes the exit status of the assignment command
    <tt>var=$(some_command)</tt>

There is no such thing as an "assignment command". What there is is
variable assignments, and null commands. What 2.9.1 says is ...

    If there is no command name, but the command contained a command
    substitution, the command shall complete with the exit status of the
    last command substitution performed. Otherwise, the command shall
    complete with a zero exit status.

which does not invent an "assignment command" - but even if it did would
not cover cases like

    umask 0; >/tmp/foo; # this is just to create a known environment
    </tmp/foo$(exit 1)

which in all shells except bash and zsh (all I have tested anyway) results
in $? being set to 1 - since on that 2nd line, there is no command name, but
there is a command substitution, so the status of the last of those (here there
is just one) becomes the status of this empty command (as 2.9.1 directs).

2.9.1 also says:

    If there is no command name, any redirections shall be performed in a
    subshell environment; it is unspecified whether this subshell environment
    is the same one as that used for a command substitution within the command.

which, since it says "unspecified" might seem to be relevant, but I don't
believe it is, whether the redirect happens in the same subshell as the
command substitution isn't really relevant, there is no command name
(except exit, which is the command substitution command) and so the exit
status here should be the status of the (last) command substitution performed.

Anyway, inventing a fictional assignment command doesn't help, and I don't
think is needed - I don't believe any reference to
     x=$(whatever)
is needed in this context at all. What is the exit status of various commands
(including simple commands without command names, which is what this is) is
specified elsewhere, and doesn't need to also be specified in the definition
of the ? special parameter.



Second reservation: there is no need for

    When a subshell environment is created, it is unspecified whether the
    value of the special parameter '?' from the invoking shell environment
    is preserved in the subshell or the value is reset to 0.

It is preserved, and if we need say anything at all (I have no issues with
saying something here, more clarity is helpful) it should be

    When a subshell environment is created, the value of the special parameter
    ? shall initially be the value it had in the shell environment immediately
    preceding the creation of the subshell environment.

(or words something like that which better meet the standard language).
Perhaps just:

    Creating a subshell environment does not alter the value of the
    special parameter ?

Joerg already cleared up the apparent issue with ksh and the strange 0
status from the test - it is not $? that is incorrect there, but "exit"
and that is an obvious bug, and not worthy of any mention in the standard
at all.

Note that all shells print 1 for

    (exit 1); (echo $?)

including ksh93 (and I presume ksh88). That makes it clear that it is
not the subshell environment that is causing the 0 from

    (exit 1); (exit) ; echo $?

in ksh, but a bug in the exit command (my guess would be that it is
attempting to do what the standard seems to say, looking for the status of
the "last command" and failing to find one in its current environment,
and consequently defaulting to 0). That's bogus, but practically harmless,
as no-one in real code writes a subshell in which the first command is
exit (without a specific exit value) as all that would be is an expensive
no-op.
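
The two tests can be run together as below (a sketch using sh; substitute the shell under test). Only the first has an agreed result in this thread. The second is printed without a stated expectation, since the notes above report that ksh93 differs from the other shells.

```shell
# $? as seen on entry to a subshell.
inherited=$(sh -c '(exit 1); (echo $?)')
echo "subshell sees \$? = $inherited"

# exit with no operand as the first command of a subshell.
# The result is shell-dependent according to the notes above.
sh -c '(exit 1); (exit); echo "status after (exit): $?"'
```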
(0004744)
geoffclare   
2020-01-23 14:55   
(edited on: 2020-01-23 15:01)
Update to proposed changes, following email discussion of Note: 0004743 ...

On page 2350 line 74877 section 2.5.2 Special Parameters, after applying bug 1150 change:
Expands to the decimal exit status of the most recent pipeline (see [xref to 2.9.2]) that was not within a command substitution (see [xref to 2.6.3]).
Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt> but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recent pipeline.
to (OPTION 1):
Expands to the decimal exit status of the pipeline (see [xref to 2.9.2]) that most recently completed execution and was not executed in a subshell environment. The value shall be set to 0 during initialization of the shell. When a subshell environment is created, it is unspecified whether the value of the special parameter '?' from the invoking shell environment is preserved in the subshell or the value is reset to 0.
Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt>, which is executed in a subshell environment, but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recently completed pipeline. Likewise for any pipeline consisting entirely of a simple command that has no command word, but contains one or more command substitutions. (See [xref to 2.9.1].)
or to (OPTION 2):
Expands to the decimal exit status of the pipeline (see [xref to 2.9.2]) that most recently completed execution and was not executed in a subshell environment. The value shall be set to 0 during initialization of the shell. When a subshell environment is created, the value of the special parameter '?' from the invoking shell environment shall be preserved in the subshell.
Note: In <tt>var=$(some_command); echo $?</tt> the output is the exit status of <tt>some_command</tt>, which is executed in a subshell environment, but this is because its exit status becomes the exit status of the assignment command <tt>var=$(some_command)</tt> (see [xref to 2.9.1]) and this assignment command is the most recently completed pipeline. Likewise for any pipeline consisting entirely of a simple command that has no command word, but contains one or more command substitutions. (See [xref to 2.9.1].)

On page 2371 line 75731 section 2.9.4 Compound Commands, add a new paragraph:
In the descriptions below, the exit status of some compound commands is stated in terms of the exit status of a compound-list. The exit status of a compound-list shall be the value that the special parameter '?' (see [xref to 2.5.2]) would have immediately after execution of the compound-list.

On page 2372 line 75766 section 2.9.4.2 The for Loop, change:
The exit status of a for command shall be the exit status of the last command that executes.
to:
If there is at least one item in the list of items, the exit status of a for command shall be the exit status of the last compound-list executed.

On page 2373 line 75793 section 2.9.4.3 Case Conditional Construct, change:
... the exit status shall be the exit status of the last command executed in the compound-list.
to:
... the exit status shall be the exit status of the executed compound-list.
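
The for and case wording above can be illustrated as follows. This is a sketch with arbitrary status values; the last line relies on the existing rule that a case command with no matching pattern has exit status zero.

```shell
# for: the status is that of the last compound-list executed (item 2 here).
for_status=$(sh -c 'for i in 1 2; do (exit "$i"); done; echo $?')

# case: the status is that of the compound-list that was executed.
case_status=$(sh -c 'case x in x) true; (exit 4) ;; esac; echo $?')

# case with no matching pattern: the earlier status 9 is not passed through.
nomatch_status=$(sh -c '(exit 9); case x in y) : ;; esac; echo $?')

echo "for: $for_status  case: $case_status  no match: $nomatch_status"
```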

On page 2373 line 75814 section 2.9.4.4 The if Conditional Construct, add:
Note: Although the exit status of the if or elif compound-list is ignored when determining the exit status of the if command, it is available through the special parameter '?' (see [xref to 2.5.2]) during execution of the next then, elif, or else compound-list (if any is executed) in the normal way.

On page 2374 line 75827 section 2.9.4.5 The while Loop, add:
Note: Since the exit status of compound-list-1 is ignored when determining the exit status of the while command, it is not possible to obtain the status of the command that caused the loop to exit, other than via the special parameter '?' (see [xref to 2.5.2]) during execution of compound-list-1, for example: <tt>while some_command; st=$?; false; do ...</tt>. The exit status of compound-list-1 is available through the special parameter '?' during execution of compound-list-2, but is known to be zero at that point anyway.
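As an illustration outside the proposed wording, the technique in the note can be sketched as a complete loop. Here some_command is a hypothetical function that succeeds twice and then fails with status 4, and compound-list-1 ends with a test of the saved status rather than with false so that the body runs while the command succeeds.

```shell
n=0
# Hypothetical command: succeeds on the first two calls, then returns 4.
some_command() { n=$((n + 1)); [ "$n" -le 2 ] || return 4; }

iterations=0
# compound-list-1 saves the status of some_command in st before testing it.
while some_command; st=$?; [ "$st" -eq 0 ]; do
    iterations=$((iterations + 1))
done
loop_status=$?   # status of the last compound-list-2 executed

echo "st=$st loop_status=$loop_status iterations=$iterations"
```

After the loop, st still holds the status that ended it, while the status of the while command itself reflects only the loop body.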

On page 2374 line 75840 section 2.9.4.6 The until Loop, add:
Note: Although the exit status of compound-list-1 is ignored when determining the exit status of the until command, it is available through the special parameter '?' (see [xref to 2.5.2]) during execution of compound-list-2 in the normal way.

On page 2399 line 76788 section 2.14 exit, change:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. Otherwise, the value shall be the exit value of the last command executed, or zero if no command was executed. When exit is executed in a trap action, the last command is considered to be the command that executed immediately preceding the trap action.
to:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. If n is not specified, the result shall be as if n were specified with the current value of the special parameter '?' (see [xref to 2.5.2]), except that when exit is executed in a trap action, the value for the special parameter '?' that is considered ``current'' shall be the value it had immediately preceding the trap action.

On page 2407 line 77039 section 2.14 return, change:
The value of the special parameter '?' shall be set to n, an unsigned decimal integer, or to the exit status of the last command executed if n is not specified. If n is not an unsigned decimal integer, or is greater than 255, the results are unspecified. When return is executed in a trap action, the last command is considered to be the command that executed immediately preceding the trap action.
to:
The exit status shall be n, if specified, except that the behavior is unspecified if n is not an unsigned decimal integer or is greater than 255. If n is not specified, the result shall be as if n were specified with the current value of the special parameter '?' (see [xref to 2.5.2]), except that when return is executed in a trap action, the value for the special parameter '?' that is considered ``current'' shall be the value it had immediately preceding the trap action.
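As an illustration outside the proposed wording, the "as if n were specified with the current value of the special parameter '?'" rule for exit and return can be sketched as follows. The helper fail4 is hypothetical, and the trap action case is deliberately not covered by this sketch.

```shell
# Failing commands are intentional here, so make sure errexit is off.
set +e

# Hypothetical helper that fails with status 4.
fail4() { return 4; }

# return with no operand behaves as if "return $?" had been written.
f() { fail4; return; }
f
return_status=$?

# exit with no operand behaves as if "exit $?" had been written; the
# subshell keeps the example from terminating the calling shell.
( fail4; exit )
exit_status=$?

echo "return_status=$return_status exit_status=$exit_status"
```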


(0004763)
geoffclare   
2020-01-30 17:20   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
Wording such as "last command" is imprecise.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Implement Note: 0004744 choosing option 2 in the first part.
(0004766)
ajosey   
2020-02-03 21:13   
Interpretation Proposed: 3 February 2020
(0004801)
ajosey   
2020-03-23 15:25   
Interpretation approved: 23 March 2020




Viewing Issue Advanced Details
1308 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-12-18 23:28 2020-02-26 11:46
Antonio Diaz
 
normal  
Applied  
Accepted  
   
Antonio Diaz Diaz
GNU Project
ed
2691
87839
---
See Note: 0004730
Error in table of addresses for address 7,+
In the table of addresses at the end of the ed page we can see (among other rows):

Address Addr1 Addr2
7,+ 7 8
7;+ 7 8

But the first row above should be equivalent to '7,.+', not to '7,8'.
See http://lists.gnu.org/archive/html/bug-ed/2018-10/msg00005.html for an explanation and some tests.
Change the Addr2 column for address '7,+' to '.+' like this:

Address Addr1 Addr2
7,+ 7 .+
Notes
(0004730)
nick   
2020-01-16 16:16   
Reformatting description and desired action to aid in reading:

Description:

In the table of addresses at the end of the ed page we can see (among other rows):
Address  Addr1  Addr2
7,+       7      8
7;+       7      8

But the first row above should be equivalent to '7,.+', not to '7,8'.
See http://lists.gnu.org/archive/html/bug-ed/2018-10/msg00005.html for an explanation and some tests.

Desired Action:

Change the Addr2 column for address '7,+' to '.+' like this:
Address  Addr1  Addr2
7,+       7      .+






Viewing Issue Advanced Details
1307 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Comment Clarification Requested 2019-12-18 15:35 2020-05-05 15:08
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
7.3.5.1 LC_TIME Locale Definition
160
5085
Approved
See Note: 0004762.
am_pm value in locales that do not distinguish between am and pm (again)
The Notes to the Editor in the Interpretation response in bug 0000081 say:
Consider as a revision for a future edition, requiring that
am_pm be empty if t_fmt_ampm is an empty string

This would also need to be noted in the APPLICATION USAGE of related
utilities/functions.

This is an instruction to the working group to consider making a change, not an instruction to the editor to make a change. I am submitting this bug so that we can duly consider precisely what, if any, changes to make.
Change:
The operand shall consist of two strings, separated by a <semicolon>, each surrounded by double-quotes. The first string shall represent the ante-meridiem designation, the last string the post-meridiem designation.
to:
If the t_fmt_ampm string is not empty, the am_pm operand shall consist of two strings, separated by a <semicolon>, each surrounded by double-quotes; the first string shall represent the ante-meridiem designation, the last string the post-meridiem designation. If the t_fmt_ampm string is empty, the am_pm operand shall be an empty string.

If this change is agreed, then work out what other changes are needed.
Notes
(0004688)
geoffclare   
2019-12-18 15:38   
Since bug 0000081 does not contain details of a change to be applied in Issue 8, I have removed the issue8 tag from it. It still stands as a formal Interpretation response to the issue that was raised in that bug.
(0004689)
shware_systems   
2019-12-18 18:54   
(edited on: 2019-12-18 19:08)
I'd be more in favor of requiring, at a minimum, "";"" be the default value for am_pm, so whatever processing using that sees zero length strings as the associated values, i.e. the search for a ';' will always succeed. This reduces code size as no tests for ';' not found are mandated.

A case can be made too that since AM and PM are abbreviations of Latin, this is invariant across all locales also for a Gregorian calendar, with the am_pm value used only to specify if lower, mixed, or upper case versions of those are preferred; there should never be zero length values for either. Even where a country prefers to use a national language translation for "before noon" or "after noon", an appropriate abbreviation could be represented, e.g. "VM";"AM" representing "vor mittag" and "ab mittag" in a German locale.

As such %r is valid for all locales too, never unsupported, and a default string of "%I:%M%p" for t_fmt_ampm is appropriate. It is a more a defining specification of locale data does not say an alternate format is required. If a locale definition says always use 24 hour values, or always provide seconds as in the POSIX locale, it's then on that locale data to specify a string like "%H%M" or "%I:%M:%S %p" (as strftime() does), not leave it empty.

Even though adding the string to values of a 24 hour usage is nominally superfluous, it still accents the value is a morning or evening time so isn't entirely redundant. I wouldn't preclude values like "%H:%M %p" therefore.

(0004762)
Don Cragun   
2020-01-30 16:57   
Interpretation response
------------------------
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
The standard states that an empty t_fmt_ampm string indicates that the 12-hour format is not supported in the locale, but it is unclear how other related parts of the standard are affected in this case.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 160 line 5085 section 7.3.5.1 LC_TIME Locale Definition, change:
The operand shall consist of two strings, separated by a <semicolon>, each surrounded by double-quotes. The first string shall represent the ante-meridiem designation, the last string the post-meridiem designation.
to:
The operand shall consist of two strings, separated by a <semicolon>, each surrounded by double-quotes; the first string shall represent the ante-meridiem designation, the last string the post-meridiem designation. If and only if the 12-hour format is not supported in the locale, both strings shall be empty.


On page 160 line 5092 section 7.3.5.1 LC_TIME Locale Definition, change
If the string is empty, the 12-hour format is not supported in the locale.
to:
If and only if the 12-hour format is not supported in the locale, the string shall be empty.


On page 162 line 5150 section 7.3.5.2 LC_TIME C-Language Access, change:
    
AM_STR The appropriate ante-meridiem affix.
PM_STR The appropriate post-meridiem affix.
T_FMT_AMPM The appropriate time representation in the 12-hour clock format with AM_STR and PM_STR.
to:
AM_STR The appropriate ante-meridiem affix; if AM_STR and PM_STR are both empty strings, the 12-hour format is not supported in the locale.
PM_STR The appropriate post-meridiem affix; if AM_STR and PM_STR are both empty strings, the 12-hour format is not supported in the locale.
T_FMT_AMPM The appropriate time representation in the 12-hour clock format; if the 12-hour format is not supported in the locale, this shall be either an empty string or a string specifying a 24-hour clock format.
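As an illustration outside the proposed wording, a script could apply the "both strings empty" rule to the value reported by the locale utility. This is a minimal sketch that assumes locale am_pm prints the two designations separated by a semicolon (for example AM;PM); the function name twelve_hour_supported is hypothetical.

```shell
# Hypothetical helper: takes an am_pm value in the assumed "AM;PM" form
# and succeeds only if at least one designation is non-empty.
twelve_hour_supported() {
    case $1 in
        '' | ';') return 1 ;;
        *)        return 0 ;;
    esac
}

# Usage sketch, guarded in case the locale utility is unavailable.
if command -v locale >/dev/null 2>&1; then
    if twelve_hour_supported "$(locale am_pm)"; then
        echo "12-hour format supported"
    else
        echo "12-hour format not supported"
    fi
fi
```

Keeping the decision in a function that takes the string as an argument makes it usable with values obtained by other means as well.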


On page 266 line 8855 section <langinfo.h>, change:
a.m. or p.m. time format string.
to:
Time format string using 12-hour clock format, if supported in the locale; if the 12-hour format is not supported, this shall be either an empty string or a string specifying a 24-hour clock format.


On page 266 line 8856 section <langinfo.h>, change:
Ante-meridiem affix.
to:
Ante-meridiem affix; if AM_STR and PM_STR are both empty strings, the 12-hour format is not supported in the locale.


On page 266 line 8857 section <langinfo.h>, change:
Post-meridiem affix.
to:
Post-meridiem affix; if AM_STR and PM_STR are both empty strings, the 12-hour format is not supported in the locale.


On page 1020 lines 34774-34775 section getdate(), change:
The locale’s appropriate representation of time in AM and PM notation.
to:
The locale’s appropriate representation of time in 12-hour clock notation, if the 12-hour format is supported in the locale (see [xref to XBD 7.3.5.1]).


On page 2046 lines 65589-65590 section strftime(), change:
Replaced by the time in a.m. and p.m. notation; in the POSIX locale ...
to:
Replaced by the time in 12-hour clock notation; [CX]if the 12-hour format is not supported in the locale, this shall be either an empty string or the time in a 24-hour clock notation[/CX]. In the POSIX locale ...


On page 2064 line 66161 section strptime(), change:
12-hour clock time using the AM/PM notation if t_fmt_ampm is not an empty string in the LC_TIME portion of the current locale ...
to:
12-hour clock time, if the 12-hour format is supported in the locale (see [xref to XBD 7.3.5.1]) ...


On page 2474 line 79408 section at, change:
An AM/PM indication (one of the values from the am_pm keywords in the LC_TIME locale category) can follow the time
to:
If the LC_TIME category of the locale supports 12-hour time format (see [xref to XBD 7.3.5.1]), an AM/PM indication in the form of one of the values from the am_pm keywords in the LC_TIME locale category can follow the time


The following change is not needed if this bug is applied after bug 466 (which replaces the conversion specifiers with a reference to strftime()).
On page 2635 lines 85710-85711 section date, change:
12-hour clock time [01,12] using the AM/PM notation; in the POSIX locale ...
to:
12-hour clock time notation; if the 12-hour format is not supported in the locale, this shall be either an empty string or the time in a 24-hour clock notation. In the POSIX locale ...
(0004767)
ajosey   
2020-02-03 21:14   
Interpretation Proposed: 3 February 2020
(0004800)
ajosey   
2020-03-23 15:24   
Interpretation Approved: 23 March 2020




Viewing Issue Advanced Details
1306 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Enhancement Request 2019-12-16 22:38 2020-02-26 11:45
steffen
 
normal  
Applied  
Accepted  
   
Steffen Nurpmeso
Vol. 3: Shell and Utilities, mailx
2951
97809
Approved
See Note: 0004716.
Documented folder= behaviour contradicts implementations (of folders command)
POSIX documents

  If folder is unset or set to null, user-specified filenames beginning with '+' shall refer to files in the current directory that begin with the literal '+' character.

This does not reflect behaviour of BSD Mail (since introduction in 1982-03-15), which simply calls getfold() of same commit, and here NOSTR is NULL indeed:

<code>
+ if (name[0] == '+' && getfold(cmdbuf) >= 0) {
+ sprintf(xname, "%s/%s", cmdbuf, name + 1);
+ return(expand(savestr(xname)));
+ }

+getfold(name)
+ char *name;
+{
+ char *folder;
+
+ if ((folder = value("folder")) == NOSTR)
+ return(-1);
+ if (*folder == '/')
+ strcpy(name, folder);
+ else
+ sprintf(name, "%s/%s", homedir, folder);
+ return(0);
+}
</code>

Unix V10 (V8 has none of this) derives from the above but, perhaps under time pressure, did it incorrectly:

<code>
        if (name[0] == '+') {
                cp = expand(++name);
                if (*cp != '/' && getfold(cmdbuf) >= 0) {
                        sprintf(xname, "%s/%s", cmdbuf, cp);
                        cp = savestr(xname);
                }
                if (debug) fprintf(stderr, "%s\n", cp);
                return cp;
</code>

Since the folders command simply provides a listing of the directory represented by folder= a.k.a. getfold()>=0 (done like that by all), I see no grounds for POSIX's "or set to null".
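To make the quoted BSD logic easier to follow, here is a minimal shell sketch of the same decision. It is an approximation for illustration only: the shell variable folder stands in for the mailx folder variable, HOME stands in for homedir, and the function name expand_plus_name is hypothetical.

```shell
# Hypothetical stand-in for the '+' handling around getfold().
expand_plus_name() {
    case $1 in
        +*) ;;                                # candidate folder reference
        *)  printf '%s\n' "$1"; return 0 ;;   # ordinary filename
    esac
    # Unset folder: getfold() fails, so the name is left as it is.
    if [ -z "${folder+set}" ]; then
        printf '%s\n' "$1"
        return 0
    fi
    # Absolute folder values are used directly; others are taken
    # relative to the home directory, as in getfold().
    case $folder in
        /*) dir=$folder ;;
        *)  dir=$HOME/$folder ;;
    esac
    printf '%s/%s\n' "$dir" "${1#+}"
}
```

With folder set to the empty string, this sketch (like the quoted getfold()) resolves the name under the home directory rather than treating it as a literal '+' file in the current directory, which is the mismatch with the POSIX text described above.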
On line 97814, change

  nofolder. If folder is unset or set to null, user-specified filenames

to

  nofolder. If folder is unset, user-specified filenames
Notes
(0004716)
Don Cragun   
2020-01-09 17:32   
Interpretation response
------------------------
The standard states that a null option-argument to the folders command must produce an error, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
This is not the way existing implementations behave.

Notes to the Editor (not part of this interpretation):
    Make the changes suggested in the Desired Action.
(0004717)
agadmin   
2020-01-10 17:43   
Interpretation proposed: 10 January 2020
(0004785)
ajosey   
2020-02-19 17:27   
Interpretation approved: 19 Feb 2020




Viewing Issue Advanced Details
1304 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Enhancement Request 2019-11-27 11:15 2020-05-19 11:04
joerg
 
normal  
Applied  
Accepted As Marked  
   
Jörg Schilling
C99
2543
82297-82298
---
See Note: 0004833
Align c99 -o with reality, the standard should not be more restrictive than implementations
The text:

-o outfile
    Use the pathname outfile, instead of the default a.out, for the executable file produced. If the -o option is present with -c or -E, the result is unspecified.

Does not reflect how C compilers have behaved for more than 35 years.

The option -o always works with all known compiler implementations together with -c. It is important to be able to tell the compiler to directly create a named output file for a c99 -c xx.c compilation in order to avoid file clobbering with concurrent compilations that would be the result from compiling to a standard file name and then being forced to rename the output file.
Change:

-o outfile
    Use the pathname outfile, instead of the default a.out, for the executable file produced. If the -o option is present with -c or -E, the result is unspecified.

to:

-o outfile
    Use the pathname outfile, instead of the default a.out, for the executable file produced. If the -o option is present with -E, the result is unspecified.
Notes
(0004670)
geoffclare   
2019-11-27 15:53   
Your claim that -o "always works with all known compiler implementations together with -c" is incorrect. It does not work with the HP-UX compiler.

Also, the requested change is both wrong and insufficient.

It is wrong because the unchanged part "the default a.out, for the executable file produced" makes no sense when -c is used.

It is insufficient because removing the statement that the behaviour is unspecified does not magically make it become specified. You need to propose additional wording to specify the behaviour, and it needs to cover these points:

1. What happens if you specify -c and -o together with more than one .c file

2. What happens if the -o option-argument specifies a directory

(With some compilers, cc -o dir -c file1.c file2.c creates dir/file1.o and dir/file2.o)
(0004671)
joerg   
2019-11-27 16:04   
(edited on: 2019-11-27 16:07)
Your claim about the HP-UX compiler is wrong, see the related rules in the Schily Makefilesystem that work fine on HP-UX and all other platforms that are relevant.

This even works for "cl.exe"... you just need to use -Fo instead.

Regarding finding a better wording: you are of course welcome!

(0004715)
Don Cragun   
2020-01-09 17:15   
(edited on: 2020-01-09 17:38)
In the Options, change P2543, L82297-82298 from:

    
-o outfile
Use the pathname outfile, instead of the default a.out, for the executable file produced. If the -o option is present with -c or -E, the result is unspecified.
to:
-o outfile
Name the file produced by the link-editor outfile, instead of the default a.out. If a single object file is being produced (by using -c with a single input file), name the object file outfile instead of the default file.o. If the -o option is present with -E, the result is unspecified.

Add the following new paragraphs to Rationale after P2551, L82609:
<tt>c99 -c -o ...</tt> is frequently used to directly place the .o file into an alternative directory without a need to separately rename the output file. This helps to support concurrent compilations and out of tree builds.

Some implementations allow -c -o directory to produce directory/file.o even when there is more than one input file; however, portable applications using -c with -o must compile only one file at a time and must specify the final destination filename rather than a directory.



(0004833)
geoffclare   
2020-04-30 14:31   
(edited on: 2020-04-30 15:08)
Reopening because the changes in Note: 0004715 clash with the -o changes in bug 0001294. Also, a needed change to the DESCRIPTION was missed.

New proposed changes...

On page 2542 line 82232 section c99, change:
If the -c option is specified, for all pathname operands
to:
If the -c option is specified and the -o option is not specified, for all pathname operands

On page 2543 line 82297-82298 section c99, after applying bug 1294 change:
Name the output file to be produced. If the -o option is present with -c or -E, the result is unspecified.
to:
Name the output file to be produced. If the -o option is present with -E, or with -c and more than one input file, the result is unspecified.

When creating a single object file (by using -c with a single input file), use the pathname outfile, instead of the default file.o, for the object file produced.

Add the following new paragraphs to Rationale after P2551, L82609:
<tt>c99 -c -o ...</tt> is frequently used to directly place the .o file into an alternative directory without a need to separately rename the output file. This helps to support concurrent compilations and out of tree builds.

Some implementations allow -c -o directory to produce directory/file.o even when there is more than one input file; however, portable applications using -c with -o must compile only one file at a time and must specify the final destination filename rather than a directory.
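As an illustration outside the proposed wording, the portable usage described in the rationale (one input file per invocation, with the final destination filename given to -o) can be sketched as a small build helper. The OBJDIR variable, its default value, and the function names are assumptions made for this example.

```shell
# Assumed layout: objects go under $OBJDIR (default "build").
OBJDIR=${OBJDIR:-build}

# Map a source pathname such as src/foo.c to $OBJDIR/foo.o.
object_for() {
    base=${1##*/}
    printf '%s/%s.o\n' "$OBJDIR" "${base%.c}"
}

# Compile each source separately, naming the exact output file
# rather than a directory.
compile_all() {
    mkdir -p "$OBJDIR" || return
    for src in "$@"; do
        c99 -c -o "$(object_for "$src")" "$src" || return
    done
}
```

A call such as compile_all src/foo.c src/bar.c would then run c99 once per file, so no invocation combines -c and -o with more than one input file.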







Viewing Issue Advanced Details
1302 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Enhancement Request 2019-11-19 15:27 2019-11-20 15:42
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
(many)
(many)
(many)
---
Alignment with C17
In Issue 8, align the standard with C17.

Note this bug has XBD as the category, but the changes affect all volumes.
Apply the changes detailed in the attached document (whichever version is the latest at the time this bug is resolved - it is currently a work in progress).
C17_alignment_20191119.pdf (477 KB) 2019-11-19 15:27
Notes
(0004663)
shware_systems   
2019-11-19 17:04   
Editorial:
Line 741, noreturn shouldn't be plain text; either bold or tt-tagged.
A few macro definitions like this follow for other keyword equivalencies.
Having these and other macro definitions as run on sentences is also inconsistent with other headers where each identifier is on a separate line.

Objection:
The text does not carry into pthread_mutex_timedlock() the restriction with mtx_timedlock() that the initialization of the mtx_t value used specifies mtx_timed for it to return thrd_success. This requires modifying pthread_mutexattr_get/settype() or adding (sic) pthread_mutexattr_get/settimed() or ...setuntimed() as interfaces. The latter is preferable, even though this qualifies currently as invention, as the default for pthread_mutex_t currently is that they be usable with pthread_mutex_timedlock(), i.e. mtx_timed is presumed set. This is the one place where <threads.h> supersets <pthreads.h> in functionality, and I see it as on <pthreads.h> to change to accommodate this if the intent is all of <threads.h> be implementable as wrappers of <pthreads.h> interfaces.
(0004664)
geoffclare   
2019-11-20 09:16   
Re: Note: 0004663

> Line 741, noreturn shouldn't be plain text

If you look at other header pages you'll see that all macro names are plain text (except for function-like macros). If you have a problem with that, this is not the place to raise it.

> Having these and other macro definitions as run on sentences is also inconsistent with other headers where each identifier is on a separate line.

I will fix that for <threads.h> at line 794. If you found a similar problem elsewhere please let me know the line number(s) by email.

> The text does not carry into pthread_mutex_timedlock() the restriction with mtx_timedlock() that the initialization of the mtx_t value used specifies mtx_timed for it to return thrd_success.

I'll respond to this on the mailing list, to avoid starting a discussion in these bug notes. This is a long document, so if we are not careful we will end up with far too many notes in this bug.
(0004665)
dennisw   
2019-11-20 10:49   
lines 2741-2830
The changes to fopen proposed here overlap with the changes to fopen accepted for 0000411.
These changes should either be rewritten to apply on top of 0000411 or the resolution to 0000411 should be changed to apply on top of these changes.

lines 4323-4325
"IEC 60559 implementations that support <complex.h>" does not seem correct to me.
An implementation can conform to Annex F of the C standard and provide the <complex.h> header without conforming to Annex G of the C standard.
Then that would be a conforming IEC 60559 implementation that supports <complex.h> but that does not implement the functionality specified by the MXC margin code.
(0004666)
geoffclare   
2019-11-20 14:07   
Re: Note: 0004665

> The changes to fopen proposed here overlap with the changes to fopen accepted for 0000411.

Thanks, I will update my proposed changes.

> "IEC 60559 implementations that support <complex.h>" does not seem correct to me.

I see your point. The XBD 1.7.1 addition already says "The functionality described is mandated by the ISO C standard only for implementations that define __STDC_IEC_559_COMPLEX__" so I think that second sentence in the XRAT A.1.7.1 addition could just be dropped.




Viewing Issue Advanced Details
1301 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2019-11-15 22:26 2019-12-19 18:04
dmitry_goncharov
 
normal  
Resolved  
Accepted As Marked  
   
Dmitry Goncharov
glob
1109
37544-37545
---
See Note: 0004692 in 0001300.
clarify glob("/", GLOB_MARK, ...) behavior
The current standard says
"GLOB_MARK
Each pathname that is a directory that matches pattern shall have a <slash> appended."

When pattern is a "/", "./", "../", "//", "///", etc a conformant implementation would return "//", ".//", "..//", "///", "////" respectively.
This is hardly useful.

I checked various implementations and they all behave differently.

Given "/" glibc produces "//", sunos produces "//", aix produces "/".
Given "//" glibc produces "//", sunos produces "///", aix returns GLOB_NOMATCH.
Given "///" glibc produces "//", sunos produces "////", aix returns GLOB_NOMATCH.
Given "/tmp/" glibc produces "/tmp/", sunos produces "/tmp//", aix produces "/tmp/".
Given "/tmp//" glibc produces "/tmp/", sunos produces "/tmp///", aix produces "/tmp/".
Given "/tmp///" glibc produces "/tmp/", sunos produces "/tmp////", aix produces "/tmp/".
Replace
"Each pathname that is a directory that matches pattern shall have a <slash> appended."

With
"Each pathname that is a directory that matches pattern shall have a <slash> appended, unless the pathname already has a trailing slash."
Notes
(0004659)
hvd   
2019-11-15 22:59   
Bug 1300 and bug 1301 asked two different questions, even if they were both about GLOB_MARK. Bug 1300 asked whether symlinks to directories should be treated the same as directories. Bug 1301 asked whether pathnames that already end in a slash should have another slash appended. I am noting this so that bug 1300's question does not get overlooked now that it has been closed as a duplicate of this one.
(0004693)
Don Cragun   
2019-12-19 17:53   
This bug and 0001300 were discussed together during the 2019-12-19 conference call. The issues raised by both bugs are fixed in Note: 0004692.




Viewing Issue Advanced Details
1300 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2019-11-15 22:14 2020-05-05 14:51
dmitry_goncharov
 
normal  
Applied  
Accepted As Marked  
   
Dmitry Goncharov
glob
1109
37544-37545
Approved
See Note: 0004692.
clarify GLOB_MARK behavior
The current standard says
"GLOB_MARK
Each pathname that is a directory that matches pattern shall have a <slash> appended."

Please clarify if a symlink to a directory should also have a slash appended.
I checked various implementations and they all append slashes to both directories and symlinks to directories.
It makes sense to append a slash to symlinks to directories, because the user uses such symlinks as directory names.
Replace

"Each pathname that is a directory that matches pattern shall have a <slash> appended."

with

"Each pathname that is a directory or a symlink to a directory that matches pattern shall have a <slash> appended."
Notes
(0004658)
Don Cragun   
2019-11-15 22:40   
(edited on: 2019-11-15 22:42)
0001300 and 0001301 are about the same topic, submitted by the same submitter, and appeared within minutes of each other.

0001301 provides more detail in the description of the problem. This bug is closed.

(0004660)
Don Cragun   
2019-11-15 23:14   
On closer examination, 0001300 and 0001301 ask different questions. This bug is reopened.
(0004661)
geoffclare   
2019-11-18 15:47   
The standard requires that symbolic links are always followed except where it explicitly states otherwise. See XBD 4.13 Pathname Resolution (the paragraph starting "If a symbolic link is encountered ...")

Since there is no explicit statement in the description of glob() to say that it does not follow symbolic links when determining whether a pathname is a directory, the default rule from XBD 4.13 applies and glob() is therefore required to follow them; no change to the standard is needed.

This bug should be rejected.
(0004662)
stephane   
2019-11-19 07:32   
Still, since it's at odds with ls -p or the markdirs option of zsh, it wouldn't harm mentioning again in a non-normative section of the glob() spec.

(completion listings of most shells (busybox, bash, ksh, Byron's rc, yash, fish at least) work like GLOB_MARK though. csh, tcsh and zsh use a @ suffix à la ls -F).

In any case, the wording has a related slight problem: if the current directory is readable, but not searchable, for the expansion of *, an implementation cannot tell whether an entry is a symlink to a directory or to some other file. They may be able to tell if an entry is a directory or not if that information is stored in the directory (d_type of struct dirent returned by readdir() for instance).

The "is a directory or a symlink to a directory" wording is ambiguous for files that are a symlink to symlink to a directory.

So it should probably be something like:

"Each pathname that matches pattern and that is determined to be a directory after symlink resolution shall have a <slash> appended."

(whether the inability to "mark" a directory should be flagged as an error with GLOB_ERR was mentioned on the ML or a different bug not too long ago IIRC, but I don't remember the outcome).
(0004692)
Don Cragun   
2019-12-19 17:36   
(edited on: 2019-12-20 10:06)
Note that the changes specified below address the issues raised in this bug report and in 0001301.

Interpretation response
------------------------
The standard states "Each pathname that is a directory that matches pattern shall have a <slash> appended.", and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
Requiring the pathname "/" to be converted to "//" and the pathname "//" to be converted to "///" was not considered when this issue was first standardized.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 1109 line 37544 section glob(), change:
    
Each pathname that is a directory that matches pattern shall have a <slash> appended.
to:
For each pathname that matches pattern and is determined to be a directory after pathname resolution, process the pathname so the result is as if the following steps are applied in order:

         
  1. If the pathname is <slash>, do not modify the pathname and skip the remaining steps.

  2. If the pathname is <slash><slash> and the implementation handles pathname resolution of a pathname starting with exactly two successive <slash> characters differently than it handles a pathname starting with only a single <slash>, do not modify the pathname and skip the remaining steps.

  3. If the pathname does not end with a <slash>, append a <slash> to the pathname and skip the remaining steps.

  4. A <slash> may be appended to the pathname.

  5. If there are multiple <slash> characters at the end of the pathname, all but one of those trailing <slash> characters may be removed from the pathname.


Append the following new paragraphs to the glob() rationale after P1112, L37648:
    
Earlier versions of this standard defined the behavior associated with the flag GLOB_MARK as:
        
Each pathname that is a directory that matches pattern shall have a <slash> appended.

This was undesirable if the matched pathname was <slash> or if the matched pathname was <slash><slash> and the implementation treats a leading <slash><slash> differently than it treats a pathname with a single leading <slash>. Only a few implementations were known to conform to this requirement (maybe only one) and there was a lot of variation in the way other implementations behaved. The current wording allows many of the alternative behaviors that were observed, except that the pathnames "/" and "//" (if it is treated differently than "/") must not be modified.

Implementations should consider the following much simpler requirement (which is allowed by the current standard) when processing the GLOB_MARK flag:
        

Each pathname that matches pattern, is determined to be a directory after pathname resolution, and does not end with a <slash> shall have a <slash> appended.
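
As an illustration only, the simpler GLOB_MARK rule quoted above could be sketched in C as follows. The helper name mark_directory() and its buffer-capacity parameter are hypothetical and are not part of glob(); the sketch assumes the caller has already determined that the pathname resolves to a directory. Because "/" and "//" already end with a <slash>, the rule leaves them unmodified without any special case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glob()): applies the simpler GLOB_MARK
 * rule to a pathname that is already known to name a directory.
 * cap is the size of the buffer holding path.
 * Returns 0 on success, -1 if path is empty or the buffer is too small. */
int mark_directory(char *path, size_t cap)
{
    size_t len = strlen(path);

    if (len == 0)
        return -1;

    /* Already ends with a <slash>: this covers "/" and "//",
     * which are therefore left unmodified. */
    if (path[len - 1] == '/')
        return 0;

    /* Need room for the added '/' and the terminating NUL. */
    if (len + 2 > cap)
        return -1;

    path[len] = '/';
    path[len + 1] = '\0';
    return 0;
}
```

With this rule, "dir" becomes "dir/", while "dir/", "/" and "//" are returned unchanged, which is one of the behaviors the five-step wording permits.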


(0004710)
ajosey   
2020-01-07 13:37   
Interpretation proposed: 7 January 2020
(0004784)
ajosey   
2020-02-19 17:26   
Interpretation approved: 19 Feb 2020
(0004857)
geoffclare   
2020-05-05 14:51   
When applying this bug it occurred to me that the uses of "shall" in the rationale were a problem. I therefore used quotation marks instead of indentation, so that the uses of "shall" are contained in quotations of (what would be) normative text.




Viewing Issue Advanced Details
1299 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Error 2019-11-11 10:55 2020-01-29 15:09
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
<netinet/in.h>
306
10365,10366
---
netinet_in.h should be netinet/in.h
The <netinet/in.h> page has a typo, in two places, where it is referred
to as <netinet_in.h>.
Change both occurrences of <netinet_in.h> to <netinet/in.h>.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1298 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Clarification Requested 2019-11-01 16:16 2020-01-29 15:06
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
ed
2689
87713
---
ed CONSEQUENCES OF ERRORS unclear about diagnostic message
The CONSEQUENCES OF ERRORS section on the ed page says:
If the standard input is a regular file, ed shall terminate with a non-zero exit status.

Notice that it does not say a diagnostic message needs to be written.

Section 1.4 Utility Description Defaults, under CONSEQUENCES OF ERRORS, says:
The following shall apply to each utility, unless otherwise stated:

[...]

* When an unrecoverable error condition is encountered, the utility shall exit with a non-zero exit status.

* A diagnostic message shall be written to standard error whenever an error condition occurs.

It is possible to interpret the statement on the ed page as triggering the "otherwise stated" here, since it states something that is already one of the bullet items in the defaults and thus can be seen as a replacement for the whole bullet list. This is almost certainly not intended (if it was, the ed page would explicitly say that a diagnostic message need not be written), but it would be good to remove any doubt.

The ex page has a similar problem.

Another problem with the above quote from the ed page is that it only covers the case where standard input is a regular file. It should apply to other non-terminal file types (e.g. pipes) as well, like it does for ex.
On page 2689 line 87713 section ed, change:
If the standard input is a regular file, ed shall terminate with a non-zero exit status.
to:
If the standard input is not a terminal device file, ed shall behave as described under CONSEQUENCES OF ERRORS in [xref to 1.4].

On page 2743 line 89810 section ex, change:
When any error is encountered and the standard input is not a terminal device file, ex shall not write the file or return to command or text input mode, and shall terminate with a non-zero exit status.
to:
When any error is encountered and the standard input is not a terminal device file, in addition to the default requirements described in [xref to 1.4], ex shall neither write the file (if one has been opened) nor return to command or text input mode.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1296 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Error 2019-10-06 15:54 2019-12-17 16:20
dennisw
 
normal  
Applied  
Accepted  
   
Dennis Wölfing
tmpfile
2162
69312-69313
---
EOVERFLOW does not make sense for tmpfile
The errors section of tmpfile contains an EOVERFLOW error for when the file size does not fit into an off_t.
However this does not make sense because tmpfile creates a new empty file.
On page 2162 line 69312-69313 section tmpfile, delete:
[EOVERFLOW]
The file is a regular file and the size of the file cannot be represented correctly
in an object of type off_t.
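
The point that tmpfile() always starts with an empty file can be checked with a short C sketch. The helper name fresh_tmpfile_size() is hypothetical; the sketch only assumes the POSIX functions tmpfile(), fileno() and fstat().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns the size reported by fstat() for a freshly
 * created tmpfile() stream, or -1 if the stream could not be created or
 * examined. */
off_t fresh_tmpfile_size(void)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    struct stat st;
    off_t size = -1;
    if (fstat(fileno(fp), &st) == 0)
        size = st.st_size;

    fclose(fp);  /* the temporary file is removed when the stream is closed */
    return size;
}
```

On a system where the temporary file can be created, the reported size is 0, so a size that cannot be represented in an off_t cannot arise at creation time.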
Notes
(0004618)
Don Cragun   
2019-10-14 16:22   
(edited on: 2019-10-14 16:37)
The EOVERFLOW error condition shouldn't just be for the size of the newly created file; it should also apply to the size of the directory in which the file is being created (which frequently increases in size when a new directory entry is added). Although it would be unusual for a directory to grow to that point, there is no reason to think that it would be impossible.

(0004619)
shware_systems   
2019-10-14 20:30   
From the tmpfile() Change History:
In the ERRORS section, the [EOVERFLOW] condition is added. This change is to support large files.

From this I construe its presence as: while the "wb+" flags requirement ensures the length is initially zero, this can be considered a warning, more than an error, that any persistence of the file is on media that permits the file to grow past what off_t supports for the compiler model used to compile the application. The text is not explicit about whether maximum or current size was intended, so this may be off. Whether this possibility should preclude a file being opened anyways I can see as arguable; it's more on an application to keep the file from growing that large.

As Don notes, this may apply to files of type directory too on some implementations, not regular files only; that the act of creating the file grows the directory past that limit - I'd hope a separate error number would indicate this, however, not overload EOVERFLOW.
(0004621)
geoffclare   
2019-10-15 09:59   
(edited on: 2019-12-12 17:18)
Re: Note: 0004618 It is clear from the way the EOVERFLOW condition is worded that it does not apply to the directory in which the file is being created (emphasis added):
The file is a regular file and the size of the file cannot ...

I believe the presence of EOVERFLOW on the tmpfile() page is the result of an editorial error when applying the changes in the Large File Summit white paper during work on Issue 5. The relevant section (taken from http://www.opengroup.org/platform/lfs.html) is:

2.2.1.9 fopen(), freopen(), tmpfile()

DESCRIPTION
The largest value that can be represented correctly in an object of type off_t will be established as the offset maximum in the open file description.

ERRORS
The fopen() and freopen() functions will fail if:

[EOVERFLOW]
The named file is a regular file and the size of the file cannot be represented correctly in an object of type off_t.

Notice that the section heading includes tmpfile() but the error condition text is only for fopen() and freopen(). It is easy to see how the editor could have mistakenly applied the error condition change to tmpfile() as well.

(0004623)
joerg   
2019-10-15 10:25   
(edited on: 2019-10-15 10:26)
If we refer to the large file summit, it may be useful to add:


EOVERFLOW
      The file is a regular file and at least one of the time stamps cannot be represented correctly in an object of type time_t.

...since this was missed in the Large File Summit: the LFS struct stat should have introduced a 64-bit time_t object as well.





Viewing Issue Advanced Details
1294 [1003.1(2016)/Issue7+TC2] Shell and Utilities Comment Omission 2019-10-01 08:03 2020-04-29 15:30
Konrad_Schwarz
 
normal  
Applied  
Accepted As Marked  
   
Konrad Schwarz
Siemens AG
c99
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/c99.html
https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/utilities/c99.html#tag_20_11_13_03
---
Note: 0004686
POSIX recognizes the existence of dynamically loadable, executable object files, but provides no way of producing them.
The interface specified in https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/basedefs/dlfcn.h.html allows run-time access to "executable object files", but I could not find
a conforming way to produce them.
Add _POSIX_SHARED_OBJECT and _POSIX_V8_SHARED_OBJECT_{CFLAGS,LDFLAGS} options to getconf analogously to the "Threaded Programming Environment" table
in c99.

For the GCC toolchain, I would expect _POSIX_V8_SHARED_OBJECT_CFLAGS to be
-fpic and ..._LDFLAGS to be -shared.
Notes
(0004586)
GarrettWollman   
2019-10-01 17:14   
There is no portable way to generate dynamically loadable object files because some dynamic linker implementations require that an exhaustive list of exported symbols be provided by the programmer (distinct from the symbol table in the object files), or require other similar interface information that cannot be deduced from the object files alone.
(0004587)
shware_systems   
2019-10-01 20:05   
From c99 Description:
"If there are no options that prevent link editing (such as -c or -E), and all input files compile and link without error, the resulting executable file shall be written according to the -o outfile option (if present) or to the file a.out."

When this is true, what makes such an executable file a program is the presence of main(), either from a provided object or static library, or the one in standard libraries libl or liby. If main() is not found the file is a library that is expected to be useable by dlopen() the same as dlopen(NULL, flags) references the current program file.

This is portable without needing any extra command line flag. The wording of dlfcn.h and interfaces could emphasize better that what is allowed there as implementation-defined is in addition to the above expectation, not a substitute.

Also, if an external linker is used, it is the responsibility of the compilation phase to synthesize from the input .o and .c sources the information that linker requires. If the compiler cannot do this, necessitating the maintenance of files like an exports list, they are disqualified from being considered conforming, and the object format is suspect for being insufficiently sophisticated to begin with. The caveat in the Input Files section for c99 that additional formats may be defined I see as for optional data, not for any additional requirements for producing a valid executable.
(0004589)
alanc   
2019-10-01 22:22   
For platforms which support multiple compilers, how would you specify to getconf which compiler you want the flags for? For instance, on Solaris, _POSIX_V8_SHARED_OBJECT_LDFLAGS would need to be -G for Studio compilers but -shared for GNU & LLVM compilers.
(0004591)
Konrad_Schwarz   
2019-10-02 07:38   
(edited on: 2019-10-17 16:13)
Re Note: 0004586: are these dynamic linker implementations relevant to POSIX? (Can POSIX assume an "ELF-only" world at this point?).

But this is a fair point -- I recognize that any serious engineering
organization will invest in far more in-depth handling of toolchain
issues than the generic getconf interface offers. OTOH, this is
true for c99 in general, so conversely, should c99 (or any hypothetical
successors like c11) be retired?

I would argue no; I think it is beneficial to have a standardized
(fairly simple) interface to the compilation tool chain, which,
at the least, gives you a starting point for platform (or toolchain)
dependent optimizations.

Re Note: 0004589: You would use whatever non-standard mechanism
that platform offers -- if any -- to map the standard c99 utility
to one of the compilers the platform supposedly supports.

(0004625)
eblake   
2019-10-17 16:07   
Re: Note: 0004591 asking about ELF-only. The Cygwin platform tries to comply with POSIX (where possible) but uses PE-COFF not ELF (thanks to the underlying Windows OS). And yes, Cygwin has support for loadable libraries. Thus, POSIX cannot assume ELF-only.
(0004634)
Konrad_Schwarz   
2019-11-05 11:20   
(edited on: 2019-11-05 11:34)
Re: Note: 0004625: on purely formal grounds, I think Cygwin doesn't count,
as it is not a POSIX system.

However, the GNU ld --export-all-symbols option would seem to do the right thing, even on Cygwin.

So Cygwin could probably support this proposal.

(Also, it is quite likely that any system supporting libtool could support
this extension).

(0004635)
shware_systems   
2019-11-05 17:05   
Cygwin does count, as do the apps created for Windows POSIX add-on, as the internals of .o files and application or library files linked from these, as a.out modules, are left unspecified. As such, COFF and OMF are as valid as ELF in that these all maintain associations of symbolic names with object or function addresses that a dlsym() implementation can reference.

This said, the consensus in the phone call discussions is an option like --export-all-symbols is missing from the c99 utility that ensures all names from source .o files with extern scope are represented in an a.out module, whether a main() declaration is processed or not. Addition of this switch is seen as more forward compatible than the Desired Action, and is using the Solaris cc compiler's -G switch as example/model of existing practice. See etherpad for details.
(0004636)
joerg   
2019-11-05 17:10   
From my understanding, the option --export-all-symbols is not relevant for us as long as we do not standardize linker map files.

My impression is that --export-all-symbols is a workaround, since on UNIX, all global symbols remain global unless there is a linker map file.
(0004637)
shware_systems   
2019-11-05 17:41   
Yes, map and listing files of any sort are unspecified, as are linker utilities to begin with as unnecessary. That many toolchains include a separate linker is their election, but only needed if a compiler other than c99 is provided that requires its use. As an option name it is descriptive of functionality c99 is seen as lacking, that's all, not that it is the model for removing that lack.
(0004638)
Konrad_Schwarz   
2019-11-06 08:27   
(edited on: 2019-11-06 10:09)
Re: Note: 0004636 and Note: 0004637

I think you are missing the point here: the key difference
in Windows PE-COFF DLLs vs. ELF shared objects is that
DLLs have an explicit symbol export list, whereas shared objects
export everything by default. Obviously, there are various ways
of changing this, but I am talking about the default case here.

Now with DLLs, you either use a separate file
(a module definition file---not a linker map file)
to specify which symbols
are exported, or you mark up your code with __declspec(dllexport), etc.
POSIX does not want you to do that, because ELF shared objects don't
require it: by default they export all non-static symbols, just as
in the static linking case.

However, the GCC toolchain for PE-COFF targets provides the
--export-all-symbols flag
which basically replicates the shared object behavior for PE-COFF DLLs.
It also supports "direct linking to a DLL" (see the manual),
so the .lib import file used by Microsoft toolchains is not required.
So the toolchain flow for GCC for a PE-COFF-based target turns out
to be pretty much identical to the ELF case.

Hence, an implementation of getconf on Cygwin can return
-Xl,--export-all-symbols or similar in the proposed
_POSIX_V8_SHARED_OBJECT_LDFLAGS tag.

Just to make this clear:
-Xl,--export-all-symbols is a compiler flag that gcc
and its POSIX-specified alter ego c99 understand.

Finally, note that GCC ld will automatically assume
--export-all-symbols
if no symbols would otherwise be exported. This means that
the "best" solution would probably be for
_POSIX_V8_SHARED_OBJECT_LDFLAGS to not include
-Xl,--export-all-symbols at all.

(0004686)
geoffclare   
2019-12-13 10:06   
(edited on: 2019-12-13 10:18)
The following changes were agreed in the 2019-12-12 teleconference.

When the c99 page is converted to a c17 or c2x page for Issue 8, make additional changes as if the following had been applied to the c99 page before the conversion.

On page 2540 line 82224 section c99, change:
<tt>[-L directory] [-l library]</tt>
to:
<tt>[-L directory] [-l library] [-R directory]</tt>

On page 2542 line 82227 section c99, change:
The system conceptually consists of a compiler and link editor.
to:
The system conceptually consists of a compilation phase, encompassing Translation Phases 1 through 7 of the ISO C standard, and a linkage phase, for handling Phase 8 of the ISO C standard and extensions described here. In addition, the compilation phase can be split into a separate preprocessing operation, handling Translation Phases 1 through 4, and a processing operation, handling Phases 5 through 7. Whether a single utility or multiple utilities for handling phases separately is provided by an implementation is left unspecified.

On page 2542 line 82228 section c99, change:
The input files referenced by pathname operands and -l option-arguments shall be compiled and linked to produce an executable file. (It is unspecified whether the linking occurs entirely within the operation of c99; some implementations may produce objects that are not fully resolved until the file is executed.)
to:
The input files referenced by pathname operands and -l option-arguments shall be compiled and linked to produce an executable file or, if the -G option is specified, a shared library file. It is unspecified whether the linking of an executable file occurs entirely within the operation of c99; when a pathname operand or -l option-argument names a shared library, an executable object may be produced that is not fully resolved until the file is executed.

On page 2542 line 82236 section c99, change:
If there are no options that prevent link editing (such as -c or -E), and all input files compile and link without error, the resulting executable file shall be written according to the -o outfile option (if present) or to the file a.out.

The executable file shall be created ...
to:
If there are no options that prevent link editing (such as -c or -E), and all input files compile and link without error, the resulting executable file or shared library file shall be written according to the -o outfile option, if present. If -o outfile is not specified, a resulting executable file shall be written to the file a.out; if the file to be written is a shared library file, the behavior is unspecified.

Executable files shall be created ...

On page 2542 line 82246 section c99, change:
The order of specifying the -L and -l options, ...
to:
The order of specifying the -L, -l and -R options, ...

Add three new options inserted in alphabetic order:
-B mode
If mode is "dynamic", produce a dynamically linked executable file. If the -B option is present with -c, -E or -G, the result is unspecified.
-G
Create a shared library or create object files suitable for inclusion in such a shared library. Compilations shall be performed in a manner suitable for the creation of shared libraries (for example, by producing position-independent code).

If -c is also specified, create object files suitable for inclusion in a shared library.

If -c is not specified, create a shared library. In this case the application shall ensure that the file named by the -o outfile option-argument includes an element named "so" or an implementation-defined element denoting a shared library, where elements in the last component of outfile are separated by <period> characters, for example libx.so.1; if no -o option is included in the options or the file named by the -o outfile option does not contain an element named "so" or an implementation-defined element denoting a shared library, the result is unspecified. If a pathname operand or -l option-argument names a shared library and that shared library defines an object used by the library being created, it shall become a dependency of the created shared library.

If the -G option is present with -B or -E, the result is unspecified.
-R directory
If the object file format supports it, specify a directory to be searched for shared libraries when an executable file or shared library being created by c99 is subsequently executed, or loaded using dlopen(). If directory contains any <colon> or <dollar-sign> characters, the behavior is unspecified. If an implementation provides a means for setting a default load time search location or locations, the -R option shall take precedence.

The directory named by directory shall not be searched by a process performing dynamic loading if either of the following is true:
  • the real and effective user IDs of that process are different and the directory has write permission for a user ID outside the set of the effective user ID of that process and any implementation-specific user IDs used for directories containing system libraries

  • the real and effective group IDs of that process are different and the directory has write permission for group IDs other than the effective group ID of that process.

Directories named in -R options shall be searched in the order specified, before the default system library locations are searched.

If a directory specified by a -R option contains files with names starting with any of the strings "libc.", "libl.", "libpthread.", "libm.", "librt.", [OB]"libtrace.",[/OB] "libxnet.", or "liby.", the result is unspecified.

If the -R option is present with -c or -E, the result is unspecified.

Change the description of the -l library option on P2543, L82286-82290 from:
Search the library named liblibrary.a. A library shall be searched when its name is encountered, so the placement of a -l option is significant. Several standard libraries can be specified in this manner, as described in the EXTENDED DESCRIPTION section. Implementations may recognize implementation-defined suffixes other than .a as denoting libraries.
to:
Search for the library named liblibrary.a or liblibrary.so. When searching for a library, the linker shall look at each directory specified by -L options that appear on the command line before this -l option, in the order given, and then the system default libraries. If liblibrary.a and liblibrary.so both exist in a directory, c99 shall use liblibrary.so if either -B dynamic or -G is specified. Once a library has been found (shared or static) in a directory, later directories in the list shall not be considered. A library shall be searched when its name is encountered, so the placement of a -l option is significant. Several standard libraries can be specified in this manner, as described in the EXTENDED DESCRIPTION section. Implementations may recognize implementation-defined suffixes other than .a and .so as denoting libraries.
(Note to the Editor: The liblibrary.a on P2543, L82286 seems to be a typo. The "library" in that string is the option-argument to the -l option. Therefore, it needs to be in italics as shown in the replacement text above.)
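
The search order in the replacement -l text can be sketched as a small C function. All names below (struct lib_dir, enum lib_kind, find_library()) are hypothetical and exist only for illustration; the caller is assumed to have already listed the -L directories in command-line order followed by the system default locations. The text does not say which form is used when both exist and neither -B dynamic nor -G is given; choosing the static library in that case is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* What one searched directory contains for a given -l library name.
 * All names here are hypothetical, for illustration only. */
struct lib_dir {
    bool has_static;  /* liblibrary.a exists in this directory  */
    bool has_shared;  /* liblibrary.so exists in this directory */
};

enum lib_kind { LIB_NONE, LIB_STATIC, LIB_SHARED };

/* Scans dirs[0..n-1] in order (-L directories first, then system defaults).
 * Stores the index of the first directory holding the library in *where
 * and returns which form is used; returns LIB_NONE if nothing is found. */
enum lib_kind find_library(const struct lib_dir *dirs, size_t n,
                           bool prefer_shared, size_t *where)
{
    for (size_t i = 0; i < n; i++) {
        if (!dirs[i].has_static && !dirs[i].has_shared)
            continue;  /* nothing here; try the next directory */

        *where = i;    /* first hit wins; later directories are ignored */
        if (dirs[i].has_static && dirs[i].has_shared) {
            /* Both forms present: .so when -B dynamic or -G was given.
             * Picking .a otherwise is an assumption of this sketch. */
            return prefer_shared ? LIB_SHARED : LIB_STATIC;
        }
        return dirs[i].has_shared ? LIB_SHARED : LIB_STATIC;
    }
    return LIB_NONE;
}
```

Here prefer_shared stands for "either -B dynamic or -G was specified". A directory that holds only a static library still ends the search, even when a later directory holds a shared one.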

Change the description of the -o outfile option on P2543, L82297-82298 from:
Use the pathname outfile, instead of the default a.out, for the executable file produced. If the -o option is present with -c or -E, the result is unspecified.
to:
Name the output file to be produced. If the -o option is present with -c or -E, the result is unspecified.

When creating an executable file, use the pathname outfile, instead of the default a.out, for the executable file produced.

When creating a shared library, use the pathname outfile as the name of the shared library. If no -o outfile option is specified when creating a shared library, the result is unspecified.

On page 2543 line 82304 section c99, change:
Multiple instances of the -D, -I, -L, -l, and -U options can be specified.
to:
Multiple instances of the -D, -I, -L, -l, -R, and -U options can be specified.

On page 2544 line 82320 section c99, after applying bug 667 change INPUT FILES from:
Each input file shall be one of the following: a text file containing a C-language source program, a text file containing the output of c99 -E, an object file in the format produced by c99 -c, or a library of object files, in the format produced by archiving zero or more object files, using ar. Implementations may supply additional utilities that produce files in these formats. Additional input file formats are implementation-defined.
to:
Each input file shall be one of the following:
  • A text file containing a C-language source program or the output from c99 -E

  • An object file in the format produced by c99 -c

  • A library of object files in the format produced by archiving zero or more object files using ar

  • A shared library in the format produced by c99 -G

Implementations may supply additional utilities that produce files in these formats. Additional input file formats are implementation-defined.

Change the description of file.a on P2544, L82310-82312 from:
A library of object files typically produced by the ar utility, and passed directly to the link editor. Implementations may recognize implementation-defined suffixes other than .a as denoting object file libraries.
to:
A library of static object files typically produced by the ar utility, and referenced during the link-edit phase. Implementations may recognize implementation-defined suffixes other than .a as denoting static object file libraries.

and add a new entry for file.so after P2544, L82315 with the description:
A library of shared object files typically produced by the c99 utility with the -G option, and referenced during the link-edit phase. Implementations may recognize implementation-defined suffixes other than .so as denoting shared object file libraries.

On page 2546 line 82403 section c99, change:
It is unspecified whether the libraries libc.a, libl.a, libm.a, libpthread.a, librt.a, [OB]libtrace.a,[/OB] libxnet.a, or liby.a exist as regular files. The implementation may accept as -l option-arguments names of objects that do not exist as regular files.
to:
The libraries c, l, m, pthread, rt, [OB]trace,[/OB] xnet, and y shall be found as shared libraries when specified as the option-argument to the -l option and may also be found as static libraries but, except for the shared library version of the c library, need not exist as regular files. The implementation may accept as -l option-arguments names of additional implementation-defined libraries that do not exist as regular files.

On page 2550 line 82588 section c99, add new examples:
5. The following example shows how to create a shared library that does not depend on any other shared library:
c99 -G -c foo.c bar.c
c99 -G -o foobar.so foo.o bar.o

6. The following example shows how to create a dynamic executable that loads application specific shared libraries by searching a specified list of directories when it is executed:
c99 -G -c foo.c
c99 -G -o /path/to/dir1/foo.so foo.o
c99 -G -c bar.c
c99 -G -o /path/to/dir2/bar.so bar.o 
c99 -B dynamic -L /path/to/dir1 -L /path/to/dir2 -R /path/to/dir1 \
    -R /path/to/dir2 -o foobar foobar.c -l foo -l bar

On page 2551 line 82609 section c99, add a new paragraph to RATIONALE:
The shared library version of the c library is required to exist as a regular file because the dynamic linker needs to be able to load at least one library at execution time. Other standard shared libraries need not exist in their own right if the interfaces the standard requires them to provide exist in the c library; all that is required is that they are "found" when specified as -l option-arguments. Static versions of the standard libraries need not exist as regular files, even if they are found as static libraries when specified as -l option-arguments.

Add the following new paragraphs to the C99 RATIONALE section after P2551, L82629:
When the -R option is not included when an executable file or shared library is being created, some implementations use the environment variables LD_RUN_PATH and LD_LIBRARY_PATH to determine the directories to be searched for shared libraries.

Some implementations permit placeholders preceded by a <dollar-sign> character ('$'), such as $ORIGIN, in the -R directory option-argument to be evaluated at load time. Some implementations accept a colon separated list of directories for the path to search for shared libraries, with the same effect as specifying the -R option multiple times. However, these features are not universal.

The name of a shared library usually contains an element named "so". Other implementation-defined elements are allowed for backwards compatibility with historical systems, and so that tools can be developed on conforming systems to create libraries for multiple environments. For example, Microsoft systems use the filename extension ".dll" (and do not allow following text) to denote a shared library. The standard allows additional characters to be used in the name of a library following an "so" element to permit shared library versioning information to be at the end of the library filename rather than requiring that any such strings appear before the final element of the library name.

The decision to standardize on "so" as a required element in a shared library name was intentional, as the alternative would have been standardizing things such as a new make macro $(SHLIB_EXT) that would otherwise be needed to write a portable makefile that can compile shared libraries despite not having a standardized element name.

If a combination of direct and indirect dependencies of a shared library would require different versions of another shared library, options that are not specified by the standard (such as -B direct) will probably need to be used when linking that shared library, so that at runtime the intended versions are found.
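
The naming rule for -G output files (an element named "so" among the <period>-separated elements of the last component of outfile) can be illustrated with a short C check. The function name has_so_element() is hypothetical, and the sketch deliberately ignores the implementation-defined elements (such as "dll") that the text also allows.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker (not part of c99): returns true if the last
 * component of outfile contains a period-separated element equal to "so". */
bool has_so_element(const char *outfile)
{
    /* Only the last pathname component is examined. */
    const char *base = strrchr(outfile, '/');
    base = (base != NULL) ? base + 1 : outfile;

    /* Walk the period-separated elements of that component. */
    const char *p = base;
    for (;;) {
        size_t n = strcspn(p, ".");
        if (n == 2 && p[0] == 's' && p[1] == 'o')
            return true;
        if (p[n] == '\0')
            return false;
        p += n + 1;  /* skip past the <period> */
    }
}
```

Under this check, libx.so.1 and /path/to/dir1/foo.so qualify, while libx.a does not; a directory name containing "so" earlier in the path has no effect because only the last component is inspected.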






Viewing Issue Advanced Details
1291 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Comment Enhancement Request 2019-09-27 19:36 2019-10-10 15:57
joelsherrill
 
normal  
New  
Open  
   
Joel Sherrill
RTEMS.org
pthread.h
NA - addition request
NA - addition request
---
Add method to obtain pthread attributes
This is a commonly added pthread capability but there is little agreement on the API name.

https://musl.openwall.narkive.com/dD88I7eH/pthread-getattr-np provides this list of API names for this capability in the context of discussing how to get the current stack information:

glibc: pthread_getattr_np
freebsd: pthread_attr_get_np
netbsd: pthread_attr_get_np and pthread_getattr_np

RTEMS follows glibc/linux with pthread_getattr_np.


If this capability is provided, then the current stack information can also be obtained.

The naming pattern pthread_[sg]attr_* is used to modify the pthread_attr_t structure used in a pthread_create() call. The Linux name of pthread_attr_get() seems like a choice which is an easy name and wouldn't be confused.
Add pthread_attr_get() API.
Notes
(0004616)
Don Cragun   
2019-10-10 15:57   
(edited on: 2019-10-10 16:04)
This was discussed during the 2019-10-10 conference call: We have no objections to adding such an interface (although it will need a sponsor). But, before we can proceed we need complete details on the interface (i.e., a man page). We note that the man pages for the implementations that currently support this feature have different descriptions and we are not really sure what is being proposed.

We suggest using the name pthread_getattr() because it fits better with the existing naming conventions:
pthread_getschedparam(), pthread_getcpuclockid() are functions that obtain information about a specified thread
whereas:
pthread_attr_get*() are functions that extract individual attributes from a pthread_attr_t object


Please supply complete details as a note in this bug report and we will then look for a sponsor for this new interface.

See Note: 0004215 for an example of the information we want to see.





Viewing Issue Advanced Details
1290 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Clarification Requested 2019-09-27 19:20 2020-02-26 11:43
joelsherrill
 
normal  
Applied  
Accepted As Marked  
   
Joel Sherrill
RTEMS.org
arpa/inet.h
NA - used web
NA - used web
Approved
See Note: 0004615
arpa/inet.h - origin of socklen_t is unclear
Ref: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/arpa_inet.h.html

In arpa/inet.h, socklen_t is used as an argument to inet_ntop()
but you have to pull a thread to figure out how it is defined
based on the single include file in the Synopsis. The thread for
.h files is arpa/inet.h -> netinet/in.h -> sys/socket.h.

Clarify the source of socklen_t.
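
To show where socklen_t surfaces for an application, here is a minimal sketch of a caller of inet_ntop(). The helper name normalize_ipv4() is hypothetical. The sketch includes <sys/socket.h> explicitly, on the assumption that this is the portable way to obtain AF_INET and, under the wording being questioned, a guaranteed definition of socklen_t.

```c
#include <arpa/inet.h>   /* inet_pton(), inet_ntop(), struct in_addr */
#include <sys/socket.h>  /* AF_INET; also guarantees socklen_t */
#include <stddef.h>

/* Hypothetical helper: round-trip a dotted-quad string through
 * inet_pton() and inet_ntop(). Returns dst on success, NULL on failure. */
const char *normalize_ipv4(const char *text, char *dst, socklen_t dstlen)
{
    struct in_addr addr;

    /* inet_pton() returns 1 only when text is a valid IPv4 address. */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return NULL;

    /* The fourth argument is the socklen_t the report is about. */
    return inet_ntop(AF_INET, &addr, dst, dstlen);
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for dst.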
Notes
(0004615)
nick   
2019-10-07 15:46   
Interpretation response
------------------------
The standard states that it is not required to define the socklen_t type in this header, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the sponsor.

Rationale:
-------------
The standard permits but does not require the inclusion of the relevant headers to get socklen_t defined, and without this applications could not compile.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------

Change P222 lines 7468-7469 from

The <arpa/inet.h> header shall define the in_port_t and in_addr_t types as described in <netinet/in.h>.

to

The <arpa/inet.h> header shall define the in_port_t and in_addr_t types as described in <netinet/in.h> and the socklen_t type as defined in <sys/socket.h>.
(0004685)
agadmin   
2019-12-12 17:32   
Interpretation proposed: 12 Dec 2019
(0004690)
dennisw   
2019-12-18 18:59   
I just noticed that the same issue was already reported in 0000606. However that bug was tagged for issue8 which is why it was not fixed in a TC.
I think 0000606 should be closed as a duplicate of this bug.
(0004729)
agadmin   
2020-01-14 16:23   
Interpretation Approved: 14 Jan 2020




Viewing Issue Advanced Details
1289 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Clarification Requested 2019-09-27 19:17 2019-11-21 14:59
joelsherrill
 
normal  
Applied  
Accepted As Marked  
   
Joel Sherrill
RTEMS.org
netdb.h
First paragraph
NA - used web
---
Note: 0004614
netdb.h - in_port_t and in_addr_t do not appear to be needed
Ref: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/netdb.h.html

In netdb.h, both in_port_t and in_addr_t are “may define” but do not appear to be needed. In discussing this, our assumption is that when this header was added to the standard, at least one implementation defined or needed these two types. They do not appear to be strictly needed.
Clarification/update is requested.
Notes
(0004572)
shware_systems   
2019-09-27 19:58   
While not directly referenced, the addrinfo structure and getnameinfo() interface use the incomplete sockaddr type, which in practice will be completed with members of those types for IP4 and IP6 sockets. The current wording, I strongly suspect, comes from known implementations all doing a #include of another header that declared these types and completed sockaddr before any of the new declarations and prototypes, and leaves open that some implementation may choose not to do it this way. The types are required, however, as sockaddr is required to support IP4, at least.
(0004614)
geoffclare   
2019-10-07 15:22   
On page 302 line 10208 section <netdb.h>, delete:
The <netdb.h> header may define the in_port_t type and the in_addr_t type as described in <netinet/in.h>.




Viewing Issue Advanced Details
1288 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Objection Error 2019-09-26 10:47 2019-11-21 14:58
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
9.3.5
186
6180
---
RE Bracket Expression item 8 should not say "rejected as an error"
XBD 9.3.5 item 8 says:
it is unspecified whether the bracket expression will be treated as a collating symbol, equivalence class, or character class, respectively; treated as a matching list expression; or rejected as an error.

The use of "or rejected as an error" here is inappropriate because this section is referenced from XCU 2.13.1 which requires that in pattern matching, a '[' which does not introduce a valid bracket expression is not an error; the '[' is treated as an ordinary character.
Change:
or rejected as an error
to:
or treated as an invalid bracket expression
There are no notes attached to this issue.




Viewing Issue Advanced Details
1287 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Clarification Requested 2019-09-17 10:30 2019-12-05 11:22
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
fnmatch()
890
30061
Approved
See Note: 0004561
fnmatch() handling of backslash in bracket expressions
The description of fnmatch() says:
If FNM_NOESCAPE is not set in flags, a <backslash> character in pattern followed by any other character shall match that second character in string.
Since this describes the handling of backslash independently of the description of backslash as a pattern matching special character in 2.13.1, it is not clear whether the requirement (via the reference to XBD 9.3.5 in 2.13.1) that backslash loses its special meaning within a bracket expression applies here.

However, fnmatch() was invented by the original POSIX.2 developers as a way for C programs to perform the same pattern matching that utilities perform (as described in 2.13) and therefore it is almost certain that the intention was for this paragraph to mean that when FNM_NOESCAPE is not set, fnmatch() handles backslash as described in 2.13.1 and when FNM_NOESCAPE is set, backslash is instead treated as a literal character.

I tested a few implementations: Solaris and HP-UX behave as per 2.13.1, glibc and macOS do not.
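
The kind of probe that separates the two behaviours can be sketched as follows. The helper name matches() is hypothetical; the comments state only the results that follow directly from the text quoted above, and the bracket-expression case is deliberately left without an expected value because it is the point in dispute.

```c
#include <fnmatch.h>

/* Hypothetical helper: 1 if string matches pattern under flags, else 0. */
int matches(const char *pattern, const char *string, int flags)
{
    return fnmatch(pattern, string, flags) == 0;
}

/* Outside a bracket expression the behaviour is not in question:
 *   matches("\\*", "*", 0)                 is 1: the backslash escapes '*'
 *   matches("\\*", "abc", 0)               is 0: only a literal '*' matches
 *   matches("\\*", "\\abc", FNM_NOESCAPE)  is 1: backslash is literal,
 *                                                '*' stays special
 *
 * The disputed case is the pattern [\a] applied to a lone backslash:
 *   matches("[\\a]", "\\", 0)
 * Read as in 2.13.1 (backslash is not special inside the brackets), the
 * set contains backslash and 'a', so the call would return 1. Read as an
 * escape, the set contains only 'a', so the call would return 0.
 */
```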
Since both behaviours are seen on certified systems, we should issue a "standard is unclear" interpretation to allow either behaviour for Issue 7 conformance, but tighten the requirements for Issue 8 to ensure consistency with find -name/-path and the pax pattern operand.

Change:
If FNM_NOESCAPE is not set in flags, a <backslash> character in pattern followed by any other character shall match that second character in string. In particular, "\\" shall match a <backslash> in string.
to:
If FNM_NOESCAPE is not set in flags, a <backslash> character can be used as an escape character as described in [xref to 2.13.1].

Notes
(0004558)
shware_systems   
2019-09-17 16:07   
There should be an exception for <NUL>, imo, as in:
If FNM_NOESCAPE is not set in flags:
A <backslash> character can be used to escape a following character, except for <NUL>, as described in [xref to 2.13.1].

Followed by:
A <backslash> followed by a <NUL> shall be treated as if the <backslash> was not provided and terminates the pattern.

or:
A <backslash> followed by a <NUL> shall be considered an invalid pattern, and the interface shall return with EINVAL stored in errno.

Because the shell has the alternates of double-quoting and <newline> as pattern delimiters, shell tokens not used as utility arguments may include <NUL>, which 2.13.1 has to make allowance for. C strings do not; the use of <NUL> in them as string terminator has priority, that I see. I feel it is clearer to emphasize the distinction here, rather than try to put too much into the resolution of 1284.
(0004559)
geoffclare   
2019-09-19 12:18   
Re: Note: 0004558 the case of an unescaped backslash preceding the NUL terminator of the string is covered by the statement at line 30063:
If pattern ends with an unescaped <backslash>, fnmatch() shall return a non-zero value (indicating either no match or an error).

However, since we are planning to change the similar statement in XCU 2.13.1 to say simply that the behaviour is unspecified, we should consider making an equivalent change here.
(0004561)
nick   
2019-09-23 15:33   
Interpretation response
------------------------

The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------

The intention has always been that fnmatch() (with zero flags argument) should implement pattern matching as described in XCU 2.13.1 and 2.13.2. However, the separate description of backslash handling for fnmatch() makes this unclear as regards the requirements for backslash inside bracket expressions.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Make the changes in the Desired Action and also change (P890 L30063):
    
If pattern ends with an unescaped <backslash>, fnmatch( ) shall return a non-zero value (indicating either no match or an error).


to:
    
If pattern ends with an unescaped <backslash>, the behavior is unspecified.
(0004596)
agadmin   
2019-10-07 15:15   
Interpretation proposed: 7 October 2019
(0004642)
agadmin   
2019-11-11 12:19   
Interpretation Approved: 11 Nov 2019




Viewing Issue Advanced Details
1286 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2019-09-14 07:59 2019-11-21 14:56
stephane
 
normal  
Applied  
Accepted  
   
Stephane Chazelas
renice utility
3194 (in the 2018 edition)
107129, 107130
---
positive increments *increase* the nice value in renice
> Positive increment values shall cause a lower nice value. Negative increment values
> may require appropriate privileges and shall cause a higher nice value.

No. Positive increments increase the nice value (decrease the priority) and negative increments cause a lower nice value (higher priority).
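
The direction can be seen from C with getpriority() and setpriority(), which operate on the same nice value. This is only a sketch: adjust_nice() is a hypothetical helper, not how any renice implementation is known to be written.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical helper: apply increment to the calling process's nice value,
 * roughly what "renice -n increment" does for a single process.
 * Returns 0 on success, -1 on failure with errno set. */
int adjust_nice(int increment)
{
    /* getpriority() can legitimately return -1, so errno must be checked. */
    errno = 0;
    int current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -1;

    /* A positive increment yields a higher nice value (lower priority);
     * a negative one may need appropriate privileges. */
    return setpriority(PRIO_PROCESS, 0, current + increment);
}
```

Calling adjust_nice(1) as an unprivileged process is expected to succeed; adjust_nice(-1) may fail for lack of privileges.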

The bug was introduced when the spec changed back from using *system scheduling priorities* to *nice value* between SUSv2 and SUSv3

Compare:

https://pubs.opengroup.org/onlinepubs/007908799/xcu/renice.html
https://pubs.opengroup.org/onlinepubs/009695399/utilities/renice.html
Trim those 2 lines to only:

"Negative increment values may require appropriate privileges."

(no need to say that positive increments increase).
There are no notes attached to this issue.




Viewing Issue Advanced Details
1285 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-09-13 04:38 2019-11-21 14:55
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
Individual
trap
2420
77484-5
---
Note: 0004570
There should be a line-break before the 2nd trap in the synopsis
My reference at hand shows this in the synopsis for trap:

    trap n [condition...]trap [action condition...]

I was confused before I realized that's 2 different forms of invocation.
Add a line break after ``[condition...]'' and before ``trap''
Notes
(0004557)
geoffclare   
2019-09-13 08:20   
This is actually a bug in the troff source. Its effect is more noticeable in the HTML translation, but can also be seen in the PDF if you look closely at the line spacing compared to other utilities with multiple synopsis lines (such as export, readonly, and set among the built-ins). These all have a .P between the synopsis lines but this is missing for trap.
(0004570)
geoffclare   
2019-09-26 15:40   
Add a paragraph break (troff .P) between the two synopsis lines.




Viewing Issue Advanced Details
1284 [1003.1(2016)/Issue7+TC2] System Interfaces Comment Enhancement Request 2019-09-05 08:33 2019-11-21 14:53
dannyniu
 
normal  
Applied  
Accepted  
   
DannyNiu/NJF
exec
790-791
26830-1, 26836
---
The sense of "checksum" test is too narrow.
In the APPLICATION USAGE section of exec* functions, there are a few mentions of the "checksum" test.

In cryptography, a checksum test refers specifically to using a hash function, which may or may not be of cryptographic grade, to generate a digest and perform a comparison. A better alternative for "checksum test" would be "integrity test", which can potentially include tests involving digital signature public-key verification, thus providing authenticity and non-repudiation.
Change all instances of "checksum test(s)" to "integrity test(s)".
There are no notes attached to this issue.




Viewing Issue Advanced Details
1283 [1003.1(2016)/Issue7+TC2] System Interfaces Objection Clarification Requested 2019-09-04 09:16 2019-12-05 11:20
geoffclare
 
normal  
Applied  
Accepted As Marked  
   
Geoff Clare
The Open Group
chmod()
665
22780
Approved
See Note: 0004592.
should chmod() ignore file type bits in st_mode
The description of chmod() says:
The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI] and the file permission bits of the file named by the pathname pointed to by the path argument to the corresponding bits in the mode argument.

The way this is worded implies that only the specified bits in the mode argument are examined by chmod(), and it should ignore other bits. However, there is also a "may fail" EINVAL error:
The value of the mode argument is invalid.

whose presence could be seen as casting doubt on this interpretation. Alternatively, perhaps this EINVAL error is there so that non-XSI implementations that don't support S_ISVTX can fail the chmod() if an application attempts to set that bit.

Another consideration is whether any implementations support additional settable bits as an extension. If such extensions are allowed, the question becomes whether chmod() should ignore the bits used to encode the file type (and any other "read-only" bits that can be set in st_mode).

This affects code which modifies file permissions by obtaining the st_mode value for a file and setting or clearing some permission bits in the value before passing it to chmod(). For example:
stat(file, &sbuf); chmod(file, sbuf.st_mode | S_IRWXU);
If chmod() is supposed to ignore file-type (and other read-only) st_mode bits then this is safe, since any other bits outside 07777 that are set in st_mode would correspond to additional settable bits supported as an extension.

I have some code which does something like this and it has worked fine for many years on many systems. However, recently a system failed the chmod() with EINVAL.

Is this code supposed to be portable, or is it relying on implementations choosing not to fail with EINVAL if read-only st_mode bits are set in the requested mode?
OPTION 1

On page 665 line 22779 section chmod(), change:
The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI] and the file permission bits of the file named by the pathname pointed to by the path argument to the corresponding bits in the mode argument. The application shall ensure that the effective user ID of the process matches the owner of the file or the process has appropriate privileges in order to do this.
to:
The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI] and the file permission bits of the file named by the pathname pointed to by the path argument to the corresponding bits in the mode argument. If any bits that can be set in the st_mode value returned by stat() but cannot be changed using chmod(), such as the bits that are used to encode the file type, are set in the mode argument, these read-only st_mode bits shall be ignored.

If the effective user ID of the process does not match the owner of the file and the process does not have appropriate privileges, the chmod() function shall fail.

On page 666 line 22834 section chmod(), change:
The value of the mode argument is invalid.
to:
The value of the mode argument, ignoring read-only st_mode bits (see the DESCRIPTION), is invalid.

On page 667 line 22866 section chmod() add a new example:
Modifying File Permissions

The following example adds group write permission to the existing permission bits for a file if that bit is not already set.
#include <sys/stat.h>

struct stat sbuf;
...
if (stat(path, &sbuf) == 0 && (sbuf.st_mode & S_IWGRP) == 0)
    chmod(path, sbuf.st_mode | S_IWGRP);


OPTION 2

On page 665 line 22779 section chmod(), change:
The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI] and the file permission bits of the file named by the pathname pointed to by the path argument to the corresponding bits in the mode argument. The application shall ensure that the effective user ID of the process matches the owner of the file or the process has appropriate privileges in order to do this.
to:
The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI] and the file permission bits of the file named by the pathname pointed to by the path argument to the corresponding bits in the mode argument. If any other bits are set in the mode argument, the chmod() function may treat the mode argument value as invalid; if the implementation does not support the XSI option, setting the 01000 bit may also cause the mode value to be treated as invalid (01000 is the value of S_ISVTX on implementations that support the XSI option).

If the effective user ID of the process does not match the owner of the file and the process does not have appropriate privileges, the chmod() function shall fail.

On page 667 line 22866 section chmod() add a new example:
Modifying File Permissions

The following example adds group write permission to the existing permission bits for a file if that bit is not already set.
#include <sys/stat.h>

struct stat sbuf;
...
if (stat(path, &sbuf) == 0 && (sbuf.st_mode & S_IWGRP) == 0)
    chmod(path, (sbuf.st_mode & 07777) | S_IWGRP);

On page 667 line 22878 section chmod() add a new first paragraph to APPLICATION USAGE:
When adding or removing permission bits in a file's mode, the st_mode value obtained from, for example, stat() should be masked with 07777 in order to ensure an [EINVAL] error does not occur.
Notes
(0004547)
shware_systems   
2019-09-04 17:10   
This was discussed years ago, due to confusion I had about encoding device types in st_mode. Having additional bits in st_mode is not a conforming extension, to keep it compatible with type short, though it's arguable the ISVTX bit may have a different label and function on non-XSI systems. The means of extending stat left open is to define a new field, not add bits. Then set/test macros that take the entire struct as argument and not the mode field alone, are to be defined, similar to the test for being TYM or SHM, to hide non-portable naming.

I believe the reason for EINVAL being may fail is some device types may not support the particular function a bit designates; for example, a read only FIFO may consider it invalid to attempt to set any W or X permission bits.
(0004548)
kre   
2019-09-04 22:40   
Re bugnote: 4547

There is nothing in posix (nor in some implementations) that suggests that
a mode_t should be 16 bits (short). Adding bits is entirely possible (and
needed to define new file types, if not so much for more permissions, which
would be hard to do in a portable manner).

Re the proposed resolution: I don't much like using 07777 as a "magic"
number, even though the values of all the relevant bits are defined, and
fit (exactly) into that value. Nor am I convinced that giving in to
whatever implementation returned EINVAL is necessarily the correct thing
to do.

What would seem more reasonable to me would be to require implementations
accept either all 0's in the "other" bits, or a value that exactly matches
the current settings of those bits, and is permitted to return EINVAL only
in cases where the application has attempted to use chmod to alter the state
of those bits (if it isn't already required somewhere, I'd also add, somewhere,
not related to chmod, a requirement that it is invalid to have all the other
bits 0 for any existing file, whatever its type). EINVAL in cases where an
attempt is made to set permissions that are impossible for the file in question
is OK as well of course.
(0004549)
kre   
2019-09-04 23:03   
OPTION 3:

On page 665 line 22779 section chmod(), change:
    [the same words as in options 1 & 2]
to
    The chmod() function shall change S_ISUID, S_ISGID, [XSI]S_ISVTX,[/XSI]
    and the file permission bits of the file named by the pathname pointed to
    by the path argument to the corresponding bits in the mode argument.
    Any other bits set in the mode argument that differ from the value that
    would be returned in st_mode value returned by stat() of the same path,
    may be treated as an error.

    If the effective user ID of the process does not match the owner of the
    file and the process does not have appropriate privileges, the chmod()
    function shall fail.


On page 666 line 22834 section chmod(), change nothing.

Add the same new example on page 667 line 22866 as is shown for OPTION 1.
(0004550)
shware_systems   
2019-09-05 01:48   
Re: 4548
It is not that it has to be a short, just that what is designated for the type can fit in it and no more file types or mode bits will be added to break that possibility. I was thinking similarly to you, actually, that the intent was it might be extended, with each new file type being represented by its own bit (so testing for or excluding multiple types in one mask/compare was plausible) but this is not the case.
(0004551)
kre   
2019-09-05 03:46   
st_mode in original unix (pdp-11) was an int (16 bits). When ported to
32 bit systems a choice needed to be made .. keep it an int (now 32 bits)
preserving source code compat, or make it short (u_short) keeping binary
compat with the older version. Either choice would have been reasonable,
and it is possible both were made by different implementations. BSD systems
have mode_t as uint32_t (I think all of them, but did not check.)

Representing file types as individual bits was never going to happen, that
ship sailed in the earliest implementations, but adding more new file types
is certainly still possible (many more have been added to the original set
over the years) and any implementation is free to add as many as it likes.
With only 4 bits left available in a short, there's only room for 15 different
file types (reserving 0 as unavailable), but with wider mode_t many more
can be handled - not that I know of any systems with more than 15.

An implementation, using interfaces that are POSIX extensions could also
place other info in the st_mode field - as long as the access methods defined
by the standard still all work, I see no issue with that.

None of that is relevant to chmod() though, which only ever affects those
bottom 12 bits. I think we should keep the standard such that giving a mode
arg to chmod that is < (1<<12) and which contains nothing inappropriate for
the actual file type, should always be valid (I don't believe there's any
question about that one), but I also believe it should be OK to use the value
obtained from st_mode after a stat() while changing any of those bottom 12 bits.
That would be option 1, except I also think it reasonable to allow an
implementation to reject the mode (EINVAL) if any of the other bits from
the st_mode value were altered (so, the "ignored" in option 1 is not my
choice). Note that is "allowed" to the implementation, not required of it.
If an implementation decides to simply ignore all but the bottom 12 bits that
is OK too.
(0004552)
kre   
2019-09-05 04:04   
Just in case it is not clear, I know that the standard is not going
to require more than a 16 bit mode_t and that new file types defined
by POSIX will be defined in a way that makes the implementation more
flexible ... but for chmod() purposes, that is irrelevant - what matters
is that implementations with a wider mode_t can choose to add new file
types there - both standard and non-standard ones. Code that manipulates
the st_mode field returned from stat() needs to be aware of that. That
includes code planning on doing chmod(). There are 2 choices for such
code, either pass only the bottom 12 bits of the mode_t value as the mode
arg to chmod(), or pass the entire st_mode changing only some of the bottom
12 bits. Either of those 2 should be specified to work. Any other
manipulation must be permitted to fail (but not required to fail).
(0004553)
geoffclare   
2019-09-05 08:47   
(edited on: 2019-09-05 09:01)
Re: Note: 0004549 I can see a few problems with option 3:

* It can't just talk about the st_mode that would be returned by stat(); it needs to relate it to stat() for chmod() and for fchmodat() with flag 0, and to lstat() for fchmodat() with flag AT_SYMLINK_NOFOLLOW.
(I realise now that option 1 also has a similar problem, but it can be fixed there just by changing stat() to lstat() in "can be set ... by stat()".)

* If an application clears one or more bits in the part of st_mode that encodes the file type while leaving at least one of those bits set, I think your intention is that this is allowed to be an error, but it's not clear from the wording. Maybe this would be clearer:
If the value represented by all of the other bits in the mode argument is non-zero and differs from the value that those bits would represent in the st_mode value returned by ... of the same path, this may be treated as an error.

* There's no point changing the DESCRIPTION to narrow down the set of mode values that can be treated as an error if the EINVAL wording is left as-is, because it currently gives implementations carte blanche.

(0004592)
Don Cragun   
2019-10-03 16:04   
(edited on: 2019-10-03 16:21)
The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this. This is being referred to the sponsor.

Rationale:
-------------
The selected option matches current existing practice, but alternative implementations have been investigated.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
Implement OPTION 1 from the Desired Action, but change:
the st_mode value returned by stat() but cannot
to:
the st_mode value returned by lstat() or stat() but cannot


(0004593)
shware_systems   
2019-10-03 16:28   
Re: 4551
Per the phone discussion 2019-10-03, there is no S_IFMT value reserved as unavailable, so a total of 16 types could be encoded in it and still fit in a short. Per the Future Directions section any use by an implementation of other values, including a reservation of 0 for any purpose, in S_IFMT are non-portable. To show a stat structure does not represent any actual file, an implementation would need to define an S_TYPEISxxx macro for testing that state; an S_IFMT value of 0 does not serve this purpose.
(0004594)
eblake   
2019-10-03 16:30   
Option 3 is less like existing practice. The following example on Linux is proof that setting S_IFLNK bits in st.st_mode before calling chmod() on S_IFREG is ignored, not rejected:

$ cat file.c
#include <stdio.h>
#include <errno.h>
#include <sys/stat.h>

int main(void)
{
  struct stat st;
  lstat("b", &st);
  int i = chmod("a", st.st_mode);
  printf ("%d %d\n", i, errno);
}
$ touch a
$ ln -s a b
$ ./a.out
0 0
(0004595)
kre   
2019-10-04 21:57   
Re Note: 0004593

I have no idea why the phone call even wasted time discussing that, as it
is completely off topic for the issue at hand, but ... I agree, POSIX does not
reserve a value for *nothing* as a file type - why would it? By definition
such a thing does not exist, and so cannot be returned by any of the stat()
functions, and would be just one of many (presumably) invalid values to hand to
mknod() if posix even defined that interface, so calling that one out there
would make no sense either.

That said, I have never seen an implementation where (st_mode & S_IFMT) == 0
represented anything, and I cannot imagine a situation where an implementation
would ever do that ... at the very least, that would be the last S_IFMT value
assigned, and whenever values are being assigned from a potentially unbounded
set (ie: where we cannot prove we will never need any more) and you're reaching
the end of the previously allocated space, some method needs to be found to
add new values in another way. Since the implementation knows it is going to
need to do that, and probably soon(ish) it is far more reasonable to simply do
it now (whenever that "now" is) rather than assigning 0 and deferring the work
until later.

POSIX has recognised that, and made it clear that whatever other way that
implementations choose will be OK (the tests for file types for new file
types will never be expected to work with only st_mode as the arg, but only
with the whole structure available), so why would an implementation use
that one, never before used value, even though it might plausibly have
been used as the starting point, for regular files - but wasn't - for anything?

But as I said, all of this is off topic for chmod, which explicitly is not
touching any of those bits, and completely irrelevant to anything here.

Re Note: 0004594 .. I agree, that is the common implementation, and I certainly
would not have disallowed that. I thought it was clear, that my only point
was to allow an implementation to give an error in that case if it wants to.
That is, I see no particular benefit in allowing the st_mode from one file
being applied to another, without masking out the non-permission bits. I doubt
this is a common programming idiom.

All that said, I have no problem with the choice of option 1 if that is what
everyone believes is the best way.
(0004597)
agadmin   
2019-10-07 15:15   
Interpretation proposed: 7 October 2019
(0004643)
agadmin   
2019-11-11 12:19   
Interpretation Approved: 11 Nov 2019




Viewing Issue Advanced Details
1282 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Clarification Requested 2019-08-23 16:19 2019-08-24 21:25
joelsherrill
 
normal  
New  
Open  
   
Joel Sherrill
mqueue.h
NA - used web
NA - used web
---
mqueue.h - pthread_attr_t is listed as defined but unclear why
Ref: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/mqueue.h.html

Issue: In mqueue.h, pthread_attr_t is listed as defined but there is no need for it as best I can see in the mqueue APIs. Is this just a cut and paste mistake?

Response from Geoff Clare when asked on mailing list:

"It's needed if you want to have mq_notify() do a SIGEV_THREAD notification. However, what's odd here is that mqueue.h is only required to declare struct sigevent as incomplete, so if you want to actually populate a struct sigevent you therefore need to include signal.h to get it properly defined -- in which case signal.h will define pthread_attr_t, so there is no need for mqueue.h to do it.

So I think either mqueue.h should be required to define struct sigevent or it should not be required to define pthread_attr_t."

There are multiple ways to resolve this but I think Geoff's suggestion of not requiring mqueue.h to define pthread_attr_t seems the simplest. It also is consistent with the example provided with mq_notify() that includes pthread.h.

Notes
(0004537)
geoffclare   
2019-08-23 16:41   
The example code includes <pthread.h>, but it doesn't include <signal.h>.

If we don't change <mqueue.h> to require a complete definition of struct sigevent then we should add #include <signal.h> to the example.
(0004539)
shware_systems   
2019-08-24 21:25   
(edited on: 2019-08-24 21:34)
I think the sigevent reference as an incomplete type is so it is not material whether <mqueue.h> #include's <signal.h> or not, or which order an application may #include these headers separately. It would be a quality of implementation issue that these headers note whether the necessary types have been declared already, whichever is referenced first. If an application doesn't make use of mq_notify() it may not have to #include <signal.h> at all then.

What I see as missing, if <mqueue.h> is #include'd first, is that the sigval union should also be defined or declared, as the sigevent structure requires this type also.





Viewing Issue Advanced Details
1281 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Omission 2019-08-22 10:39 2019-11-21 14:52
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
ex
2725
89170-89214
---
ex substitute command missing statement about error
In the description of the ed substitute command there is this statement:
It is an error if the substitution fails on every addressed line.
An equivalent statement is missing from the description of the ex substitute command. (All of the implementations I tried treated it as an error, although macOS did not write an error message, it just exited with a non-zero status.)

The ed text should also be changed so that it uses "shall", and it is in an odd place in the paragraph.
On page 2686 line 87606 section ed, delete:
It is an error if the substitution fails on every addressed line.

On page 2686 line 87609 section ed, change:
The current line shall ...
to:
It shall be an error if the substitution fails on every addressed line. The current line shall ...

On page 2726 line 89178 section ex, change:
... and other special characters.
to:
... and other special characters. It shall be an error if the substitution fails on every addressed line.

There are no notes attached to this issue.




Viewing Issue Advanced Details
1280 [1003.1(2016)/Issue7+TC2] System Interfaces Editorial Clarification Requested 2019-08-16 22:45 2019-11-21 14:49
dalias
 
normal  
Applied  
Accepted As Marked  
   
Rich Felker
musl libc
utimensat/futimens
?
?
---
Note: 0004544
Error requirements with UTIME_OMIT
The DESCRIPTION specifies:

"If both tv_nsec fields are set to UTIME_OMIT, no ownership or permissions check shall be performed for the file, but other error conditions may still be detected (including [EACCES] errors related to the path prefix)."

Note the "may".

However, under ERRORS, there are "shall fail" errors that would apply under the condition where both times are UTIME_OMIT.
Clarify whether these errors are required if both times are UTIME_OMIT. Either add exceptions to the error conditions (e.g. "and not both tv_nsec fields are set to UTIME_OMIT") or replace the "may" in the description with appropriate wording not to imply that these errors are optional.

Note: Linux kernel does not detect any errors in the case where both are UTIME_OMIT. I'm not aware of what other implementations do.
Notes
(0004529)
geoffclare   
2019-08-19 08:51   
I believe the intention was to allow implementations to notice that both times specify UTIME_OMIT and return straight away without making any use of the path or fd. So the "may" in the DESCRIPTION is right and the ERRORS section needs updating.

Rather than having to repeat most errors in "shall fail" and "may fail" forms, perhaps we can do something like this in the intro to each set of errors that are currently "shall fail":
The utimes() function shall fail, the futimens() and utimensat() functions shall fail in the case that the times argument does not have both tv_nsec fields set to UTIME_OMIT, and the futimens() and utimensat() functions may fail in the case that the times argument has both tv_nsec fields set to UTIME_OMIT, if:
(0004531)
dalias   
2019-08-19 14:51   
That looks fine to me. I'm happy with any outcome for this that clarifies the intent. I only hit the issue working on tests for my implementation in conjunction with 64-bit time_t work and wasn't sure what the correct behavior should be.
(0004544)
geoffclare   
2019-08-28 09:29   
Proposed changes ...

On page 988 line 33571 section futimens(), change:
These functions shall fail if
to:
The utimes() function shall fail, the futimens() and utimensat() functions shall fail in the case that the times argument does not have both tv_nsec fields set to UTIME_OMIT, and the futimens() and utimensat() functions may fail in the case that the times argument has both tv_nsec fields set to UTIME_OMIT, if

On page 988 line 33585 section futimens(), change:
The futimens() function shall fail if
to:
The futimens() function shall fail in the case that the times argument does not have both tv_nsec fields set to UTIME_OMIT, and may fail in the case that the times argument has both tv_nsec fields set to UTIME_OMIT, if

On page 988 line 33587 section futimens(), change:
The utimensat() function shall fail if
to:
The utimensat() function shall fail in the case that the times argument does not have both tv_nsec fields set to UTIME_OMIT, and may fail in the case that the times argument has both tv_nsec fields set to UTIME_OMIT, if

On page 988 line 33595 section futimens(), change:
The utimensat() and utimes() functions shall fail if
to:
The utimes() function shall fail, the utimensat() function shall fail in the case that the times argument does not have both tv_nsec fields set to UTIME_OMIT, and the utimensat() function may fail in the case that the times argument has both tv_nsec fields set to UTIME_OMIT, if
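
As an illustration of what the "may fail" wording means for applications, the following sketch assumes a caller that wants pathname errors reported even when both tv_nsec fields are UTIME_OMIT. Because an implementation may return success without examining the path in that case, the sketch probes the path itself with fstatat(). The helper names both_omitted and set_times_checked are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>     /* AT_FDCWD */
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>  /* utimensat(), fstatat(), UTIME_OMIT */
#include <time.h>

/* True if times requests that neither timestamp be changed. */
bool both_omitted(const struct timespec times[2])
{
    return times != NULL &&
           times[0].tv_nsec == UTIME_OMIT &&
           times[1].tv_nsec == UTIME_OMIT;
}

/* Hypothetical wrapper: returns 0 on success, -1 with errno set. */
int set_times_checked(int dirfd, const char *path,
                      const struct timespec times[2])
{
    if (both_omitted(times)) {
        /* utimensat() is allowed to succeed here without looking at
         * path, so check that the file can be reached ourselves. */
        struct stat st;
        return fstatat(dirfd, path, &st, 0);
    }
    return utimensat(dirfd, path, times, 0);
}
```

The fstatat() probe only reports errors in resolving the path; it is not a substitute for the ownership and permission checks, which the DESCRIPTION says are not performed in this case anyway.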




Viewing Issue Advanced Details
1279 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2019-08-03 22:20 2019-08-22 07:15
stephane
 
normal  
New  
Open  
   
Stephane Chazelas
Shell grammar
---
non-name=value should not be an ASSIGNMENT_WORD
The sh grammar in the spec tells us that

   var=value

is to be parsed as a:

program
 -> complete_commands
 -> complete_command
 -> list
 -> and_or
 -> pipeline
 -> pipe_sequence
 -> command
 -> simple_command
 -> cmd_prefix as an ASSIGNMENT_WORD

(assuming rule 7a is applied, missing in the spec as already
noted in 0001094)

And for:

var+=value
stéphane=foo
var[1]=value
a[0].b[c=++e].f=g
"a=b"=c
$(echo x)=d

Either:

 ...
 -> simple_command
 -> cmd_prefix as an ASSIGNMENT_WORD

Or:

 ...
 -> simple_command
 -> cmd_name as a WORD


IOW, all those examples above are described in the manual as
"simple commands" in the sh language, with no scope for
implementations to interpret them otherwise.

In all those cases, when it's ASSIGNMENT_WORD, 2.10.2 7b defers to
"2.9.1" for how an assignment is to be performed based on that
ASSIGNMENT_WORD.

Except that 2.9.1 doesn't really say that.

From a var=value ASSIGNMENT_WORD, there's nothing that says
that "var" is the name of the variable to be assigned and
"value" the value to assign to the variable. The only thing that
suggests it is the "Assignment to the name within a returned
ASSIGNMENT_WORD token" in 2.10.2/7b. While that's easy to guess
for "var=value", that's less so for the other examples
above. If anything 7b would say that in var+=value, the "name"
of the variable is "var+".

Those examples should make it obvious that while they are (for
some of them) syntax in the bash/ksh93/zsh languages, they are
not in the sh language. The sh grammar should not identify those
as sh simple commands or assignments.

At best, things like var+=value or var[0]=value should be
*allowed* to be interpreted as the "var+=value" command (like
many sh implementations do), but not *required* to as some
shells like ksh/bash/zsh interpret them as something else,
and certainly *cannot* be interpreted as POSIX sh variable
assignments as those are not valid sh variable names.

Note: another bug report will follow to address
https://www.mail-archive.com/austin-group-l@opengroup.org/msg04563.html
(0001276 and this one are preamble to that).
First, apply the 0001094 resolution: append a /* Apply rule
7a */ to the first occurrence of ASSIGNMENT_WORD in the
cmd_prefix production, and /* Apply rule 7b */ to the second one
(7a would also work as there's no reserved word that can be
mistaken for an assignment).

In 7b, ASSIGNMENT_WORDs should only be returned for
var=anything tokens (where "var", before quote removal and before
expansion is a valid "name"). For other TOKENs that contain an
unquoted, not-part-of-expansion equals sign, we should make sure
that no grammar production that references rule 7 would
succeed/match, for instance, by saying that the TOKEN token, or
maybe a new one called UNSPECIFIED to make it clearer shall be
returned.

For instance, change 2.10.2/7 (here including a resolution of
0001276) to:


> 7. [Assignment preceding command name]
>
> a. [When the first word]
>
> If the TOKEN is exactly a reserved word, the token identifier for that reserved word shall result. Otherwise, 7b shall be applied.
>
> b. [Not the first word]
>
> If the TOKEN contains an unquoted (as determined while applying rule 4 from Token Recognition) <equals-sign> character that is not
> part of an embedded parameter expansion, command substitution, or arithmetic expansion construct (as determined while applying rule
> 5 from Token Recognition):
>
> • If the TOKEN begins with '=', then the token WORD shall be returned.
>
> • If all the characters in the TOKEN preceding the first such <equals-sign> form a valid name (see XBD Name), the token
> ASSIGNMENT_WORD shall be returned.
>
> • Otherwise, it is unspecified whether the WORD or UNSPECIFIED token is returned.
>
> Otherwise, the token WORD shall be returned.
>
> Assignment to the name within a returned ASSIGNMENT_WORD token shall occur as specified in Simple Commands.


And add a paragraph in 2.10.1 like:

- in the following section, some rules return an UNSPECIFIED
  token. That's a way to make it clear that the resulting token
  cannot possibly satisfy the grammar productions where the
  corresponding rule is referenced.


And then, in 2.9.1, now that an ASSIGNMENT_WORD can only be a
name=value, it's not as critical, but we may still want to
clarify that the part before the first = in the ASSIGNMENT_WORD
is the name of the variable and the part after that = is the value.
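
The classification proposed for rule 7b can be sketched in C as follows. This is a simplified illustration only: it assumes the token contains no quoting and no expansions, so the first '=' found is taken to be the first unquoted one, and it assumes an ASCII-compatible encoding for the name check. The enum and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum token_kind {
    TOK_WORD,
    TOK_ASSIGNMENT_WORD,
    TOK_UNSPECIFIED   /* third bullet: WORD or UNSPECIFIED, unspecified which */
};

static int is_name_start(char c)
{
    return c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static int is_name_char(char c)
{
    return is_name_start(c) || (c >= '0' && c <= '9');
}

/* Classify a token that is assumed to contain no quoting or expansions. */
enum token_kind classify_rule_7b(const char *token)
{
    const char *eq = strchr(token, '=');

    if (eq == NULL)
        return TOK_WORD;            /* no <equals-sign> at all */
    if (eq == token)
        return TOK_WORD;            /* token begins with '=' */
    if (!is_name_start(token[0]))
        return TOK_UNSPECIFIED;     /* prefix is not a valid name */
    for (const char *p = token + 1; p < eq; p++) {
        if (!is_name_char(*p))
            return TOK_UNSPECIFIED;
    }
    return TOK_ASSIGNMENT_WORD;     /* name=anything */
}
```

Under these assumptions var=value is classified as an ASSIGNMENT_WORD, while var+=value and var[1]=value fall into the unspecified case because "var+" and "var[1]" are not valid names.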
Notes
(0004533)
geoffclare   
2019-08-21 11:29   
I don't like the suggestion of making it completely unspecified how non-name=... is parsed. All of the examples you give are things that I would naturally expect to be parsed as some kind of assignment if they are not treated as a cmd_word. If they then can't be processed as a valid assignment, this would produce an assignment error (rather than a syntax error).

So the way I would prefer to handle this is to change the text in 7b from:
Assignment to the name within a returned ASSIGNMENT_WORD token shall occur as specified in [xref to 2.9.1].
to something like:
If a returned ASSIGNMENT_WORD token begins with a valid name, assignment of the value after the first <equals-sign> to the name shall occur as specified in [xref to 2.9.1]. If a returned ASSIGNMENT_WORD token does not begin with a valid name, either an unspecified form of assignment shall be performed (for example, assignment to an array element in shells that support array variables as an extension) or a variable assignment error shall occur; see [xref to 2.8.1] for the consequences of these errors.
(0004535)
stephane   
2019-08-22 07:15   
I just realised 0000351 (about [command [-p]] export/readonly treating what looks like ASSIGNMENT_WORD specially) should also be extended here:

$ touch a0=bar
$ dash -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
a=, a0=bar
$ yash -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
a=, a0=bar
$ ksh -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
a=bar, a0=
$ mksh -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
a=bar, a0=
$ zsh --emulate sh -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
a=bar, a0=
$ bash -c 'export a[0]=bar; printf "%s\n" "a=$a, a0=$a0"'
bash: line 0: export: `a[0]': not a valid identifier
a=, a0=


Those ksh/mksh/zsh/bash don't do globbing there.

They do globbing in:

export "$(echo a)"[0]=bar
or
export *=bar

Maybe that should be handled in a separate bug, maybe the same bug that would address a[foo + bar] tokenisation (https://www.mail-archive.com/austin-group-l%40opengroup.org/msg04563.html), which I said above I would raise when I have the time, as it's the same issue here.

In any case, we should not return ASSIGNMENT_WORD in things like $a=value, *=value, "foo"=bar, as those are all treated as WORD in all export implementations.

I was about to say: "maybe we should change rule 7 here to say that if the part left of the first unquoted = contains quoting or expansion operators then a WORD (as opposed to ASSIGNMENT_WORD or UNSPECIFIED) shall result", but that would not address export a["$var"]=foo.




Viewing Issue Advanced Details
1278 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-08-02 06:48 2019-11-20 16:20
dannyniu
 
normal  
Applied  
Accepted As Marked  
   
DannyNiu/NJF
Individual
pax
---
Note: 0004506
Problems with -n flag in the specification of pax.
In Issue 7-ed2018, there's some dubious specification about the -n flag.

There's an -n flag present in the copy mode synopsis for pax, and absent in the write mode.

However, after all the flags have been discussed in the OPTIONS section, there is a mention of the -n flag's behavior in read and write modes.
Clarify which modes the -n flag can be applied to, and its behavior in these modes.

Correct SYNOPSIS section for pax.
Notes
(0004506)
geoffclare   
2019-08-02 08:17   
The problem with -n in the copy mode synopsis was raised in bug 0001270.

Proposed change to correct the additional problem of -n being mentioned in relation to write mode...

On page 3076 line 102590 section pax, change:
In write mode, the files shall be selected based on the user-specified pathnames as modified by the −n and −u options
to:
In write mode, the files shall be selected based on the user-specified pathnames as modified by the −u option.






Viewing Issue Advanced Details
1277 [1003.1(2016)/Issue7+TC2] Shell and Utilities Editorial Error 2019-07-31 14:33 2019-11-20 16:17
geoffclare
 
normal  
Applied  
Accepted  
   
Geoff Clare
The Open Group
awk
2491
80131
---
Use of <slash> in an ERE in awk
The standard says:

    Using a <slash> character within an ERE requires the escaping
    shown in the following table.

but this is only true when the ERE is in the lexical token ERE.
In an ERE in a string, the opposite is true: escaping must not be used.
(The resolution of bug 0001105 clarifies this by making the meaning
of <backslash><slash> in an ERE in a string undefined.)
Change:
Using a <slash> character within an ERE requires the escaping shown in the following table.
to:
Using a <slash> character within the lexical token ERE (except as one of the two delimiters) requires the escaping shown in the following table.
There are no notes attached to this issue.




Viewing Issue Advanced Details
1276 [1003.1(2013)/Issue7+TC1] Shell and Utilities Objection Error 2019-07-30 13:11 2019-08-22 06:37
stephane
 
normal  
New  
Open  
   
Stephane Chazelas
2.10.2 shell grammar rules
---
incorrect resolution in 0000839
As already raised in 0001094 and 0001100 (though the 0000839 origin had not been identified then), now closed as rejected, the resolution of 839 broke rule 7.

Now 7a becomes redundant, and 7b is no longer useful to qualify the difference between a cmd_name and cmd_word.

The difference as seen in earlier versions of the spec was to specify that keywords are not to be recognised as such when following redirections or assignments. With the 839 change, a "a=1 for bar" can no longer be parsed as a simple command as rule 7b now defers to rule 1 which means "for" does not give a "WORD" token any more. That defeats the point of having a cmd_word vs cmd_name distinction.

If the point of 839 was to allow implementations to have keywords that contain = characters, then only 7a should have been modified to say: "if the token is a reserved word, return the token for that reserved word, otherwise apply 7b".
Either undo the change for 839 (except for the "else" part) or go with the simpler/clearer grammar approach (at least when it comes to this particular issue) suggested in 0001100 or change the whole rule 7 to:


    7. [Assignment preceding command name]

         a. [When the first word]

            If the TOKEN is exactly a reserved word, the token identifier for that reserved word shall result. Otherwise, 7b shall be applied.

         b. [Not the first word]

            If the TOKEN contains an unquoted (as determined while applying rule 4 from Token Recognition) <equals-sign> character that is not
            part of an embedded parameter expansion, command substitution, or arithmetic expansion construct (as determined while applying rule
            5 from Token Recognition):

               • If the TOKEN begins with '=', then the token WORD shall be returned.

               • If all the characters in the TOKEN preceding the first such <equals-sign> form a valid name (see XBD Name), the token
                 ASSIGNMENT_WORD shall be returned.

               • Otherwise, it is unspecified whether the WORD or ASSIGNMENT_WORD is returned.

            Otherwise, the token WORD shall be returned.

       Assignment to the name within a returned ASSIGNMENT_WORD token shall occur as specified in Simple Commands.
Notes
(0004501)
stephane   
2019-07-30 13:16   
Note that rule 8 (used in the fname production) defers to 7, but doesn't really need to.

What matters there is that a NAME token be not returned when the token doesn't form a valid variable name, so whether rule 7 classifies it as WORD or ASSIGNMENT_WORD doesn't really matter as long as it's not the NAME token
(0004508)
stephane   
2019-08-05 14:23   
We may also want to rename "cmd_name" to something else as it's potentially misleading.

In

cmd arg

cmd is the WORD token identified as "cmd_name"

In


var=value < file cmd arg

cmd is identified as "cmd_word".

In those two examples, "cmd" is the "name of the command" being executed, but neither cmd_word nor cmd_name have to be the command's name like in $(echo cmd arg1) arg2 where the cmd_name is $(echo cmd arg1) but the command's name is "cmd" (assuming the default value of $IFS) or dryrun=; $dryrun cmd arg where cmd_name is $dryrun but the command name is "cmd".

The distinction between cmd_name and cmd_word here is about the token having different constraints when it's preceded by redirections/assignments and not (namely whether keywords are allowed).

Maybe "cmd_word_no_keyword" would be a better wording for "cmd_name".
(0004509)
stephane   
2019-08-05 14:23   
One could also argue that forcing shells to interpret keywords as WORDs when preceded by assignments/redirections is not particularly useful.

Nobody's going to write:

foo=bar for arg

And expect that "for" to be looked up in $PATH.

On the other hand, a shell implementation may want to allow:

2> /dev/null [[ $a -eq $b ]]

Or

TIMEFMT=3 time cmd
...

Which would help with consistency, but is currently not allowed by POSIX as POSIX requires those to be interpreted as simple commands.
(0004530)
geoffclare   
2019-08-19 10:05   
(edited on: 2019-08-19 10:12)
The new wording for rule 7 suggested in the desired action looks good to me (but noting it has further changes proposed in bug 0001279), although it is missing the word "token" in the third bullet item, which should be:
Otherwise, it is unspecified whether the token WORD or ASSIGNMENT_WORD is returned.


Re Note: 0004508 either I'm confused or you have the new names the wrong way round - cmd_word is the one that can't be a keyword. So a new name for cmd_name should imply that it can be a keyword, not that it can't.

Re Note: 0004509 how would the rule 7 wording in the desired action need to be changed if we want to allow either behaviour?

(0004532)
stephane   
2019-08-20 18:34   
Re: Note: 0004530

About Note: 0004508, yes sorry my bad.

About Note: 0004509

That could be: change:

> simple_command : cmd_prefix cmd_word cmd_suffix
> | cmd_prefix cmd_word
> | cmd_prefix
> | cmd_name cmd_suffix
> | cmd_name
> ;
> cmd_name : WORD /* Apply rule 7a */
> ;
> cmd_word : WORD /* Apply rule 7b */
> ;
> cmd_prefix : io_redirect
> | cmd_prefix io_redirect
> | ASSIGNMENT_WORD
> | cmd_prefix ASSIGNMENT_WORD

to (including 0001094 resolution (also included in 0001279)):

> simple_command : cmd_prefix cmd_word cmd_suffix
> | cmd_prefix cmd_word
> | cmd_prefix
> | cmd_word cmd_suffix
> | cmd_word
> ;
> cmd_word : WORD /* Apply rule 7 */
> ;
> cmd_prefix : io_redirect
> | cmd_prefix io_redirect
> | ASSIGNMENT_WORD /* Apply rule 7 */
> | cmd_prefix ASSIGNMENT_WORD /* Apply rule 7 */
> cmd_suffix : io_redirect
> | cmd_suffix io_redirect
> | WORD
> | cmd_suffix WORD

and rule 7 to:

> 7. [Assignment preceding command name]
>
> * If the TOKEN is exactly a reserved word, the token
> identifier for that reserved word shall result.
>
> * Otherwise
>
> If the TOKEN contains an unquoted (as determined
> while applying rule 4 from Token Recognition)
> <equals-sign> character that is not part of an
> embedded parameter expansion, command substitution,
> or arithmetic expansion construct (as determined
> while applying rule 5 from Token Recognition):
>
> • If the TOKEN begins with '=', then the token
> WORD shall be returned.
>
> • If all the characters in the TOKEN preceding
> the first such <equals-sign> form a valid name
> (see XBD Name), the token ASSIGNMENT_WORD shall
> be returned.
>
> • Otherwise, it is unspecified whether the WORD
> or ASSIGNMENT_WORD is returned.
>
> * Otherwise, the token WORD shall be returned.
>
> Assignment to the name within a returned ASSIGNMENT_WORD
> token shall occur as specified in Simple Commands.

And remove the "Otherwise, rule 7 applies" from rule 8 which doesn't make much sense (rule 7 will never yield a NAME token). Or replace with "Otherwise, an UNSPECIFIED token is returned" (see 0001279).


That would specify that "foo=bar for arg" is not any more a valid POSIX sh "simple command" than "for arg".

And foo=bar() { blah; } would still not be a valid sh function declaration, as foo=bar would still not be seen as a NAME token.
(0004534)
stephane   
2019-08-22 06:37   
Note that 0000351 (about [command [-p]] export/readonly treating what looks like ASSIGNMENT_WORD specially) is related, but it doesn't seem like the proposed resolutions here would affect it.

That bug should be kept in mind when touching rule 7.




Viewing Issue Advanced Details
1275 [1003.1(2016)/Issue7+TC2] Shell and Utilities Objection Error 2019-07-30 10:17 2019-07-30 14:31
geoffclare
 
normal  
New  
Open  
   
Geoff Clare
The Open Group
2.13.3
2384
76271
---
pathname expansion errors
In the description of patterns used for pathname expansion, the standard
states:
Specified patterns shall be matched against existing filenames and pathnames, as appropriate. Each component that contains a pattern character shall require read permission in the directory containing that component. Any component, except the last, that does not contain a pattern character shall require search permission.
However, it does not say what should happen if these permissions are
denied.

The rules about error handling in utilities (1.4) and in the shell (2.8.1)
seem to imply that the shell should report it as an error and a
non-interactive shell should exit, but this is not what existing shells
do. They instead treat it as a successful "no match" condition.

There are other error conditions that should also be treated the same
way, such as ELOOP, ENAMETOOLONG, ENOENT and ENOTDIR, since they imply
that there are no matches (that are accessible to the process).

For other conditions such as EMFILE, ENFILE, EIO and EOVERFLOW, there
are opposing views as to how they should be handled. (For details
refer to the email thread with subject "Re: [1003.1(2016)/Issue7+TC2
0001255]: improper shell code in glob() example" starting on Jul 21.)
Thus the suggested changes have two options, one which requires that
the shell treats these as an error and the other making it unspecified.

Changes are also proposed for glob(), since this is intended to mimic
how the shell does pathname expansion (when GLOB_NOCHECK is used).
However, these changes may need tweaking in the light of bug 0001273.
(In particular, should glob() always ignore ELOOP, ENAMETOOLONG, ENOENT
and ENOTDIR like shells do?)
On page 2384 line 76274 section 2.13.3, after:
Each component that contains a pattern character shall require read permission in the directory containing that component. Any component, except the last, that does not contain a pattern character shall require search permission.
add the sentence:
If these permissions are denied, or if an attempt to open or search a directory fails because of an error condition that is related to file system contents, this shall not be considered an error and pathname expansion shall continue as if the directory had existed and had been successfully opened or searched, and no matching directory entries had been found in it.

For OPTION1 also add the sentence:
Other error conditions shall cause pathname expansion to fail.

For OPTION2 also add the sentence:
For other error conditions it is unspecified whether pathname expansion fails or they are treated the same as when permission is denied.

Cross-volume changes to XRAT ...

On page 3749 line 128728 section C.2.13.3, after:
Historical systems have varied in their permissions requirements. To match f*/bar has required read permissions on the f* directories in the System V shell, but the Shell and Utilities volume of POSIX.1-2017, the C shell, and KornShell require only search permissions.
add to the paragraph:
If read or search permission is denied, shells do not report an error but treat this as a successful "no match" condition. Error conditions that are related to file system contents and occur when attempting to read or search a directory are also required to be treated the same way because they imply that there are no matches (that are accessible to the process). For example, if the pattern is foo/*bar and attempting to open the directory foo fails because it does not exist or is not a directory, then there can be no matching pathnames. The error conditions listed in [xref to XSH 2.3] that are related to file system contents and could occur when attempting to open or search a directory are [EACCES], [ELOOP], [ENAMETOOLONG], [ENOENT] and [ENOTDIR]. Error conditions that are not related to file system contents or which occur when reading a directory, notably [EMFILE] and [ENFILE] but also things like [EIO], [ENOMEM] and [EOVERFLOW],

For OPTION1 also add:
are treated as errors because to do otherwise would mean the shell could execute a command with an unchanged pattern when pathnames matching the pattern exist. Implementations which encounter non-standard error conditions should handle them appropriately according to whether or not they are related to file system contents.

For OPTION2 also add:
can either be treated as errors or be treated the same way as when permission is denied. Treating them as errors is seen as desirable, because to do otherwise would mean the shell could execute a command with an unchanged pattern when pathnames matching the pattern exist, but it is not historical practice. Implementations that handle the two categories of error differently should also handle non-standard error conditions appropriately, if encountered, depending on which category they fit into.

Cross-volume changes to XSH ...

On page 1109 line 37542 section glob(), change GLOB_ERR from:
Cause glob() to return when it encounters a directory that it cannot open or read. Ordinarily, glob() continues to find matches.
to:
Cause glob() to return when an attempt to open, read or search a directory fails because of an error condition that is related to file system contents. If this flag is not set, glob() shall not treat such conditions as an error, and shall continue to look for matches.

For OPTION2 also add:
Other error conditions may also be treated the same way as error conditions that are related to file system contents.

On page 1110 line 37568 section glob(), change:
If, during the search, a directory is encountered that cannot be opened or read ...
to:
If, during the search, an attempt to open, read or search a directory fails ...

On page 1110 line 37574 section glob(), change:
If (*errfunc()) is called and returns non-zero, or if the GLOB_ERR flag is set in flags, glob() shall stop ...
to:
OPTION1: If (*errfunc()) is called and returns non-zero, or if errfunc is a null pointer and the attempt failed because of an error condition that is not related to file system contents, or if the GLOB_ERR flag is set in flags, glob() shall stop ...

OPTION2: If (*errfunc()) is called and returns non-zero, or, optionally, if errfunc is a null pointer and the attempt failed because of an error condition that is not related to file system contents, or if the GLOB_ERR flag is set in flags, glob() shall stop ...

On page 1111 line 37589 section glob(), change GLOB_ABORTED from:
The scan was stopped because GLOB_ERR was set or (*errfunc()) returned non-zero.
to:
OPTION1: The scan was stopped because (*errfunc()) was called and returned non-zero, or errfunc was a null pointer and an attempt to open, read or search a directory failed because of an error condition that is not related to file system contents, or GLOB_ERR was set.

OPTION2: The scan was stopped because (*errfunc()) was called and returned non-zero, or, optionally, errfunc was a null pointer and an attempt to open, read or search a directory failed because of an error condition that is not related to file system contents, or GLOB_ERR was set.

After applying bug 1255, on page 1111 line 37604 section glob(), change:
    glob("*.c", GLOB_DOOFFS|GLOB_NOCHECK|GLOB_ERR, NULL, &globbuf);
to:
    glob("*.c", GLOB_DOOFFS|GLOB_NOCHECK, NULL, &globbuf);

After applying bug 1255, on page 1111 line 37612 section glob(), change:
    glob("*.c", GLOB_DOOFFS|GLOB_NOCHECK|GLOB_ERR, NULL, &globbuf);
    glob("*.h", GLOB_DOOFFS|GLOB_NOCHECK|GLOB_ERR|GLOB_APPEND, NULL, &globbuf);
to:
    glob("*.c", GLOB_DOOFFS|GLOB_NOCHECK, NULL, &globbuf);
    glob("*.h", GLOB_DOOFFS|GLOB_NOCHECK|GLOB_APPEND, NULL, &globbuf);

On page 1112 line 37630 section glob(), add a paragraph to APPLICATION USAGE:
It is recommended that (*errfunc()) should always return a non-zero value if the eerrno parameter indicates an error condition that is not related to file system contents. See [xref to C.2.13.3] for information about which error conditions are related to file system contents.
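
A minimal sketch of an error function following the recommendation above, assuming the set of conditions "related to file system contents" is exactly the five listed in the proposed XRAT text ([EACCES], [ELOOP], [ENAMETOOLONG], [ENOENT] and [ENOTDIR]). The names fs_content_error, glob_errfunc and expand_like_shell are hypothetical.

```c
#include <errno.h>
#include <glob.h>

/* Non-zero if err implies "no accessible matches" rather than a
 * failure of the expansion itself. */
int fs_content_error(int err)
{
    switch (err) {
    case EACCES:
    case ELOOP:
    case ENAMETOOLONG:
    case ENOENT:
    case ENOTDIR:
        return 1;
    default:
        return 0;
    }
}

/* Error function for glob(): returning non-zero makes glob() stop. */
int glob_errfunc(const char *epath, int eerrno)
{
    (void)epath;
    return fs_content_error(eerrno) ? 0 : 1;
}

/* Expand pattern roughly as a shell would: an unmatched pattern is
 * returned unchanged (GLOB_NOCHECK), and only errors unrelated to
 * file system contents abort the scan. */
int expand_like_shell(const char *pattern, glob_t *out)
{
    return glob(pattern, GLOB_NOCHECK, glob_errfunc, out);
}
```

With this function, conditions such as [EMFILE] or [EIO] cause glob() to return GLOB_ABORTED instead of silently yielding the unchanged pattern. The caller is expected to call globfree() on the result.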
Notes
(0004502)
eblake   
2019-07-30 14:31   
While I personally agree with option 1 if we were designing from scratch, I feel that it represents enough invention when compared to existing practice that we are better off using option 2 rather than forcing every shell out there to comply with new requirements.




Viewing Issue Advanced Details
1274 [1003.1(2016)/Issue7+TC2] Base Definitions and Headers Editorial Omission 2019-07-28 10:42 2019-09-10 09:13
dannyniu
 
normal  
New  
Open  
   
DannyNiu/NJF
Individual
<sys/types.h> header
402-405
13652-13746
---
pid_t must fit in an int for definition of fcntl to be consistent.
As the return type of fcntl is int, and F_GETOWN returns the process (-group) receiving SIGIO and SIGURG signals, pid_t must fit into an int for the fcntl function to be consistent.

However, there's not yet any explicit mention of this requirement in the standard text.
After line 13731, insert:

the width of pid_t shall be no greater than the width of int.
Notes
(0004496)
jilles   
2019-07-28 13:49   
There is indeed an inconsistency here, but it could be resolved in various ways, depending on compatibility issues and risk of weirdnix implementations:

* Only require that actual process IDs fit in an int, without requiring anything new about the type pid_t.
* Require "the width of pid_t shall be no greater than the width of int".
* Require that the range of pid_t be equal to the range of int.
* Require that pid_t be a typedef for int.

Types smaller than int may have surprising behaviour due to integer promotion, which suggests not going out of our way to allow them.

In hindsight, it may have made more sense to define a new function instead of standardizing the existing fcntl() (compare tcgetpgrp()), but I don't think it is useful to add such a function now.
(0004497)
shware_systems   
2019-07-29 00:52   
The requirement currently is "the width of pid_t shall be no greater than the width of intmax_t.", with the phrasing of <sys/types.h>, and limiting it to int is a potential breaking change. As a future consideration for implementations this phrasing is appropriate for <sys/types.h>, from a speed, packing, or alignment perspective, and should not be modified how this bug report requests, in my opinion.

The aspect of fcntl() that deals with this currently is that it is required to set EOVERFLOW in errno if the process ID value is larger than INT_MAX or smaller than INT_MIN, per line 27934. From an 'are our bases covered?' perspective this is adequate. Fixing the interface so that the possibility of this error is precluded requires invention, such as an F_GETPID variation that stores the result through a va_arg value cast to (pid_t *) rather than returning it cast to an int. The return value could then be a flag indicating whether the ID is valid or zero, or an indication to check errno for other error conditions, such as fd not referring to a socket. Such a variation I do see as useful, and it should be easy to implement.
(0004498)
geoffclare   
2019-07-29 09:08   
There is no problem with F_GETOWN here because the value to be returned is guaranteed to be able to fit in an int. This is because it is set via an int argument passed to fcntl() with F_SETOWN. (Okay, there might be an extension that could be used to set it to a larger value, but then there would be an equivalent extension to query it as well.)

So the only real problem I see is that it is not possible for a process with a PID greater than INT_MAX or a process group with a PGID greater than INT_MAX+1 to be set to receive SIGURG signals (without using an extension).

One solution would be to require that process IDs are always <= INT_MAX. Another would be to make F_SETOWN and F_GETOWN obsolescent and warn about the problem in APPLICATION USAGE.

There is certainly no need to alter the requirements about how pid_t can be defined.
(0004524)
eblake   
2019-08-15 16:37   
Linux fcntl has F_SETOWN_EX/F_GETOWN_EX, which take a pointer to:
struct f_owner_ex {
    int type;
    pid_t pid;
};
as a way to allow access to a pid_t larger than int (although at the moment pid_t on Linux is still int).
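A minimal sketch of that Linux interface in use, assuming Linux with a C library that exposes F_SETOWN_EX, F_GETOWN_EX and struct f_owner_ex when _GNU_SOURCE is defined before any header is included. The helper name owner_ex_roundtrip is hypothetical.

```c
#define _GNU_SOURCE /* needed for F_SETOWN_EX, F_GETOWN_EX, struct f_owner_ex */
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* The user-space structure carries the ID in a member the size of pid_t. */
static_assert(sizeof(((struct f_owner_ex *)0)->pid) == sizeof(pid_t),
              "f_owner_ex.pid does not have the size of pid_t");

/* Makes the calling process the SIGURG owner of a fresh socket, reads the
 * setting back, and returns 0 if it matches or -1 otherwise. */
int owner_ex_roundtrip(void)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    struct f_owner_ex set = { .type = F_OWNER_PID, .pid = getpid() };
    struct f_owner_ex got = { 0 };

    int ok = fcntl(fd, F_SETOWN_EX, &set) == 0
          && fcntl(fd, F_GETOWN_EX, &got) == 0
          && got.type == F_OWNER_PID
          && got.pid == set.pid;

    close(fd);
    return ok ? 0 : -1;
}
```

Because the ID travels through a pid_t member rather than an int argument or return value, nothing in this path depends on the ID fitting in an int.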
(0004525)
shware_systems   
2019-08-15 16:52   
Re: 4524
Do you have a kernel.org URL to the header where that is typedef'd, or the man page showing that?
(0004526)
eblake   
2019-08-15 16:58   
<fcntl.h>, per http://man7.org/linux/man-pages/man2/fcntl.2.html
(0004527)
shware_systems   
2019-08-15 19:03   
According to the linux kernel header <repo URL>\include\uapi\asm-generic\fcntl.h, the definition is:
struct f_owner_ex {
    int type;
    __kernel_pid_t pid;
};

where __kernel_pid_t may not be assignment compatible with pid_t as defined by a <sys/types.h> that non-kernel sources have access to. There is code that establishes this compatibility when pid_t is expected to be typedef'd as int by <sys/types.h>, but it can be overridden by source files with a conflicting definition of __kernel_pid_t before the #include of that <fcntl.h> header.

This is, at the least, a conflict between the docs and the source. Whether a bug report should be filed with the docs people or the source maintainers I couldn't say for sure. It appears the source allows the redefinition for backwards compatibility with processor-specific optimizations for versions of the kernel before Linux deferred to POSIX for some interface definitions. As such, I lean toward the docs being correct and the source having a regression.
(0004528)
eblake   
2019-08-15 19:32   
(edited on: 2019-08-15 19:33)
re: Note: 0004527
__kernel_pid_t is in the namespace reserved to the implementation, and the standard already forbids users from redefining pid_t (or any other type not starting with __ but ending with _t) from what the standard headers provide. Code that tries to redefine either type before including <fcntl.h> is broken for stomping on reserved namespace, and as such is irrelevant to proper usage of the interface. The libc (whether glibc or musl) is responsible for making sure the userspace struct in <fcntl.h> which defines struct f_owner_ex in terms of userspace pid_t will properly coordinate over to whatever syscall mechanism it uses to communicate with the kernel's internal struct, regardless of whether the kernel has different sizing internally from the public interface. In short, I see no bug here, in either the docs or the public headers.

(0004536)
geoffclare   
2019-08-23 16:15   
Proposed changes...

On page 238 line 8002 section <fcntl.h>, add:
The <fcntl.h> header shall define the f_owner_ex structure, which shall include at least the following members:
enum f_pid_type type     Discriminator for pid
pid_t pid                Process ID or process group ID

The <fcntl.h> header shall define the enumerated type enum f_pid_type whose enumerators shall include at least the following:

F_OWNER_PID
The pid member of f_owner_ex holds a process ID.
F_OWNER_PGRP
The pid member of f_owner_ex holds a process group ID.

On page 238 line 8016 section <fcntl.h>, change:
F_GETOWN
Get process or process group ID to receive SIGURG signals.
F_SETOWN
Set process or process group ID to receive SIGURG signals.
to:
F_GETOWN
Get process or process group ID to receive SIGURG signals, via int type.
F_GETOWN_EX
Get process or process group ID to receive SIGURG signals, via pid_t type.
F_SETOWN
Set process or process group ID to receive SIGURG signals, via int type.
F_SETOWN_EX
Set process or process group ID to receive SIGURG signals, via pid_t type.

On page 821 line 27817 section fcntl(), add:
F_GETOWN_EX
If fildes refers to a socket, get the process ID or process group ID specified to receive SIGURG signals when out-of-band data is available, by setting the type and pid members of the f_owner_ex structure pointed to by the third argument, arg. The value of type shall be F_OWNER_PID or F_OWNER_PGRP to indicate that pid contains a process ID or a process group ID, respectively. The value of pid shall be zero if no SIGURG signals are to be sent. If fildes does not refer to a socket, the results are unspecified.
F_SETOWN_EX
If fildes refers to a socket, set the process ID or process group ID specified to receive SIGURG signals when out-of-band data is available, using the value of the third argument, arg, taken as type pointer to struct f_owner_ex. The type and pid members of this structure shall be used as follows:
  • A pid value of zero shall indicate that no SIGURG signals are to be sent.
  • A type value of F_OWNER_PID and a positive pid value shall indicate that SIGURG signals are to be sent to the process ID specified in pid.
  • A type value of F_OWNER_PGRP and a positive pid value shall indicate that SIGURG signals are to be sent to the process group ID specified in pid.
If fildes does not refer to a socket, the results are unspecified.

Move the text from the F_SETOWN description on lines 27803-27816, beginning "Each time a SIGURG signal is sent" and ending "or by other means", to a separate paragraph after the F_SETOWN_EX description, and in it change:
Each time a SIGURG signal is sent
to:
For F_SETOWN and F_SETOWN_EX, each time a SIGURG signal is sent

On page 823 line 27923 section fcntl() (EINVAL shall fail), change:
The cmd argument is invalid, or the cmd argument is F_DUPFD or F_DUPFD_CLOEXEC and arg is negative or greater than or equal to {OPEN_MAX}, or the cmd argument is F_GETLK, F_SETLK, or F_SETLKW and the data pointed to by arg is not valid, or fildes refers to a file that does not support locking.
to:
The cmd argument is invalid; or the cmd argument is F_DUPFD or F_DUPFD_CLOEXEC and arg is negative or is greater than or equal to {OPEN_MAX}; or the cmd argument is F_SETOWN_EX and the type member of the f_owner_ex structure pointed to by arg is invalid, or the pid member is negative and the type member is F_OWNER_PID or F_OWNER_PGRP; or the cmd argument is F_GETLK, F_SETLK, or F_SETLKW and the data pointed to by arg is not valid, or fildes refers to a file that does not support locking.

On page 824 line 27938 section fcntl() (ESRCH), change:
F_SETOWN
to:
F_SETOWN or F_SETOWN_EX

On page 824 line 27944 section fcntl() (EINVAL may fail), change:
The cmd argument is F_SETOWN and the value of the argument is not valid as a process or process group identifier.
to:
The cmd argument is F_SETOWN and the value of arg is positive and is not valid as a process ID or the value of arg is negative and its absolute value is not valid as a process group ID; or the cmd argument is F_SETOWN_EX, the value of the type member of the f_owner_ex structure pointed to by arg is F_OWNER_PID, and the value of the pid member is not valid as a process ID; or the cmd argument is F_SETOWN_EX, the value of the type member of the f_owner_ex structure pointed to by arg is F_OWNER_PGRP, and the value of the pid member is not valid as a process group ID.

On page 824 line 27946 section fcntl() (EPERM), change:
F_SETOWN
to:
F_SETOWN or F_SETOWN_EX

On page 825 line 28011 section fcntl() APPLICATION USAGE, change:
On systems which do not perform permission checks at the time of an fcntl() call with F_SETOWN, ...
to:
On implementations where process IDs can be greater than INT_MAX, F_SETOWN cannot be used with process IDs greater than INT_MAX or process group IDs greater than INT_MAX+1 because the value is passed to fcntl() in an argument of type int. In this situation, F_SETOWN_EX should be used instead.

Similarly, if a process ID greater than INT_MAX or a process group ID greater than INT_MAX+1 has been set to receive SIGURG signals (using F_SETOWN_EX), F_GETOWN cannot be used to obtain the value because fcntl() returns the value as type int and will thus give an [EOVERFLOW] error for such values. F_GETOWN_EX should be used instead.

Note that the convention of negating a process group ID is only used with F_SETOWN and F_GETOWN; the pid member of the f_owner_ex structure used with F_SETOWN_EX and F_GETOWN_EX is not negated when it specifies a process group ID.

On systems which do not perform permission checks at the time of an fcntl() call with F_SETOWN or F_SETOWN_EX, ...
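The asymmetry between INT_MAX for process IDs and INT_MAX+1 for process group IDs follows from the negation convention, since a negated group ID can reach INT_MIN. The sketch below models this with hypothetical names (owner_kind, owner_spec, owner_from_legacy, owner_to_legacy) and stores the ID in a long long, assumed to be wider than int, so that INT_MAX+1 is representable. It is not the proposed f_owner_ex structure itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum owner_kind { OWNER_NONE, OWNER_PID, OWNER_PGRP }; /* hypothetical */

struct owner_spec {        /* hypothetical: the ID is never negated here */
    enum owner_kind kind;
    long long id;
};

/* Decode the int used by F_SETOWN and F_GETOWN: positive is a process ID,
 * negative is a negated process group ID, zero means no owner. */
struct owner_spec owner_from_legacy(int arg)
{
    struct owner_spec s = { OWNER_NONE, 0 };
    if (arg > 0) {
        s.kind = OWNER_PID;
        s.id = arg;
    } else if (arg < 0) {
        s.kind = OWNER_PGRP;
        s.id = -(long long)arg; /* INT_MIN decodes to INT_MAX + 1 */
    }
    return s;
}

/* Encode for F_SETOWN; returns -1 if the ID cannot be passed as an int. */
int owner_to_legacy(struct owner_spec s, int *arg)
{
    assert(arg != NULL);
    switch (s.kind) {
    case OWNER_NONE:
        *arg = 0;
        return 0;
    case OWNER_PID:
        if (s.id <= 0 || s.id > INT_MAX)
            return -1;
        *arg = (int)s.id;
        return 0;
    case OWNER_PGRP:
        if (s.id <= 0 || s.id > (long long)INT_MAX + 1)
            return -1;
        *arg = (int)(-s.id);
        return 0;
    }
    return -1;
}
```

A process ID of INT_MAX+1 is rejected by owner_to_legacy(), while a process group ID of INT_MAX+1 encodes to INT_MIN, matching the limits quoted in the APPLICATION USAGE text.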

On page 827 line 28077 section fcntl() RATIONALE, add a new paragraph:
The F_SETOWN_EX and F_GETOWN_EX values for cmd and the associated f_owner_ex structure were adopted from the GNU C library. In addition to the values F_OWNER_PID and F_OWNER_PGRP for the type member, this also has F_OWNER_TID to specify that the pid member contains a thread ID. However, this relies on thread IDs being representable in a pid_t and so was not included in POSIX.1-20xx. The aim of adding F_SETOWN_EX and F_GETOWN_EX was to address the inability of F_SETOWN and F_GETOWN to handle process IDs greater than INT_MAX and process group IDs greater than INT_MAX+1, and this need is satisfied without including F_OWNER_TID.
(0004538)
shware_systems   
2019-08-24 19:56   
(edited on: 2019-08-24 19:58)
The point of the etherpad discussion is that it is unsafe to use an enum in the structure definition, and dicey even to use int, due to the potential for the structure to have varying size and therefore break va_arg() type casting for access to structure members. While not as problematic for static compiles, for dynamic link compiles such as in hosted environments or where libc is a shared module it is a potential issue, even when the same compiler Programming Environment is used to compile all modules. As such, the man page, glibc and kernel implementations are all buggy from a portability standpoint. That they work for most applications is more by accident than by design and could be considered an exploitable security hole to boot.

So it is immaterial whether an enum or symbolic constants via #define are used to define the index values, the structure definition should be:

The <fcntl.h> header shall define the f_owner_ex structure, which shall begin with the following members, and may include others after these:

int32_t type Discriminator for pid
pid_t pid Process ID or process group ID

with the way the C and POSIX standards are worded now.

The use of int32_t is compatible with 32-bit RISC and CISC alignment requirements, afaik, and with the POSIX requirement that type int be at least 32 bits wide. If a 64-bit RISC processor requires 64-bit alignment for fast access, it may be better to use int64_t instead. While the pid_t type has similar potential varying-width issues, with the above at least references to it will have the same base address.
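The layout being argued for can be inspected with offsetof. The sketch below uses a hypothetical structure name, owner_fixed, that mirrors the two members listed above. It only shows how the offsets can be checked and does not settle whether an enum or int member would actually vary between modules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical mirror of the suggested layout: fixed-width discriminator
 * first, then the ID. */
struct owner_fixed {
    int32_t type;
    pid_t pid;
};

/* The discriminator is always at the start of the structure. */
static_assert(offsetof(struct owner_fixed, type) == 0, "type is not first");

/* Offset of pid; it depends only on the fixed size of type and on the
 * alignment of pid_t. */
size_t owner_fixed_pid_offset(void)
{
    return offsetof(struct owner_fixed, pid);
}
```

The returned offset is at least sizeof(int32_t) and is a multiple of the alignment of pid_t. Any additional padding is chosen by the implementation.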

No one should care that this adds a dependency on <stdint.h> to <fcntl.h>; this is the sort of circumstance that header was created for, and the reason C11 added the _Align* keywords and <stdalign.h>.

Additional nit, in last paragraph:
     However, this relies on thread IDs being representable in a pid_t and so was not included in POSIX.1-20xx.

should be, I think:
     However, this relies on thread IDs, as represented by the thread_t type, being assignment compatible with pid_t (and have equal or smaller range of allowed values than pid_t) and so was not included in POSIX.1-20xx.

to emphasize that <sys/types.h> does not require such compatibility between thread_t and pid_t.

(0004540)
geoffclare   
2019-08-27 15:31