Austin Group Defect Tracker

ID 0001121
Category [1003.1(2016/18)/Issue7+TC2] System Interfaces
Severity Editorial
Type Clarification Requested
Date Submitted 2017-02-24 18:47
Last Update 2019-12-04 11:32
Reporter djdelorie
View Status public
Assigned To ajosey
Priority normal
Resolution Accepted As Marked
Status Applied
Name DJ Delorie
Organization Red Hat Inc
User Reference https://bugzilla.redhat.com/show_bug.cgi?id=1422736
Section nftw()
Page Number http://pubs.opengroup.org/onlinepubs/9699919799/
Line Number n/a
Interp Status Approved
Final Accepted Text Note: 0004074
Summary 0001121: is the stat data undefined for dangling symlinks, without FTW_PHYS?
Description The docs say...
if FTW_PHYS is clear ... nftw() shall follow links instead of reporting them
and
stat buffer ... as if fstatat(), stat(), or lstat() had been called

The example included assumes that links are reported, and not followed, when they're dangling, which conflicts with the first of the above, and the second of the above provides no guidance as to which of stat vs lstat is called.

Please clarify...

If FTW_PHYS is clear, and a dangling link is encountered, is lstat() (or the equivalent) called to ensure that the stat data passed to the callback is defined? Or is the stat data explicitly undefined for that case?
Desired Action Clarification of the contents of the stat data when FTW_PHYS is clear and a dangling symbolic link is encountered.
Tags tc3-2008
Attached Files

- Relationships

-  Notes
(0003570)
shware_systems (reporter)
2017-02-24 20:38

From stat():
"If the named file is a symbolic link, the stat() function shall continue pathname resolution using the contents of the symbolic link, and shall return information pertaining to the resulting file if the file exists."

From this I believe the intent is that the contents of the stat structure are not filled in if stat() fails with ENOENT, or may be garbage left over from using the structure as a temporary area while attempting to continue the path resolution. So when FTW_SLN is passed to fn, fn should treat the structure as undefined; certainly not as holding reliable values, except possibly timestamps (see below). I'd still expect the FTW argument to reflect the base and level appropriate to the link path in the first argument.

It is then nominally the responsibility of the fn function to call fstatat() or lstat() as an additional operation to fill in fields related to the link itself if this is desired, not that of nftw(), as best I can tell.
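If fn does take on that responsibility, the extra call can be limited to the one case that needs it. The following is a minimal sketch, assuming a helper named reliable_stat() (hypothetical, not part of any standard interface) that a callback invokes with the arguments nftw() passed to it:

```c
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <sys/stat.h>

/* Hypothetical helper: copy stat data that is safe to use for the
 * object at fpath into *out.
 * Returns 0 on success, -1 if no usable data is available. */
static int
reliable_stat(const char *fpath, const struct stat *sb, int tflag,
              struct stat *out)
{
    if (tflag == FTW_NS)
        return -1;                  /* buffer is undefined per POSIX */
    if (tflag == FTW_SLN)
        return lstat(fpath, out);   /* describe the dangling link itself */
    *out = *sb;                     /* other cases: use nftw()'s buffer */
    return 0;
}
```

A callback would call reliable_stat(fpath, sb, tflag, &st) first and read st instead of sb from then on. The cost is one lstat() per dangling link, and nothing for any other object.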

It should probably be made explicit in stat() what happens to the structure when any error is encountered, not just this one: whether it stays unmodified, may contain garbage, or shall reflect any implicit timestamp flushes that succeeded for the last path element passed in, even if the other fields are not reliable.
(0003571)
carlos (reporter)
2017-02-25 03:06

I expect the stat buffer to be undefined if FTW_SLN is passed to fn. The caller stated their intent by _not_ setting FTW_PHYS.

The only other interpretation is that the stat buffer must be filled by the contents of the dangling symlink, and that is at odds with the caller's intent (didn't set FTW_PHYS) and forces either an extra stat or lstat to be called (performance cost).

There has been an argument made here:
https://bugzilla.redhat.com/show_bug.cgi?id=1422736
that the behaviour of providing the dangling symlink data in the buffer is well established. This doesn't mean that POSIX should require it though.
(0003574)
mtk (reporter)
2017-02-26 19:59

It's worth noting that from my research (following https://bugzilla.redhat.com/show_bug.cgi?id=1422736), FreeBSD, Solaris, OpenBSD, and musl libc all populate the stat buffer passed to the callback function with the results of lstat() on the symbolic link. That was *all* of the implementations that I tested (other than Linux/glibc, which is aberrant). Included below is a little test program that can be used to check what other implementations do.

Quoting myself from that bug report:

I believe this is actually a (very longstanding) glibc bug. Here is what POSIX says for nftw():

           FTW_NS The stat() function failed on the object because of
                     lack of appropriate permission. The stat buffer
                     passed to fn is undefined. Failure of stat() for any
                     other reason is considered an error and nftw() shall
                     return −1.

           ....

           FTW_SLN The object is a symbolic link that does not name an
                     existing file. (This condition shall only occur if
                     the FTW_PHYS flag is not included in flags.)

Note that POSIX explicitly says that the stat buffer is undefined for FTW_NS, but makes no such statement for FTW_SLN, with the implication that the stat buffer is valid in this case.

This implies that FTW_SLN should work as Han Pingtian suggested: for a dangling symlink, the lstat() information on the link should be returned. This is certainly how I always understood things should work. (But, obviously, I never tested this on glibc.)

So, what do other implementations do? Every other implementation that I looked at, does return the lstat() information for the dangling symlink. I looked at Solaris, OpenBSD, FreeBSD, and musl. All of this strongly suggests that glibc got it wrong.

For the points below, I used the following test program (and yes, I realize by now that the FTW_NS treatment in this code is not correct; I've fixed the man page already).

8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---
/*#* t_nftw.c

   Copyright Michael Kerrisk 2000

   Demonstrate the use of the nftw(3) function.
*/
#define _GNU_SOURCE
#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>


static int
displayFileInfo(const char *fpath, const struct stat *sb,
                int tflag, struct FTW *ftwbuf)
{
    printf("%-3s %2d %7lld %-30s %d %s (st_ino: %ld)\n",
           (tflag == FTW_D) ? "d" : (tflag == FTW_DNR) ? "dnr" :
           (tflag == FTW_DP) ? "dp" : (tflag == FTW_F) ? "f" :
           (tflag == FTW_NS) ? "ns" : (tflag == FTW_SL) ? "sl" :
           (tflag == FTW_SLN) ? "sln" : "???",
           ftwbuf->level, (long long) sb->st_size,
           fpath, ftwbuf->base, fpath + ftwbuf->base, (long) sb->st_ino);
    /* Zero the stat buffer after printing it */
    memset((void *) sb, 0, sizeof(struct stat));
    return 0;           /* To tell nftw() to continue */
}


int
main(int argc, char *argv[])
{
    int flags = 0;

    /* Optional second argument: 'd' selects FTW_DEPTH, 'p' selects FTW_PHYS */
    if (argc > 2 && strchr(argv[2], 'd') != NULL)
        flags |= FTW_DEPTH;
    if (argc > 2 && strchr(argv[2], 'p') != NULL)
        flags |= FTW_PHYS;

    if (nftw((argc < 2) ? "." : argv[1], displayFileInfo,
             20, flags) == -1) {
        perror("nftw");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---


Solaris (Illumos source code)
usr/src/lib/libc/port/gen/nftw.c:

The following code causes the stat buffer to be populated with lstat() information in the FTW_SLN case:

        } else {
                /*
                 * Statf has failed. If stat was used instead of lstat,
                 * try using lstat. If lstat doesn't fail, "comp"
                 * must be a symbolic link pointing to a non-existent
                 * file. Such a symbolic link should be ignored.
                 * Also check the file type, if possible, for symbolic
                 * link.
                 */
                if ((vp->statf == stat) && (lstat(comp, &statb) >= 0) &&
                    ((statb.st_mode & S_IFMT) == S_IFLNK)) {

                        /*
                         * Ignore bad symbolic link, let "fn"
                         * report it.
                         */

                        errno = ENOENT;
                        type = FTW_SLN;
                } else {
                        type = FTW_NS;
        fail:


Testing shows that the link info *is* returned in the stat structure:

$ ls -li t
total 4
     45068 -rw-r--r-- 1 mtk csw 29 Feb 24 04:28 f
     45067 lrwxrwxrwx 1 mtk csw 6 Feb 24 04:28 my_sln -> ssssss
     45069 lrwxrwxrwx 1 mtk csw 1 Feb 24 04:28 sl_f -> f

$ ./a.out t
d 0 5 t 0 t (st_ino: 45066)
sln 1 6 t/my_sln 2 my_sln (st_ino: 45067)
f 1 29 t/f 2 f (st_ino: 45068)
f 1 29 t/sl_f 2 sl_f (st_ino: 45068)
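
To repeat this check on another system, a directory with the same three kinds of entry is needed. The sketch below is one way to create it; make_test_tree() and join() are hypothetical helper names, and the contents written to "f" are arbitrary, so the size column will not match the listing above.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: build "dir/name" in buf; -1 if it does not fit. */
static int
join(char *buf, size_t len, const char *dir, const char *name)
{
    int n = snprintf(buf, len, "%s/%s", dir, name);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Hypothetical helper: create `dir` holding a regular file "f", a valid
 * link "sl_f" -> "f", and a dangling link "my_sln" -> "ssssss".
 * Returns 0 on success, -1 on the first failure. */
static int
make_test_tree(const char *dir)
{
    char path[1024];
    FILE *fp;

    if (mkdir(dir, 0755) == -1)
        return -1;

    if (join(path, sizeof path, dir, "f") == -1 ||
        (fp = fopen(path, "w")) == NULL)
        return -1;
    fputs("arbitrary test contents\n", fp);
    if (fclose(fp) == EOF)
        return -1;

    if (join(path, sizeof path, dir, "sl_f") == -1 ||
        symlink("f", path) == -1)
        return -1;

    if (join(path, sizeof path, dir, "my_sln") == -1)
        return -1;
    return symlink("ssssss", path);     /* target intentionally absent */
}
```

Calling make_test_tree("t") and then running the test program on t should produce one "sln" line; the inode and size printed on that line show whether the implementation filled the buffer from lstat().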

======

OpenBSD

I didn't look at the source code, but the test gives the same results as Solaris:

-bash-4.3$ ls -li
total 4
4693795 -rw-r--r-- 1 mtk mtk 29 Feb 24 04:37 f
4693796 lrwxr-xr-x 1 mtk mtk 1 Feb 24 04:37 sl_f -> f
4693797 lrwxr-xr-x 1 mtk mtk 11 Feb 24 04:37 sln -> jajdhfdskjh
-bash-4.3$ ./a.out t
d 0 512 t 0 t (st_ino: 4693794)
f 1 29 t/f 2 f (st_ino: 4693795)
f 1 29 t/sl_f 2 sl_f (st_ino: 4693795)
sln 1 11 t/sln 2 sln (st_ino: 4693797)

=====

FreeBSD

I don't have access to a FreeBSD test system at the moment, but my reading of the source code is that it delivers the same results as Solaris and OpenBSD.

See lib/libc/gen/nftw.c, where FTW_SLN is implemented using the FTS_SLNONE option, and fts(3) on that system says:

              FTS_SLNONE A symbolic link with a nonexistent target. The
                          contents of the fts_statp field reference the
                          file characteristic information for the sym‐
                          bolic link itself.

=====
musl libc

src/misc/nftw.c

        if ((flags & FTW_PHYS) ? lstat(path, &st) : stat(path, &st) < 0) {
                if (!(flags & FTW_PHYS) && errno==ENOENT && !lstat(path, &st))
                        type = FTW_SLN;
                else if (errno != EACCES) return -1;
                else type = FTW_NS;
        } else if (S_ISDIR(st.st_mode)) {
                if (access(path, R_OK) < 0) type = FTW_DNR;
                else if (flags & FTW_DEPTH) type = FTW_DP;
                else type = FTW_D;
        } else if (S_ISLNK(st.st_mode)) {
                if (flags & FTW_PHYS) type = FTW_SL;
                else type = FTW_SLN;
        } else {
                type = FTW_F;
        }
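
The implementations quoted above share one pattern: try stat(), and when it fails with ENOENT fall back to lstat(), so that the buffer ends up describing the link. The following is a simplified sketch of that pattern, not code from any of those libraries; classify() is a hypothetical name.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <ftw.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: pick the FTW_* type flag for `path` and leave
 * the matching stat data in *st. Returns -1 for an error that should
 * abort the walk. */
static int
classify(const char *path, int flags, struct stat *st)
{
    int r = (flags & FTW_PHYS) ? lstat(path, st) : stat(path, st);

    if (r == -1) {
        /* A link whose target is missing fails stat() but not lstat() */
        if (!(flags & FTW_PHYS) && errno == ENOENT &&
            lstat(path, st) == 0 && S_ISLNK(st->st_mode))
            return FTW_SLN;         /* *st now describes the link itself */
        return (errno == EACCES) ? FTW_NS : -1;
    }
    if (S_ISDIR(st->st_mode)) {
        if (access(path, R_OK) == -1)
            return FTW_DNR;
        return (flags & FTW_DEPTH) ? FTW_DP : FTW_D;
    }
    if (S_ISLNK(st->st_mode))
        return FTW_SL;              /* only reachable with FTW_PHYS */
    return FTW_F;
}
```

The second lstat() runs only after stat() has already failed, so the common path still costs a single system call per object.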
(0003675)
djdelorie (reporter)
2017-04-26 17:06

Any progress on this?
(0003676)
shware_systems (reporter)
2017-04-27 10:34

No, it's still in the queue, but as additional 2 cents...

Proposed Resolution:
After, in nftw()
"The second argument is a pointer to the stat buffer containing information on the object, filled in as if fstatat(), stat(), or lstat() had been called to retrieve the information."

add, same paragraph:
"This argument may be NULL if either fstatat() or stat(), or lstat() if FTW_PHYS set (see above), would return an error rather than valid information. If not NULL and valid information is unavailable the buffer contents are unspecified."

Change FTW_NS to:
"Retrieval of stat information failed on the object because of lack of appropriate permission. Failure of stat() for any other reason is considered an error and nftw() shall return -1 if fn returns 0."

Change FTW_SLN to:
"Retrieval of stat information failed on the object because the object is a symbolic link that does not name an existing file and FTW_PHYS is unset. If fn returns 0 the nftw() function shall return with -1."
-------------------------------------------------------
Use of NULL is already allowed because the argument is unspecified with FTW_NS; this just makes explicit that it is possible with any FTW_* type value that reports an error. The addition to FTW_SLN follows from the last sentence of the FTW_NS description: it makes the exit requirement explicit there too, and allows fn to return an application-specific code in keeping with the text of the Return Value section. The first part of both changes reflects that implementations may use internal means other than stat() to fill the stat structure, so saying "stat() function failed" is restrictive.

The behavior exhibited by the implementations reflects that they use the buffer passed to fn both as the argument and for path testing purposes, rather than separate buffers for the invalid argument, stat, and lstat. Moving the statement that the buffer is undefined if retrieval of the appropriate data fails from FTW_NS to the argument description preserves this freedom, but may require some applications to change to reflect that leaving lstat data in the buffer is more a bug than a feature.

I leave open whether fstatat() page should have something explicit about whether the stat buffer is to be returned unmodified or may be garbage if an error condition occurs.
(0003677)
joerg (reporter)
2017-04-27 10:39
edited on: 2017-04-27 10:40

Re: Note: 0003574

In case it is of interest, the Solaris implementation is the original source code written by David Korn.

(0003785)
mtk (reporter)
2017-06-19 12:54

So, I dug deeper on this issue, and discovered that the Linux/glibc implementation used to do the same thing as every other implementation.
See https://bugzilla.redhat.com/show_bug.cgi?id=1422736#c11

Until glibc 2.3.6, in the io/ftw.c process_entry() code, we find:

  if (((data->flags & FTW_PHYS)
       ? LXSTAT (_STAT_VER, name, &st)
       : XSTAT (_STAT_VER, name, &st)) < 0)
    {
      if (errno != EACCES && errno != ENOENT)
        result = -1;
      else if (!(data->flags & FTW_PHYS)
               && LXSTAT (_STAT_VER, name, &st) == 0
               && S_ISLNK (st.st_mode))
        flag = FTW_SLN;
      else
        flag = FTW_NS;
    }

So, if FTW_PHYS was not set, use stat() on the path. If that fails (because of a dangling symlink, for example), then try lstat() on the path and check whether the result is a symlink; if so, emit FTW_SLN.

In glibc 2.4 (~2006) things changed to the situation we currently have. The change *appears* to be an unintended regression, since the associated changelog message makes no mention of modifying the behavior of FTW_SLN.

So, I do think this is a glibc bug, not a fault in the standard, per se (though the standard could be a little clearer).
(0004074)
geoffclare (manager)
2018-08-09 15:41
edited on: 2018-08-10 11:05

Interpretation response
------------------------
The standard clearly states that the second argument to fn() contains "information on the object", and conforming implementations must conform to this.

Rationale:
-------------
The second bullet item says "The second argument is a pointer to the stat buffer containing information on the object".

The description of FTW_SLN says "The object is a symbolic link that does not name an existing file".

These two things together require that for FTW_SLN, the stat buffer contains information about the symbolic link (which is "the object").

Where the standard states that when FTW_PHYS is clear, symbolic links are followed instead of being reported, naturally they are not followed if the target does not exist and the FTW_SLN constant alone constitutes reporting the symbolic link.

Notes to the Editor (not part of this interpretation):
-------------------------------------------------------
On page 1398 line 46479 section nftw change:

The object is a symbolic link that does not name an existing file.

to:

The object is a symbolic link that does not name an existing file. The stat buffer passed to fn shall contain information on the symbolic link.

On page 1399 line 46543 section nftw change:

<tt>(intmax_t) sb->st_size</tt>

to:

<tt>(intmax_t) ((tflag == FTW_NS) ? -1 : sb->st_size)</tt>
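
For illustration only, the corrected expression can be factored into a helper so that st_size is read only when the buffer is defined. This is a sketch, not the standard's example text; reported_size() and print_entry() are hypothetical names.

```c
#define _XOPEN_SOURCE 700
#include <ftw.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper mirroring the corrected expression: the buffer
 * is not read for FTW_NS, where its contents are undefined. */
static intmax_t
reported_size(const struct stat *sb, int tflag)
{
    return (tflag == FTW_NS) ? -1 : (intmax_t) sb->st_size;
}

/* Hypothetical nftw() callback that prints level, size, and path. */
static int
print_entry(const char *fpath, const struct stat *sb, int tflag,
            struct FTW *ftwbuf)
{
    printf("%2d %7jd %s\n", ftwbuf->level, reported_size(sb, tflag), fpath);
    return 0;                       /* tell nftw() to continue */
}
```

With the interpretation above, FTW_SLN needs no special case here: the buffer holds the link's own data, so st_size is the length of the link's contents.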

(0004075)
geoffclare (manager)
2018-08-09 15:48

The glibc maintainers may want to consider a solution whereby the extra lstat() call is made only if POSIXLY_CORRECT is set in the environment, thus preserving the optimisation for those applications that do not need strict POSIX conformance.
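
A sketch of what such an opt-in could look like follows. It is not glibc code, and want_sln_lstat() and stat_for_sln() are hypothetical names; the only fixed point is the POSIXLY_CORRECT environment variable named above.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical policy check: strict behaviour only when the variable
 * is present in the environment (its value is not examined). */
static int
want_sln_lstat(void)
{
    return getenv("POSIXLY_CORRECT") != NULL;
}

/* Hypothetical helper for the dangling-link case.
 * Returns 0 if *st was filled by lstat(), 1 if the call was skipped
 * and *st was left untouched, -1 if lstat() failed. */
static int
stat_for_sln(const char *path, struct stat *st)
{
    if (!want_sln_lstat())
        return 1;                   /* fast path: no extra system call */
    return (lstat(path, st) == 0) ? 0 : -1;
}
```

A real implementation would probably read the environment once per nftw() call rather than once per dangling link; the sketch keeps the check inline for brevity.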
(0004077)
shware_systems (reporter)
2018-08-09 17:02

I think this was overlooked in the discussion.

Given:
"Where the standard states that when FTW_PHYS is clear, symbolic links are followed instead of being reported, naturally they are not followed if the target does not exist and the FTW_SLN constant alone constitutes reporting the symbolic link."

doesn't this mean the return code can be returned when FTW_PHYS is clear, not just when set as noted at Line 46480, so that parenthetical expression should be deleted there?

I realize this may break code that only checks for FTW_SLN when FTW_PHYS is set, but otherwise isn't the alternative, per FTW_NS, that it's a non-permissions related stat() error (ENOENT vs. EPERM) and the interface should abort without calling fn() at all?
(0004078)
shware_systems (reporter)
2018-08-09 17:57

Suggested alternate to the line 46479 change, per Note 4077...

Replace, starting at Line 46474:
FTW_NS The stat() function failed on the object because of lack of appropriate
permission. The stat buffer passed to fn is undefined. Failure of stat() for any
other reason is considered an error and nftw() shall return −1.
FTW_SL The object is a symbolic link. (This condition shall only occur if the
FTW_PHYS flag is included in flags.)
FTW_SLN The object is a symbolic link that does not name an existing file. (This
condition shall only occur if the FTW_PHYS flag is not included in flags.)

with either:
FTW_NS The stat() function failed on the object because of lack of appropriate
permission, including permissions involved in evaluating any followed symbolic link.
The stat buffer passed to fn is undefined.
FTW_SLN The object is a symbolic link that does not name an existing file. The stat
buffer passed to fn shall contain information on the symbolic link as returned by
lstat( ) or the equivalent.
FTW_SL The object is a symbolic link. The stat buffer passed to fn shall contain
information on the symbolic link as returned by lstat() or the equivalent. (This
condition shall only occur if the FTW_PHYS flag is included in flags.)

Failure of fstatat(), stat(), or lstat() for any other reason is considered an
error and nftw() shall return −1 without calling fn().

or:
FTW_NS The stat() function failed on the object because of lack of appropriate
permission. The stat buffer passed to fn is undefined.
Additionally, if the FTW_PHYS flag is included in flags:
FTW_SL The object is a symbolic link that names an accessible link target.
FTW_SLN The object is a symbolic link that does not name an existing file.
For both of these the stat buffer passed to fn shall contain information on the
symbolic link as returned by lstat() or the equivalent.

Failure of fstatat(), stat( ), or lstat() for any other reason is considered an
error and nftw() shall return −1 without calling fn().
---------------------------------------------
The first option is consistent with the current bug resolution, the second with the current text of the standard.
Both make it explicit that other errors cause an abort, so it doesn't matter what the stat buffer holds, rather than
hiding this in the FTW_NS description and implying that fn() is supposed to be called before the abort.
(0004613)
agadmin (administrator)
2019-10-07 15:18

Interpretation proposed: 7 October 2019
(0004657)
agadmin (administrator)
2019-11-11 12:21

Interpretation Approved: 11 Nov 2019

- Issue History
Date Modified Username Field Change
2017-02-24 18:47 djdelorie New Issue
2017-02-24 18:47 djdelorie Status New => Under Review
2017-02-24 18:47 djdelorie Assigned To => ajosey
2017-02-24 18:47 djdelorie Name => DJ Delorie
2017-02-24 18:47 djdelorie Organization => Red Hat Inc
2017-02-24 18:47 djdelorie User Reference => https://bugzilla.redhat.com/show_bug.cgi?id=1422736
2017-02-24 18:47 djdelorie Section => nftw()
2017-02-24 18:47 djdelorie Page Number => http://pubs.opengroup.org/onlinepubs/9699919799/
2017-02-24 18:47 djdelorie Line Number => n/a
2017-02-24 20:38 shware_systems Note Added: 0003570
2017-02-25 03:06 carlos Note Added: 0003571
2017-02-26 19:59 mtk Note Added: 0003574
2017-02-27 09:50 geoffclare Project 1003.1(2008)/Issue 7 => 1003.1(2016/18)/Issue7+TC2
2017-04-26 17:06 djdelorie Note Added: 0003675
2017-04-27 10:34 shware_systems Note Added: 0003676
2017-04-27 10:39 joerg Note Added: 0003677
2017-04-27 10:40 joerg Note Edited: 0003677
2017-06-19 12:54 mtk Note Added: 0003785
2018-08-09 15:41 geoffclare Note Added: 0004074
2018-08-09 15:42 geoffclare Interp Status => Pending
2018-08-09 15:42 geoffclare Final Accepted Text => Note: 0004074
2018-08-09 15:42 geoffclare Status Under Review => Interpretation Required
2018-08-09 15:42 geoffclare Resolution Open => Accepted As Marked
2018-08-09 15:42 geoffclare Tag Attached: tc3-2008
2018-08-09 15:48 geoffclare Note Added: 0004075
2018-08-09 17:02 shware_systems Note Added: 0004077
2018-08-09 17:57 shware_systems Note Added: 0004078
2018-08-10 11:05 geoffclare Note Edited: 0004074
2019-10-07 15:18 agadmin Interp Status Pending => Proposed
2019-10-07 15:18 agadmin Note Added: 0004613
2019-11-11 12:21 agadmin Interp Status Proposed => Approved
2019-11-11 12:21 agadmin Note Added: 0004657
2019-12-04 11:32 geoffclare Status Interpretation Required => Applied

