Austin Group Defect Tracker

Aardvark Mark IV

Viewing Issue Simple Details Jump to Notes ] Issue History ] Print ]
ID Category Severity Type Date Submitted Last Update
0000697 [1003.1(2013)/Issue7+TC1] System Interfaces Comment Clarification Requested 2013-05-15 10:31 2013-05-30 15:57
Reporter steffen View Status public  
Assigned To
Priority normal Resolution Open  
Status New  
Name Steffen Nurpmeso
User Reference
Section none
Page Number none
Line Number none
Interp Status ---
Final Accepted Text
Summary 0000697: Adding of a getdirentries() function
Description POSIX systems don't follow the Unix paradigm "everything is a file" for directories in that it is impossible to simply open(2) a directory and read(2) the descriptor to get access to the list of direntries therein.

In the current POSIX standard there is no way to gain access to directory entries but by using the completely intransparent (in respect to internal resource usage; though today there is now fdopendir(3), but still) and unpredictable ("directory status snapshot") opendir(3) / readdir(3) / closedir(3) family of functions.

On the other hand all (to the best of my knowledge) POSIX systems support some kind of getdents(2), getdirentries(2) or getdirent(2) system call (with, despite their names, almost identical semantics).
This functionality can be used to fully control reading of directory entries, at least in respect to resource usage, and including the possibility to perform bulk reads. A publically accessible example would be [1].

[1] [^]
Desired Action Add a getdirentries(2) function to the standard, to catch up handling of directories with handling of files (e.g., open(2) / getdirentries(2) / close(2) <=> open(2) / read(2) / close(2)).
Or nay.
Tags No tags attached.
Attached Files

- Relationships
related to 0000696New either NAME_MAX shouldn't be optional, or readdir_r() needs clarification 

-  Notes
jilles (reporter)
2013-05-15 22:08

With common implementation methods, all state for objects referenced by file descriptors has to be kept in the kernel (file descriptors may persist across execve() which replaces the userspace memory and may be sent across sockets). I think this should only be forced when there is a benefit to it. I don't see such a benefit here. The getdirentries() function does not make directory file descriptors more interchangeable with other file descriptors because it is specific to directories.

A concrete example of state for a directory stream occurs with stacked union filesystems. The list of directory entries in the union consists of all directory entries in the upper layer and those directory entries in the lower layer that do not match an entry in the upper layer by name. A possible implementation strategy (used in FreeBSD) is to remove the duplicates in opendir() and readdir(). This avoids storing the entire upper directory in the kernel, or reading it repeatedly whenever a block of the lower directory is read. However, an application that calls getdirentries() directly may see duplicates.

A simple solution is to call union filesystems non-compliant but this does not help the real world where people use them and expect applications to cope.
steffen (reporter)
2013-05-16 10:22

Hmm. But this concrete example shows the subtleties of filesystems that are layed upon each other rather than being an objection against getdirentries(2). In fact this is a bit of a pity because i didn't post a function text yesterday and wanted to make it up and use the FreeBSD version of getdirentries(2) since it would allow the most flexible definition that should work out-of-the-box on all systems (despite using off_t not long etc.; and stating that the off_t value is transparent to users).

But when i look at opendir.c (readdir.c) of FreeBSD 10.0 (last modified by your commit b4ce52f6), then it seems that you talk about features at the very first implementation iterations, just fixed so that it at least (?) works.
You use the FreeBSD getdirentries(2) (that luckily exists :) to read in all the entries in case of MNT_UNION or f_fstypename=="unionfs" and loop over that "atomic" snapshot to modify the entries to fixup things that possibly should better be done in the kernel.

Since, well, the current code doesn't even affect readdir(3); i.e., the snapshot from opendir(3) will survive a possibly infinite amount of rewinddir(3) / readdir(3) iterations, without ever being updated!
So, for this to be sane, you need at least some kind of "modification-counter-stamp" interaction with the kernel; that is, imho -- POSIX doesn't seem to state anything about the "freshness" of a DIR* directory stream, today.
... And i don't even know how the stack behaves if the surviving duplicate is removed / moved / whatever, whereas the removed one remains. I.e., it's anyway a snapshot, but it seems to me that the kernel / the union FS must be capable of dealing with it.
There are operating systems which removed all kind of stacked filesystems due to the race that are implied by such implementations. (I don't use them, but could not live without a singly-stacked mount_nullfs(8).)
geoffclare (manager)
2013-05-30 15:57

In the May 30th teleconference we agreed that a function of this type would be a good addition to the standard. New interfaces require a sponsor, but it is possible that the Base Working Group might be willing to do that.

It was felt that none of the three existing functions is a perfect fit for standardization; the favored approach would be to add a new function that has the best features chosen from the three.

- Issue History
Date Modified Username Field Change
2013-05-15 10:31 steffen New Issue
2013-05-15 10:31 steffen Name => Steffen Nurpmeso
2013-05-15 10:31 steffen Section => none
2013-05-15 10:31 steffen Page Number => none
2013-05-15 10:31 steffen Line Number => none
2013-05-15 22:08 jilles Note Added: 0001607
2013-05-16 10:22 steffen Note Added: 0001608
2013-05-30 15:37 eblake Relationship added related to 0000696
2013-05-30 15:57 geoffclare Note Added: 0001629
2014-03-30 00:33 sstewartgallus Issue Monitored: sstewartgallus

Mantis 1.1.6[^]
Copyright © 2000 - 2008 Mantis Group
Powered by Mantis Bugtracker