Austin Group Defect Tracker

Aardvark Mark IV


ID 0000576
Category [1003.1(2008)/Issue 7] Base Definitions and Headers
Severity Objection
Type Omission
Date Submitted 2012-06-03 03:16
Last Update 2013-11-04 08:57
Reporter Love4Boobies
View Status public
Assigned To ajosey
Priority normal
Resolution Open
Status Under Review
Name Bogdan Barbu
Organization
User Reference
Section http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/unistd.h.html
Page Number Not sure
Line Number Not sure
Interp Status ---
Final Accepted Text
Summary 0000576: No format specifiers for several <sys/types.h> types.
Description Several types defined in <sys/types.h> don't have format specifiers to be used with printf and friends; it is only mentioned that they are integers and, for some, whether they are signed or unsigned. The most important of these is probably id_t, to which pid_t, uid_t, and gid_t values could be cast. There might be other similar problems throughout the standard.
Desired Action Similarly to what <inttypes.h> does, add the following to the <unistd.h> description:

The <unistd.h> header shall define the following macros. Each expands to a character string literal containing a conversion specifier, possibly modified by a length modifier, suitable for use within the format argument of a formatted input/output function when converting the corresponding integer type. These macros have the general form of PRI (character string literals for the fprintf() and fwprintf() family of functions), followed by a name corresponding to a similar type in <sys/types.h>. The macros are:

...
PRIid
...
Tags No tags attached.

-  Notes
(0001251)
lacos (reporter)
2012-06-03 10:44

Signedness of such a type (if the standard doesn't specify it) can be determined by checking "0 < (type)-1": the comparison is true for an unsigned type and false for a signed one. The RHS (possibly before being promoted) will be positive or negative, and any promotion of the RHS, or conversion of either side for the comparison, is value preserving.

Once signedness is known, printing these types is possible by converting them first to intmax_t / uintmax_t.
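
A minimal sketch of the check and the conversion described above. The macro names TYPE_IS_UNSIGNED and FORMAT_INTEGER are illustrative assumptions, not part of any standard header; some compilers may warn that the comparison is always true or always false, which is expected here.

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* True when `type` is an unsigned integer type: (type)-1 then wraps to the
   type's maximum, which is positive. Illustrative name, not a standard macro. */
#define TYPE_IS_UNSIGNED(type) (0 < (type)-1)

/* Write the decimal form of `value` (of integer type `type`) into buf,
   converting through uintmax_t or intmax_t depending on signedness.
   Evaluates to the return value of snprintf(). Illustrative name. */
#define FORMAT_INTEGER(buf, size, type, value)                      \
    (TYPE_IS_UNSIGNED(type)                                         \
         ? snprintf((buf), (size), "%" PRIuMAX, (uintmax_t)(value)) \
         : snprintf((buf), (size), "%" PRIdMAX, (intmax_t)(value)))
```

A caller could, for example, write FORMAT_INTEGER(buf, sizeof buf, uid_t, uid) without knowing whether uid_t is signed on the platform at hand.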

Parsing is more problematic:

- The scanf() family can't be used with untrusted input, thus the SCN macros would have limited applicability.

- strtoumax() handles negation (subject sequence beginning with a minus sign) automatically. (This can be prevented by finding the subject sequence and checking for the sign manually.)

- Before converting the return value of strtoimax() to the signed integer target type (like off_t), a range check would be necessary, but that's not easy without OFF_MAX.

SUSv4 seems to require two's complement representation (see the rationale in stdint.h). I'm not sure if padding bits are generally banned, but if they are, the following (naive) expression should yield the maximum value (as uintmax_t) for the signed target type:

UINTMAX_MAX >> (sizeof(uintmax_t) - sizeof(signed_target_type)) * CHAR_BIT + 1
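
The parsing concerns in this note can be sketched as follows, under the same assumptions the note states (two's complement, no padding bits in the target type). SIGNED_TYPE_MAX, parse_off and parse_unsigned are hypothetical helper names introduced only for illustration.

```c
#include <ctype.h>
#include <errno.h>
#include <inttypes.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* Maximum of a signed type, as uintmax_t. ASSUMES the type has no padding
   bits; this is the naive expression from the note, fully parenthesized. */
#define SIGNED_TYPE_MAX(type) \
    (UINTMAX_MAX >> ((sizeof(uintmax_t) - sizeof(type)) * CHAR_BIT + 1))

/* Parse a decimal off_t with an explicit range check (hypothetical helper).
   Like strtoimax(), it accepts leading white space and an optional sign. */
bool parse_off(const char *s, off_t *out)
{
    const intmax_t max = (intmax_t)SIGNED_TYPE_MAX(off_t);
    char *end;
    intmax_t v;

    errno = 0;
    v = strtoimax(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return false;
    /* Two's complement ASSUMED: the minimum is -max - 1. */
    if (v > max || v < -max - 1)
        return false;
    *out = (off_t)v;
    return true;
}

/* Parse an unsigned decimal value no greater than `max`, rejecting a minus
   sign that strtoumax() would otherwise silently negate (hypothetical helper). */
bool parse_unsigned(const char *s, uintmax_t max, uintmax_t *out)
{
    const char *p = s;
    char *end;
    uintmax_t v;

    while (isspace((unsigned char)*p))
        p++;
    if (*p == '-')
        return false;
    errno = 0;
    v = strtoumax(p, &end, 10);
    if (end == p || *end != '\0' || errno == ERANGE || v > max)
        return false;
    *out = v;
    return true;
}
```

For an unsigned target type, the bound passed as max can be obtained as (uintmax_t)(type)-1, which holds regardless of padding bits.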
(0001253)
Love4Boobies (reporter)
2012-06-06 17:45
edited on: 2012-06-06 17:46

Well, pid_t is a signed type guaranteed to be no wider than long so it can just be cast to that. For uid_t and gid_t, all that is guaranteed is that they are integer types and the solution is a huge hack. I was unable to find any mention of padding bits being disallowed (other than for the (u)intN_t types, of course), meaning that something that ought to be done at compile time must be done at run time.
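
For illustration, the pid_t case might look like the sketch below. format_pid is a hypothetical helper name, and the cast relies on the width guarantee mentioned above holding in the programming environment being compiled for.

```c
#include <stdio.h>
#include <sys/types.h>

/* Format a pid_t by casting it to long. Relies on pid_t being signed and
   no wider than long in the programming environment in use. */
int format_pid(char *buf, size_t size, pid_t pid)
{
    return snprintf(buf, size, "%ld", (long)pid);
}
```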

(0001257)
eblake (manager)
2012-06-14 16:01

Of a broader scope, it might be worth standardizing a series of macros for querying type properties; for example, gnulib has written a compile-time TYPE_MAXIMUM(type) macro that works with 2's-complement, 1's-complement, and signed-magnitude integer representations:
git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob;f=lib/intprops.h

It might also be possible to standardize a printf/scanf format specifier that says that the width of an argument is passed alongside the argument, something like: printf("%I*d", sizeof(pid_t), (pid_t)value), where the %I* modifier says to consume a size_t argument 2 that will then determine what size to use for argument 3.
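
A simplified sketch of compile-time type-property macros in the spirit of the first suggestion. This is not a copy of the gnulib header, which should be consulted for the real definitions; the TYPE_WIDTH definition here ASSUMES the type has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

/* True when integer type t is signed. */
#define TYPE_SIGNED(t) (!((t)0 < (t)-1))

/* Width of t in bits. ASSUMES t has no padding bits. */
#define TYPE_WIDTH(t) (sizeof(t) * CHAR_BIT)

/* Maximum value of integer type t, usable in constant expressions.
   The signed branch builds 2^(width-1) - 1 without overflowing. */
#define TYPE_MAXIMUM(t)                                           \
    ((t)(!TYPE_SIGNED(t)                                          \
             ? (t)-1                                              \
             : ((((t)1 << (TYPE_WIDTH(t) - 2)) - 1) * 2 + 1)))

/* Compile-time sanity checks against the limits the C standard does provide. */
static_assert(TYPE_MAXIMUM(int) == INT_MAX, "int maximum");
static_assert(TYPE_MAXIMUM(unsigned int) == UINT_MAX, "unsigned maximum");
static_assert(TYPE_MAXIMUM(long long) == LLONG_MAX, "long long maximum");
static_assert(TYPE_SIGNED(pid_t), "POSIX requires pid_t to be signed");
```

With such macros a range check against, say, TYPE_MAXIMUM(off_t) can be written without an OFF_MAX constant.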
(0001446)
Love4Boobies (reporter)
2013-01-17 16:30
edited on: 2013-01-17 16:31

Since no conclusion has been reached, I propose that a new (generic) format specifier be included as an extension to the printf family of functions. In the spirit of other format specifiers, programmers could use it to encode the appropriate information about the type it is dealing with. Is this acceptable?

(0001449)
Don Cragun (manager)
2013-01-18 00:12

In response to Note: 0001446: Please propose a generic format that can be used as a compatible extension to the C Standard's printf/scanf format specifiers to allow printing the types defined in <sys/types.h> and to encode the type of value used when specifying field widths for any type to be an off_t, a ptrdiff_t, or a size_t that is compatible with existing extensions used by GNU's glibc's and by AT&T Research's AST library's printf format specifiers. From the more than 75 messages on this topic in the Austin Group's mail discussion log, it should be obvious that there is no simple answer here.
(0001956)
shware_systems (reporter)
2013-11-04 08:57

From a portability standpoint I think this is a decent idea. I haven't read the more than 75 messages, but I'm wondering what the major difficulty is... The problem seems to be how do you spec a format field that's just wide enough to not lose any precision. All the needed output format specifiers for those based on integer types are already part of <inttypes.h>. To keep things simpler I'd put these new PRIxx macros in <sys/types.h>, not <unistd.h>, and have them be aliases of the relevant macros in <inttypes.h> according to which programming environment the application is being compiled for.

That way the implementation handles varying width requirements, as it knows what they are, and the application doesn't need to calculate them or apply problematic casts. This would guarantee none of the relevant fields was too small, at least, and once that's done 'pretty printing' can be handled by the application using a generic justification routine with each column of output. The compiler would handle the implicit casting to integer to match the specifier. The caveat "Inclusion of <sys/types.h> may make all symbols from <inttypes.h> visible" and appropriate cross references would be expected. This can be a bit messy in what ifdefs are needed in the header, but in use it would be simple enough. If an application doesn't care about pretty it can use the format macros with printf directly. If it does care, it uses sprintf to preprocess them and '%s' in the printf format. If needed the value of a format macro can be decomposed by the application to ascertain base type and assigned widths as part of preprocessing. A splitfmtspec() interface similar to splitpath() could help with that.
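
A rough sketch of how the alias idea and the sprintf preprocessing step might look. PRIdPID is a HYPOTHETICAL macro name that no current header defines, and the sketch ASSUMES a platform where pid_t is a 32-bit signed type; format_pid_column is likewise an illustrative helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* HYPOTHETICAL: what an implementation's <sys/types.h> might provide on a
   platform where pid_t is a 32-bit signed type. Not a real macro. */
#define PRIdPID PRId32
static_assert(sizeof(pid_t) == sizeof(int32_t),
              "this sketch assumes a 32-bit pid_t");

/* Preprocess the value with snprintf(), then justify it with "%*s".
   Returns what the final snprintf() returns. */
int format_pid_column(char *out, size_t outsz, int width, pid_t pid)
{
    char num[32];

    /* The cast is only needed because this sketch aliases PRId32 itself;
       a macro supplied by the implementation would match pid_t exactly. */
    snprintf(num, sizeof num, "%" PRIdPID, (int32_t)pid);
    return snprintf(out, outsz, "%*s", width, num);
}
```

An application that does not care about column alignment would skip the helper and use the format macro in printf directly.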

About the only new format specifier I see missing, albeit a complicated one to implement, would be one that takes the reference of the parameter and formats that as a hex value, where the type is an alias of a structure or array definition, and formats the value as hex when the type is an alias of a pointer or ptrdiff_t. Different letters for 'offset or base only' and 'base plus offset' pointers may be needed for some systems, but POSIX only would need the 'offset or base only' format. It's simpler to compile, definitely. This would differ from the 'p' and 'X' or 'x' formats in that it would have some type sensitivity, hex chars would be required, and a '0' or 'F' fill could be used with ptrdiff_t signed offsets; it would also be more explicit and portable than 'p', which is entirely implementation defined.

Where C needs to stay pretty generic with 'p', to provide nominal support for various memory architectures, this is plausible for POSIX because it requires a flat address space. Properly spec'd out I think it would be at least a CX candidate. Even in POSIX 'base plus offset' form would be a way of showing structure member addresses, if the parameter value was an explicit reference to one. I'm not referencing size_t and off_t because this is a POSIX specific pointer format and those are integer types that can be handled as above.

- Issue History
Date Modified Username Field Change
2012-06-03 03:16 Love4Boobies New Issue
2012-06-03 03:16 Love4Boobies Status New => Under Review
2012-06-03 03:16 Love4Boobies Assigned To => ajosey
2012-06-03 03:16 Love4Boobies Name => Bogdan Barbu
2012-06-03 03:16 Love4Boobies Section => http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/unistd.h.html
2012-06-03 03:16 Love4Boobies Page Number => Not sure
2012-06-03 03:16 Love4Boobies Line Number => Not sure
2012-06-03 10:44 lacos Note Added: 0001251
2012-06-06 17:45 Love4Boobies Note Added: 0001253
2012-06-06 17:46 Love4Boobies Note Edited: 0001253
2012-06-14 16:01 eblake Note Added: 0001257
2013-01-17 16:30 Love4Boobies Note Added: 0001446
2013-01-17 16:30 Love4Boobies Note Edited: 0001446
2013-01-17 16:31 Love4Boobies Note Edited: 0001446
2013-01-18 00:12 Don Cragun Note Added: 0001449
2013-11-04 08:57 shware_systems Note Added: 0001956

