Austin Group Defect Tracker



ID 0001439
Category [1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers
Severity Editorial
Type Clarification Requested
Date Submitted 2020-12-24 03:47
Last Update 2021-01-16 11:27
Reporter dannyniu View Status public  
Assigned To
Priority normal Resolution Open  
Status New  
Name DannyNiu/NJF
Organization
User Reference
Section <dlfcn.h>, <sys/mman.h>
Page Number many
Line Number many
Interp Status ---
Final Accepted Text
Summary 0001439: The POSIX standard does not distinguish object pointers from function pointers as the C standard does.
Description The C standard distinguishes function pointers from object pointers. The POSIX standard treats them as the same in <dlfcn.h> and in <sys/mman.h>, where the return types of `dlsym` and `mmap` are `void *`.

However, the specification of the ILP32 and LP64 memory models can be seen as implying that the standard treats function and object pointers as the same.
Desired Action Explicitly state that function and object pointers are the same, and share the same process address space.
Tags No tags attached.
Attached Files

- Relationships
related to 0000074 Closed ajosey 1003.1(2008)/Issue 7 Pointer Types Problem

-  Notes
(0005189)
geoffclare (manager)
2021-01-04 10:49
edited on: 2021-01-04 10:50

This is effectively requesting that we revert to the situation before bug 0000074 was applied in Issue7 TC1. Prior to that, XSH7 included this section:
2.12.3 Pointer Types

All function pointer types shall have the same representation as the type pointer to void. Conversion of a function pointer to void * shall not alter the representation. A void * value resulting from such a conversion can be converted back to the original function pointer type, using an explicit cast, without loss of information.

Note: The ISO C standard does not require this, but it is required for POSIX conformance.
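
For illustration only, the round trip that the removed 2.12.3 text guaranteed could be sketched in C as below. The names add_one, call_through_void_ptr and round_trip_preserves_pointer are hypothetical and exist only for this example. ISO C does not define these conversions, so the sketch assumes a platform where they behave as the old text required.

```c
/* Hypothetical function used only for this illustration. */
static int add_one(int x)
{
    return x + 1;
}

/*
 * Convert a function pointer to void * and back, then call through it.
 * ISO C leaves this conversion undefined; the old 2.12.3 text required
 * it to work without loss of information.
 */
int call_through_void_ptr(int arg)
{
    int (*original)(int) = add_one;
    void *as_object = (void *)original;              /* function pointer -> void * */
    int (*restored)(int) = (int (*)(int))as_object;  /* void * -> function pointer */

    return restored(arg);
}

/* Returns 1 if the restored pointer compares equal to the original. */
int round_trip_preserves_pointer(void)
{
    int (*original)(int) = add_one;
    void *as_object = (void *)original;
    int (*restored)(int) = (int (*)(int))as_object;

    return restored == original;
}
```

A compiler run in a strictly pedantic mode may warn about both casts, which is the ISO C side of the disagreement this bug is about.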

This had been added in the original Issue 7 because of dlsym(), but the decision made for bug 74 was that we should instead have a much narrower requirement specific to the conversions needed for dlsym() rather than this very general requirement.
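
The narrower dlsym() case can be sketched as follows. This is a minimal example, not text from the standard: the helper name strlen_via_dlsym and the typedef strlen_fn are hypothetical, and it assumes a dynamically linked program in which "strlen" is visible through the global symbol table handle returned by dlopen(NULL, ...). On some systems the program must also be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Function pointer type matching strlen(); chosen only for illustration. */
typedef size_t (*strlen_fn)(const char *);

/*
 * Look up "strlen" among the symbols already loaded into the process
 * and call it through the converted pointer.
 * Returns the computed length, or (size_t)-1 if the lookup fails.
 */
size_t strlen_via_dlsym(const char *s)
{
    /* A null file name asks for the global symbol table handle. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return (size_t)-1;

    void *sym = dlsym(handle, "strlen");   /* dlsym() always returns void * */
    if (sym == NULL) {
        dlclose(handle);
        return (size_t)-1;
    }

    /* The conversion that the dlsym()-specific requirement is meant to cover. */
    strlen_fn fn = (strlen_fn)sym;
    size_t len = fn(s);

    dlclose(handle);
    return len;
}
```

Only the cast from the dlsym() result is involved here; nothing in the sketch relies on a general equivalence between function pointers and void *.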

This new bug also mentions mmap(), which I don't recall coming up before in this context. If there is existing application code that casts the mmap() return value to a function pointer, then we should consider adding an mmap() requirement similar to the current dlsym() one.

(0005195)
dannyniu (reporter)
2021-01-10 11:03
edited on: 2021-01-10 11:05

If the decision of bug-0074 was made on security grounds, then I find it not very meaningful, as modern computers are capable of enforcing code/data access control in hardware, and many operating systems are also capable of emulating such access control in software should hardware support be lacking.

The other point which says:

> it need not necessarily be supported to try to examine
> the instructions that make a function by ...

While I can't fully understand it (probably due to my knowledge of English being limited), I think such examination is more of a feature.

Lastly, the "tiny code generator" (TCG) accelerator in QEMU is known to use mmap. I'm not sure if it actually acquires executable memory pages and calls functions residing in them, but being a "code generator", I think it almost certainly does that. This last point isn't very compelling, though, as QEMU is all about things that are hardware-specific (and thus implementation-specific).

My only reason for wanting to revert the decision of bug-0074 is the desire for an intuitive, simple, and straightforward memory model and type system.

(0005196)
joerg (reporter)
2021-01-10 14:53
edited on: 2021-01-10 16:31

I don't believe that there is a need to change the mmap() requirements,
since the related tricks that may be needed could be hidden behind dlopen().

The mentioned change was needed to permit dlsym() to work both for
functions and data objects. The reason, however, was not security or
address space division, but that some older IBM mainframes had a
different bit-ness for code and data. The change requires the provider
of a POSIX platform to define the type void * to be usable for both
function pointers and data pointers, while the C standard only requires
void * to work for data pointers.

I don't believe there is a need to change the current POSIX text
unless you would like to explicitly permit mmap() to be useful for
self-managed mapped code segments. Even with that background, I cannot
see a problem, since mmap() already returns void * and there is PROT_EXEC.
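
As a rough sketch of the mmap() side, the following C code maps a page, uses the returned void * as an object pointer, and then asks for PROT_EXEC. The function name map_page_and_request_exec is hypothetical. MAP_ANONYMOUS is assumed to be available as a common extension (it is not part of Issue 7), and _DEFAULT_SOURCE is a glibc-style feature macro assumed here to expose it. The sketch deliberately stops short of converting the mapping to a function pointer and calling it, since the page contents would be machine-specific and that conversion is exactly what is under discussion.

```c
#define _DEFAULT_SOURCE   /* assumption: exposes MAP_ANONYMOUS on glibc-style systems */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one anonymous page, write to it through the void * that mmap()
 * returns, then request read+execute permission with mprotect().
 * Returns 0 if the mapping and the write worked, -1 otherwise.
 * If exec_ok is non-null, *exec_ok is set to 1 when PROT_EXEC was
 * granted and 0 when the system refused it.
 */
int map_page_and_request_exec(int *exec_ok)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* Object-pointer use of the mapping: well defined in ISO C. */
    unsigned char *bytes = mem;
    bytes[0] = 0x2a;
    int result = (bytes[0] == 0x2a) ? 0 : -1;

    /* A system with strict execute policies may refuse this request. */
    int granted = (mprotect(mem, len, PROT_READ | PROT_EXEC) == 0);
    if (exec_ok != NULL)
        *exec_ok = granted;

    munmap(mem, len);
    return result;
}
```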

(0005197)
shware_systems (reporter)
2021-01-10 16:29

It is because of those hardware access controls that the need arises to distinguish between code and object pointers. With some processors, a code pointer and an object pointer may have different bit widths, and the widths can differ from one executable to the next. This is notably evident with the Intel 80286 processors using various memory models. In one executable a code pointer can be 32 bits wide while object pointers are only 16 bits wide; in another, code pointers may be 16 bits and object pointers 32. To support dlsym(), the storage for a void * therefore has to be 32 bits wide so that it can hold either possibility.
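
The width argument can be made concrete with a small C sketch. The function names below are hypothetical, and the size comparison is only a necessary condition: equal or larger storage does not by itself guarantee that the conversion preserves the value.

```c
#include <stddef.h>

/* Width, in bytes, of an object pointer on this implementation. */
size_t object_pointer_size(void)
{
    return sizeof(void *);
}

/* Width, in bytes, of a function pointer on this implementation. */
size_t function_pointer_size(void)
{
    return sizeof(void (*)(void));
}

/*
 * Returns 1 if a void * has at least as much storage as a function
 * pointer, which it needs in order to carry a dlsym() result that
 * refers to a function.
 */
int void_ptr_can_hold_function_ptr(void)
{
    return function_pointer_size() <= object_pointer_size();
}
```

On a flat-address-space system both sizes are typically equal; on a segmented model such as the one described above, the implementation would have to size void * to the wider of the two.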

The other salient aspect to these is that only object pointers are writable. Any data objects pointed to by code pointers are const qualified, enforced by the hardware on store dereferences if code attempts it. Because they cannot be modified, there is no reason they need to be readable either. A compiler can as easily produce a reference to an area reserved for objects as into a function body and put the data there, after all. It is up to the compiler to manage function entry points and branch labels as rvalues only.

From the standard's perspective the only interfaces that should be modifying code pointers are the exec family and dlopen(), as trusted code doing load time relocation calculations. Any other code has to be presumed to have malicious intent in even trying to read a function body, as the security consideration.

While there are classes of applications where this isn't the intent, such as in-situ debuggers, the specification of these is considered out of scope for the standard. This is because they are so dependent on hardware knowledge and unspecified implementation support for them.

(0005202)
dannyniu (reporter)
2021-01-16 11:27

<signal.h> describes siginfo_t and claims it has the following member:

void *si_addr // Address of faulting instruction

- Issue History
Date Modified Username Field Change
2020-12-24 03:47 dannyniu New Issue
2020-12-24 03:47 dannyniu Name => DannyNiu/NJF
2020-12-24 03:47 dannyniu Section => <dlfcn.h>, <sys/mman.h>
2020-12-24 03:47 dannyniu Page Number => many
2020-12-24 03:47 dannyniu Line Number => many
2021-01-04 10:49 geoffclare Note Added: 0005189
2021-01-04 10:50 geoffclare Relationship added related to 0000074
2021-01-04 10:50 geoffclare Note Edited: 0005189
2021-01-10 11:03 dannyniu Note Added: 0005195
2021-01-10 11:05 dannyniu Note Edited: 0005195
2021-01-10 14:53 joerg Note Added: 0005196
2021-01-10 16:29 shware_systems Note Added: 0005197
2021-01-10 16:31 joerg Note Edited: 0005196
2021-01-16 11:27 dannyniu Note Added: 0005202

