|Viewing Issue Simple Details|
|ID||Category||Severity||Type||Date Submitted||Last Update|
|0001439||[1003.1(2016/18)/Issue7+TC2] Base Definitions and Headers||Editorial||Clarification Requested||2020-12-24 03:47||2021-01-10 16:29|
|Final Accepted Text|
|Summary||0001439: The POSIX standard did not distinguish object and function pointers as did the C standard.|
The C standard distinguishes function pointers from object pointers. The POSIX standard treats them as the same in <dlfcn.h> and <sys/mman.h>, where the return type of `dlsym` and `mmap` is `void *`.
However, the specification of ILP32 and LP64 memory models can be seen as implying the standard is treating function and object pointers as the same.
|Desired Action||Explicitly state that function and object pointers are the same, and share the same process address space.|
|Tags||No tags attached.|
edited on: 2021-01-04 10:50
This is effectively requesting that we revert to the situation before bug 0000074 was applied in Issue7 TC1. Prior to that, XSH7 included this section:
2.12.3 Pointer Types
This had been added in the original Issue 7 because of dlsym(), but the decision made for bug 74 was that we should instead have a much narrower requirement specific to the conversions needed for dlsym() rather than this very general requirement.
This new bug also mentions mmap(), which I don't recall coming up before in this context. If there is existing application code that casts the mmap() return value to a function pointer, then we should consider adding an mmap() requirement similar to the current dlsym() one.
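As an illustration of the narrow dlsym() conversion discussed in this note, here is a minimal sketch. It assumes a dynamically linked program in which `strlen` is visible through the handle returned by `dlopen(NULL, ...)`. The names `strlen_fn` and `lookup_strlen` are hypothetical and exist only for this example. Some older systems also need the program to be linked with `-ldl`.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical typedef for illustration: the real type of strlen(). */
typedef size_t (*strlen_fn)(const char *);

/* Hypothetical helper: look up strlen through the global symbol handle
   and return it as a typed function pointer, or NULL on failure. */
static strlen_fn lookup_strlen(void)
{
    /* A null pathname asks for a handle on the main program's global
       symbol scope, which includes its loaded dependencies. */
    void *handle = dlopen(NULL, RTLD_LAZY);
    if (handle == NULL)
        return NULL;

    /* dlsym() returns void *. ISO C does not define the conversion of
       void * to a function pointer; POSIX requires it to work for
       values returned by dlsym(). */
    void *sym = dlsym(handle, "strlen");
    return (strlen_fn)sym;
}
```

A caller would check the result for NULL before use, for example `strlen_fn f = lookup_strlen(); if (f != NULL) n = f("posix");`. A compiler in pedantic ISO C mode may warn about the cast, which is the distinction this bug report is about.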
edited on: 2021-01-10 11:05
If the decision of bug-0074 was made for security reasons, then I find it not very meaningful, as modern computers are capable of enforcing code/data access control in hardware, and many operating systems are also capable of emulating such access control in software should hardware support be lacking.
The other point which says:
> it need not necessarily be supported to try to examine
> the instructions that make a function by ...
While I can't fully understand it (probably due to my knowledge of English being limited), I think such examination is more of a feature.
Lastly, the "tiny code generator" (TCG) accelerator in QEMU is known to use mmap. I'm not sure whether it actually acquires executable memory pages and calls functions residing in them, but being a "code generator", I think it almost certainly does. This last point isn't very compelling, though, as QEMU is all about things that are hardware-specific (and thus implementation-specific).
My only reason for wanting to revert the decision of bug-0074 is the desire for an intuitive, simple, and straightforward memory model and type system.
edited on: 2021-01-10 16:31
I don't believe that there is a need to change the mmap() requirements,
since the related tricks that may be needed could be hidden behind dlopen().
The mentioned change was needed to permit dlsym() to work both for
functions and data objects. The reason, however, was not security or
address space division but that some older IBM mainframes had a different
bit-ness for code and data. The change requires the provider of a
POSIX platform to define the type void * to be usable for both
function pointers and data pointers, while the C standard only requires
void * to work for data pointers.
I don't believe there is a need to change the current POSIX text
unless you would like to explicitly permit mmap() to be used for
self-managed mapped code segments. Even with that background, I cannot
see a problem, since mmap() already returns void * and there is PROT_EXEC.
It is because of those hardware access controls that the need arises to distinguish between code and object pointers. With some processors, a code pointer and an object pointer may have different bit widths, and the widths can differ from one executable to the next. This is notably evident with the Intel 80286 processors using various memory models. In one executable a code pointer can be 32 bits wide while object pointers are only 16 bits wide. In another, code pointers may be 16 bits and object pointers 32. To support dlsym(), this requires the storage for a void * to be 32 bits so that it can hold either possibility.
The other salient aspect is that only data reached through object pointers is writable. Any data pointed to by a code pointer is effectively const qualified, which the hardware enforces on store dereferences if code attempts one. Because such data cannot be modified, there is no reason it needs to be readable either. A compiler can, after all, as easily produce a reference to an area reserved for objects as into a function body, and put the data there. It is up to the compiler to manage function entry points and branch labels as rvalues only.
From the standard's perspective the only interfaces that should be modifying code pointers are the exec family and dlopen(), as trusted code doing load time relocation calculations. Any other code has to be presumed to have malicious intent in even trying to read a function body, as the security consideration.
While there are classes of applications where this isn't the intent, such as in-situ debuggers, the specification of these is considered out of scope for the standard. This is because they are so dependent on hardware knowledge and unspecified implementation support for them.
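The storage requirement described in this note (a void * wide enough to hold either kind of pointer) can be checked with a short sketch. It assumes a C11 compiler and a platform on which the conversion is supported, as POSIX requires for dlsym() results. The names `generic_fn`, `sample`, and `round_trips` are hypothetical and exist only for this example.

```c
#include <assert.h>

/* Hypothetical generic function pointer type for illustration. */
typedef void (*generic_fn)(void);

/* Compile-time check: void * must be at least as wide as a function
   pointer for dlsym() to return either kind of address in it. */
static_assert(sizeof(void *) >= sizeof(generic_fn),
              "void * is too narrow to store a function pointer");

/* Hypothetical function whose address is used for the round trip. */
static void sample(void)
{
}

/* Convert a function pointer to void * and back, then report whether
   the original value survived. ISO C does not define this conversion;
   it relies on the platform behaving as the POSIX dlsym() text requires. */
static int round_trips(generic_fn fn)
{
    void *stored = (void *)fn;
    generic_fn back = (generic_fn)stored;
    return back == fn;
}
```

On an implementation of the kind described above, where code pointers are wider than the storage chosen for void *, the static assertion would fail at compile time instead of the program silently truncating the address.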
|2020-12-24 03:47||dannyniu||New Issue|
|2020-12-24 03:47||dannyniu||Name||=> DannyNiu/NJF|
|2020-12-24 03:47||dannyniu||Section||=> <dlfcn.h>, <sys/mman.h>|
|2020-12-24 03:47||dannyniu||Page Number||=> many|
|2020-12-24 03:47||dannyniu||Line Number||=> many|
|2021-01-04 10:49||geoffclare||Note Added: 0005189|
|2021-01-04 10:50||geoffclare||Relationship added||related to 0000074|
|2021-01-04 10:50||geoffclare||Note Edited: 0005189|
|2021-01-10 11:03||dannyniu||Note Added: 0005195|
|2021-01-10 11:05||dannyniu||Note Edited: 0005195|
|2021-01-10 14:53||joerg||Note Added: 0005196|
|2021-01-10 16:29||shware_systems||Note Added: 0005197|
|2021-01-10 16:31||joerg||Note Edited: 0005196|