ID | Category | Severity | Type | Date Submitted | Last Update | |||||||
0001387 | [1003.1(2008)/Issue 7] System Interfaces | Editorial | Clarification Requested | 2020-08-10 16:46 | 2020-08-20 19:48 | |||||||
Reporter | rhansen | View Status | public | |||||||||
Assigned To | ajosey | |||||||||||
Priority | normal | Resolution | Open | |||||||||
Status | Under Review | |||||||||||
Name | Richard Hansen | |||||||||||
Organization | ||||||||||||
User Reference | ||||||||||||
Section | malloc | |||||||||||
Page Number | 1295 (Issue 7 2018 edition) | |||||||||||
Line Number | 43161 (Issue 7 2018 edition) | |||||||||||
Interp Status | --- | |||||||||||
Final Accepted Text | ||||||||||||
Summary | 0001387: Should EAGAIN be acceptable for malloc failure? | |||||||||||
Description |
On failure, the implementations of malloc from Solaris, OpenSolaris, OpenIndiana, illumos, etc. set errno to either ENOMEM or EAGAIN (see https://illumos.org/man/3c/malloc). It seems to me that EAGAIN is used for at least a subset of ordinary out-of-memory conditions, so on the surface these implementations appear to be non-conforming. I am unfamiliar with the implementation details, and the phrase "Insufficient storage space is available" could be interpreted in a few subtly different ways, so perhaps one could argue that EAGAIN is only used for error cases other than "Insufficient storage space is available" (which is permitted by the standard). IIUC, Solaris has behaved this way for a very long time. If the implementations are considered to be non-conforming, then it might make more sense to change the standard to permit EAGAIN than to change the implementations. Link to HTML version of Issue 7 2018 edition: https://pubs.opengroup.org/onlinepubs/9699919799/functions/malloc.html |
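For readers who want to see which value a given platform reports, here is a minimal sketch in C. The helper name `try_alloc` is hypothetical and not part of any standard or of the reporter's text; it only records whatever errno the implementation leaves behind after a failed malloc().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: call malloc() and capture the errno it set.
 * On success *err_out is 0. On failure *err_out holds whatever the
 * implementation stored: ENOMEM per POSIX, or possibly EAGAIN on
 * Solaris-derived systems according to the illumos man page. */
void *try_alloc(size_t size, int *err_out)
{
    assert(err_out != NULL);

    errno = 0;                      /* avoid reporting a stale value */
    void *p = malloc(size);
    *err_out = (p == NULL) ? errno : 0;
    return p;
}
```

A caller would compare `*err_out` against ENOMEM and EAGAIN after a NULL return. Note that a zero-size request may legitimately return NULL, so the sketch is only meaningful for non-zero sizes.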
|||||||||||
Desired Action |
On page 1295 line 43161, change: "[ENOMEM] Insufficient storage space is available." to: "[ENOMEM], [EAGAIN] Insufficient storage space is available." A similar change would be required for every other function that allocates memory, assuming the implementations of those functions on Solaris and friends use malloc and leave errno unmodified on malloc failure. |
|||||||||||
Tags | No tags attached. | |||||||||||
Attached Files | ||||||||||||
(0004914) rhansen (manager) 2020-08-10 17:09 |
A problem arises if we were to replace all instances of [ENOMEM] with [ENOMEM], [EAGAIN]: Some functions already specify [EAGAIN] for other error conditions. These include:
|
(0004915) Don Cragun (manager) 2020-08-10 17:39 |
Rather than: [ENOMEM], [EAGAIN] Insufficient storage space is available. I would prefer to see: [EAGAIN] Allocating the requested storage space would cause the thread to be blocked. [ENOMEM] Insufficient storage space is available. |
(0004916) alanc (reporter) 2020-08-10 17:45 edited on: 2020-08-10 17:45 |
I believe the Solaris behavior originally came from passing through sbrk() failures without checking/changing the reported errno value, and thus distinguishes between hitting some limit, for which trying again is not worthwhile, vs. waiting for other processes to exit or otherwise free up memory. |
(0004918) shware_systems (reporter) 2020-08-11 01:08 |
EAGAIN, if added as Don proposes, should be a may fail case, not both as shall fail. As ENOMEM is currently synonymous with a null return with most platforms, adding a symbolic non-NULL ptr, i.e. AGAIN_PTR, would be in keeping with the C standard only using the return value to indicate errors. The implementation-defined errno possible with 0 size allocs also needs wording that it doesn't conflict with either of these, as well. |
(0004923) joerg (reporter) 2020-08-14 09:50 edited on: 2020-08-14 09:51 |
Re: Note: 0004916 Hi Alan, do you know where EAGAIN is created in the Solaris kernel while running sbrk()? Due to the object-oriented design of the address space administration in SunOS, it is hard to find the right location in the code. What I can say, however, is that since SunOS-4.0 (from late 1987), kmem_alloc() with a flag of KM_NOSLEEP returns EAGAIN if the operation would otherwise sleep. The anon pages segment driver vm_anon.c, however, only returns ENOMEM; the segvn driver seg_vn.c returns ENOMEM in plenty of cases (e.g. with shared mappings that are not expected to apply to sbrk()), but also ENOMEM in other cases. |
(0004927) Konrad_Schwarz (reporter) 2020-08-19 09:53 |
Shouldn't this fall under the following provision in https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_03 ? "Implementations may support additional errors not included in this list, may generate errors included in this list under circumstances other than those described here, or may contain extensions or limitations that prevent some errors from occurring." |
(0004932) rhansen (manager) 2020-08-19 19:57 |
> Shouldn't this fall under the following provision in [...] Yes, if you interpret illumos's EAGAIN case to be distinct from "insufficient storage space is available." To me, it doesn't feel any different, but I can see how others would feel otherwise. If I saw Don's suggested wording (Note: 0004915) in the standard, then I would interpret "insufficient storage space is available" more narrowly than I do now. |
(0004933) Konrad_Schwarz (reporter) 2020-08-20 07:23 |
But isn't this a distinction without a difference? In the vast majority of cases, resolution of the EAGAIN error depends on the actions of other processes in the system, over which the application has no control. For fork(), I think a case can be made that in Unix, many processes are short-lived and therefore a retry after a limited period can be worthwhile (given a fixed-size process table). The same applies to open file descriptors. For memory, in a system with long-lived, memory-hogging processes (e.g., large databases), a retry after memory allocation failure does not seem worthwhile. If POSIX lists EAGAIN as an alternative for ENOMEM everywhere ENOMEM is documented, scrupulous application programmers will have to handle EAGAIN explicitly, presumably differently from the ENOMEM case (i.e., by sleeping and then retrying). As it is now, if malloc() returns EAGAIN, this would be handled by fully-conforming code under the "additional errors" case, e.g., with perror(). This gives the system administrator the input that more memory or more swap space needs to be installed; it is not something that the application can handle in a useful way. |
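To make the "sleeping and then retrying" burden concrete, here is a minimal sketch of what explicit EAGAIN handling could look like. Everything named here (`alloc_fn`, `alloc_retry_on_eagain`, `flaky_alloc`, `flaky_failures_left`) is hypothetical and introduced only for illustration; the retry count and delay are arbitrary assumptions, and nothing in the note suggests this is a recommended policy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Allocator signature, so a stand-in can replace malloc() in a demo. */
typedef void *(*alloc_fn)(size_t);

/* Hypothetical policy: retry only when the allocator reports EAGAIN,
 * pausing delay_ms between attempts. Any other errno is handed back to
 * the caller at once. On failure, returns NULL with errno left as the
 * allocator set it. */
void *alloc_retry_on_eagain(alloc_fn alloc, size_t size,
                            int max_attempts, long delay_ms)
{
    assert(alloc != NULL && max_attempts > 0 && delay_ms >= 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        errno = 0;
        void *p = alloc(size);
        if (p != NULL)
            return p;
        if (errno != EAGAIN || attempt == max_attempts)
            return NULL;            /* errno still describes the failure */

        struct timespec ts = {
            .tv_sec  = delay_ms / 1000,
            .tv_nsec = (delay_ms % 1000) * 1000000L
        };
        nanosleep(&ts, NULL);       /* an early wake-up is harmless here */
    }
    return NULL;
}

/* Stand-in allocator for demonstration: fails with EAGAIN a set number
 * of times, then defers to malloc(). */
int flaky_failures_left = 0;

void *flaky_alloc(size_t size)
{
    if (flaky_failures_left > 0) {
        flaky_failures_left--;
        errno = EAGAIN;
        return NULL;
    }
    return malloc(size);
}
```

Code that follows the "additional errors" route instead needs none of this: it checks for NULL and reports errno with perror(), whichever value was set.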
(0004935) rhansen (manager) 2020-08-20 17:58 edited on: 2020-08-20 18:10 |
> But isn't this a distinction without a difference? I agree with your analysis. That's why it feels wrong to me that Solaris-based systems use EAGAIN—they should use ENOMEM instead. I see these options:
|
(0004936) alanc (reporter) 2020-08-20 18:14 |
There is an open bug report against Solaris for this: 15109791 "malloc(3C) fails with EAGAIN where ENOMEM is expected", and it could be fixed if needed; it's just never been a high priority, since few applications do anything differently for EAGAIN vs. ENOMEM return values: most just print the error message from them. That bug notes the historical distinction was: ENOMEM: The physical limits of the system are exceeded by size bytes of memory which cannot be allocated. EAGAIN: There is not enough memory available to allocate size bytes of memory; but the application could try again later. It also notes that Solaris has multiple malloc implementations, and not all of them made this distinction. (We're currently up to 8 different malloc library options in Solaris 11.4; see the ALTERNATIVE IMPLEMENTATIONS section at the end of: https://docs.oracle.com/cd/E88353_01/html/E37843/malloc-3c.html#scrolltoc .) |
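The historical distinction quoted above can be sketched as a small message helper. The function name `describe_alloc_errno` and the message wording are hypothetical; the sketch only restates the two cases from the bug report and falls back to strerror() for any other value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn the errno from a failed malloc() into a
 * message. ENOMEM and EAGAIN follow the historical Solaris distinction
 * quoted in the note; every other value falls back to strerror(). */
const char *describe_alloc_errno(int err, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    switch (err) {
    case ENOMEM:
        snprintf(buf, buflen,
                 "ENOMEM: system limits exceeded; retrying is unlikely to help");
        break;
    case EAGAIN:
        snprintf(buf, buflen,
                 "EAGAIN: memory temporarily unavailable; a later retry might succeed");
        break;
    default:
        snprintf(buf, buflen, "errno %d: %s", err, strerror(err));
        break;
    }
    return buf;
}
```

As the note observes, most applications would do no more than print such a message, which is why the difference between the two values rarely affects behavior in practice.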
(0004937) rhansen (manager) 2020-08-20 18:24 edited on: 2020-08-20 19:44 |
"So you're telling me there's a chance. YEAH!" :) (reference: https://www.imdb.com/title/tt0109686/quotes/qt0995799 [^] ) Joking aside, it's good news that Oracle isn't totally opposed to changing the behavior. |
(0004938) rhansen (manager) 2020-08-20 18:30 |
Is that Solaris bug report publicly accessible? |
(0004939) alanc (reporter) 2020-08-20 19:42 |
Sorry, Oracle only makes Solaris bug reports accessible to customers with support contracts, not to the general public. (And yes, there's a chance, given a good reason, but that would only affect future support updates to Solaris 11.4, not decades of past releases that have had this behavior.) |
(0004940) rhansen (manager) 2020-08-20 19:48 edited on: 2020-08-20 19:49 |
> there's a chance, given a good reason Do you know if POSIX conformance would be a good enough reason by itself? > that would only affect future support updates to Solaris 11.4, > not decades of past releases that have had this behavior From a POSIX perspective that's OK—only changing new releases is good enough to avoid rewording the standard. |