Anonymous | Login | 2024-09-12 22:36 UTC |
Main | My View | View Issues | Change Log | Docs |
Viewing Issue Simple Details [ Jump to Notes ] | [ Issue History ] [ Print ] | ||||||
ID | Category | Severity | Type | Date Submitted | Last Update | ||
0001731 | [1003.1(2016/18)/Issue7+TC2] System Interfaces | Objection | Clarification Requested | 2023-05-23 09:43 | 2024-06-11 09:07 | ||
Reporter | geoffclare | View Status | public | ||||
Assigned To | |||||||
Priority | normal | Resolution | Accepted As Marked | ||||
Status | Closed | ||||||
Name | Geoff Clare | ||||||
Organization | The Open Group | ||||||
User Reference | |||||||
Section | pthread_sigmask() | ||||||
Page Number | 1734 | ||||||
Line Number | 56243 | ||||||
Interp Status | --- | ||||||
Final Accepted Text | See Note: 0006327. | ||||||
Summary | 0001731: pthread_sigmask() pending signal requirement time paradox | ||||||
Description |
In this statement:If there are any pending unblocked signals after the call to sigprocmask(), at least one of those signals shall be delivered before the call to sigprocmask() returns.the normal interpretation of "after the call" would be after the call returns, but that obviously can't be what is intended here because it states an action to be performed before the call returns, which would result in a time paradox. |
||||||
Desired Action |
After applying bug 1636, change:If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.to: If the argument set is not a null pointer and the change made to the currently blocked set of signals causes any pending signals to become unblocked, at least one of those signals shall be delivered before the call to pthread_sigmask() returns. |
||||||
Tags | applied_after_i8d3, tc3-2008 | ||||||
Attached Files | |||||||
|
Notes | |
(0006287) kre (reporter) 2023-05-23 21:08 |
I believe the proposed wording in the desired action adds an unintended subtle difference to how this is intended to work. Previously, setting aside the temporal distortion, which I agree should be fixed, the intent was that if there are *any* unblocked signals, one of them gets delivered. The proposed wording requires that one of the signals that became unblocked by the call be delivered. This makes a difference if some other (unblocked) signal just happened to arrive during the system call, which unblocks some other signal, the system is not permitted to deliver the newly arrived signal, but must deliver one of the ones that had been blocked. I don't think that change is really intended. I'd keep it simpler, something like Before the pthread_sigmask() call returns, if there are any pending unblocked signals, at least one of those shall be delivered. |
(0006288) geoffclare (manager) 2023-05-25 08:40 |
Yes, there is a subtle difference in the new wording, but I believe it better matches how people expect implementations to behave and I would be very surprised if any implementation exists that does not behave that way. Your alternative suggestion would allow an implementation to get away with not delivering any of the pending signals that the mask change unblocks if a signal that is not blocked happens to be delivered before pthread_sigmask() returns and is generated a second time during execution of a signal catching function for that signal. Why should that be allowed? |
(0006292) kre (reporter) 2023-05-25 18:26 edited on: 2023-05-25 18:32 |
Re Note: 0006288 There are two important general comments relevant here, before getting to the details of the specific issue: First: how people expect implementations to behave is not, or should not be, relevant here. What matters is how implementations do behave, not how someone would like them to. And even more important: Yes, there is a subtle difference in the new wording If that was intentional, and not just a poor choice of replacement text, then we have real problems. Attempting to insert changes in the way that the specification is written, via underhand mechanisms like that, is totally unacceptable. If the real reason for this defect report, is to make a change in the way that implementations are required to behave, then the description of the problem should have been clear about that - not just hiding things as if it were just some minor grammatical error that should be fixed, which was how it was presented. While the grammar of the current standard could be improved, to avoid the issue mentioned in the description, there is no real need to do so from the point of view of the standard itself. What is required is clear, only someone looking at it from a peculiar language perspective would even notice the problem. My "alternative suggestion" is simply restating what the standard currently says, in a way that avoids the problematic language. I'm sure there are other ways that would work as well, if you don't like my suggested wording. That is, which would work without changing the standard. As to how implementations work - it has been a while since I got down and dirty at the syscall interface, but the way all this used to work (and probably still does, since I cannot imagine much reason for changing it) is that signals cannot be delivered to an application while the kernel is currently executing code on behalf of the application (ie: when it is running the code of a system call). But signals can arrive during that period - either as a result of the system call itself, or simply by chance (sent from some other process, perhaps a child exiting causing SIGCHLD, or from another thread of this process perhaps). So the general method was, just as the system call is ending, and control is about to be returned to the user, the kernel checks to see if there are any pending unblocked signals - anything that would already have been delivered, had we not been in a system call. At this point, the code often barely knows (perhaps doesn't know at all) what system call is ending, and certainly has no idea what signals that are pending might have been generated or unmasked, by this system call. All it knows is that there is a pending signal, and that signal can now be delivered, so it is made to happen. Do you have some evidence that there are systems that behave differently than that, or do you just think there ought to be? |
(0006293) kre (reporter) 2023-05-25 18:44 |
I should also say, that testing what happens in this scenario is very hard - it is not easy to arrange for some other signal to arrive right at the precise time that it matters (not easy to prevent either). What that means is that 99.9% (nb: that is not a measured statistic) of the time, the only pending unblocked signals after this system call will be ones that it unblocked (since it does not otherwise generate signals). Some other signal, if it arrived just before the system call would have already been delivered. If it arrives just after, then one of the newly unblocked signals would have already been delivered. Arranging for things to happen so that the difference in behaviour being discussed is apparent is not at all easy. The only really sane way to investigate this is to read the code for various systems, and ask "what if?" several times. |
(0006294) geoffclare (manager) 2023-05-30 11:14 |
When I submitted this bug I had no expectation that it would be in any way controversial. I thought it was just another "everyone knows" type of bug, i.e. a class of bug that comes up from time to time whereby readers' prior knowledge of what a function does causes them not to notice that the wording in the standard does not correctly describe its behaviour, even (as in this case) if there is an error that's glaringly obvious once it has been pointed out. The time paradox in the wording here dates back to the sigprocmask() description in the original POSIX.1-1988 standard. The wording I suggested in the desired action was, I thought, a good way to describe how "everyone knows" pthread_sigmask() behaves. The subtle difference in required behaviour that it would introduce had not occurred to me until kre pointed it out. I agree with kre that we need to find out how implementations behave, and that it would be impractical to try and test it, so the best way to find out is to examine their source code. For historical perspective, since signal masks originated in BSD, we should probably also look at a version of BSD from before it started to diverge into different variants. |
(0006296) kre (reporter) 2023-05-31 19:50 |
Re Note: 0006294 we should probably also look at a version of BSD from before it started to diverge into different variants. That's easy, this is the implementation of the system call in 4.4_lite (or something around that time - from the final CSRG SCCS files). /* * Manipulate signal mask. * Note that we receive new mask, not pointer, * and return old mask as return value; * the library stub does the rest. */ int sigprocmask(p, uap, retval) register struct proc *p; struct sigprocmask_args /* { syscallarg(int) how; syscallarg(sigset_t) mask; } */ *uap; register_t *retval; { int error = 0; *retval = p->p_sigmask; (void) splhigh(); switch (SCARG(uap, how)) { case SIG_BLOCK: p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask; break; case SIG_UNBLOCK: p->p_sigmask &= ~SCARG(uap, mask); break; case SIG_SETMASK: p->p_sigmask = SCARG(uap, mask) &~ sigcantmask; break; default: error = EINVAL; break; } (void) spl0(); return (error); } That's it, There are also compat sys calls, for the earlier implementation where there were separate sys calls for the functions, rather than one with an arg saying what to do (as above). int compat_43_sigblock(p, uap, retval) register struct proc *p; struct compat_43_sigblock_args /* { syscallarg(int) mask; } */ *uap; register_t *retval; { (void) splhigh(); *retval = p->p_sigmask; p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask; (void) spl0(); return (0); } int compat_43_sigsetmask(p, uap, retval) struct proc *p; struct compat_43_sigsetmask_args /* { syscallarg(int) mask; } */ *uap; register_t *retval; { (void) splhigh(); *retval = p->p_sigmask; p->p_sigmask = SCARG(uap, mask) &~ sigcantmask; (void) spl0(); return (0); } The latter was used to accomplish unblock (generally by setting the mask to what was returned by sigblock()). These are called from machine depenendent code (since the way sys calls are encoded and their args passed, varies from architecture to architecture). Taking the vax as the original, and oldest, sys calls ar handled (big surprise) by the syscall() function ... I'm not going to include all of the code for that, it is fairly long, and most of what it is doing isn't relevant, but ... It starts with code that checks the syscall() call orjginated in user mode (if already in the kernel, it panics), then it works out what the PC was (the length of the instr) which varies depending upon the sys call number (how many bytes it takes to represent that) - deals with syscall(0) (the indirect sys call) and then checks the sys call number is in range. Once that is known, it can copy in the expected args (from the sys call table, which says how many there should be) after which (ignoring some ktrace noise) it does: rval[0] = 0; rval[1] = locr0[R1]; error = (*callp->sy_call)(u.u_procp, &args, rval); if (error == ERESTART) pc = opc; else if (error != EJUSTRETURN) { if (error) { locr0[R0] = error; locr0[PS] |= PSL_C; /* carry bit */ } else { locr0[R0] = rval[0]; locr0[R1] = rval[1]; locr0[PS] &= ~PSL_C; } } rval is where the results get passed back, and is init'd t what will be returned in the normal case (if a syscall has no particular result to return, and succeeds). The indirect function call calls one of the functions above in the cases of interest. If the system call indicates it should be restarted, the pc is pucked backwards (opc is the calculated pc of the system call instruction from earlier). Otherwise if error isn' EJUSTRETURN (which a sys call function can return to indicate it has already set up the result) then if there was an error, R0 (the result) gets set to the error number (which userland code will store in errno) and the carry bit is set to indicate an error occurred, otherwise the results are moved back to the user's register save area, and the carry bit is cleared. That's followed by the most relevant part for this discussion: /* * Reinitialize proc pointer `p' as it may be different * if this is a child returning from fork syscall. */ p = u.u_procp; if (i = CURSIG(p)) postsig(i); (no, not the update of 'p') - CURSIG is a macro: /* * Determine signal that should be delivered to process p, the current * process, 0 if none. If there is a pending stop signal with default * action, the process stops in issignal(). */ #define CURSIG(p) \ (((p)->p_siglist == 0 || \ ((p)->p_flag & P_TRACED) == 0 && \ ((p)->p_siglist & ~(p)->p_sigmask) == 0) ? \ 0 : issignal(p)) It could just be issignal(p) but that would mean calling the function (function calls were relatively expensive on the vax) in cases where it is obvious that all the function will do is return 0 ... so the macro extracts the most likely tests to cause that to happen, and has them evaluated in line, so we can avoid calling issignal() in a relativelu fast path piece of code (executed for every sys call made) if it obviously won't be needed. In the cases we're interested in, that won't work, and we have to call issignal() anyway. If the return from all of this is != 0, it is a signal number to deliver to the application (and postsig() does that). I'll get to issignal() soon. The rest of syscall() is just finishing up, first we check whether this process should give up the CPU (there was only one of them) or not - if we do we repeat the if (i = CURSIG(p)) postsig(i); after the context switch was made - when we get back to here in that case, there might have been a signal delivered while we weren't on the CPU, and we want that to be delivered as well). Then there's some profiling related code (adding the time spent in the sys call to the time spent at the sys call instruction for the purposes of measuring where the application is spending its time) and more ktrace stuff. Then we're done, and the application runs again. postsig() does the machine independent part of delivering a signal, handling it if the action is just to kill the process (if it was to be ignored, it would have been dropped when the signal was first generated) plus dealing with sigaction and such - to block signals while the handler is running, if needed, then calls the machine dependent sendsig() if a user signal handler needs to be called. sendsig() just establishes the application stack state, and pc, so it looks as if a call to the signal handler was just made, and arranges for when that returns (if it does) the correct actions get taken. None of that is relevant right now. All that is left is issignal() which does the work of selecting which signal to deliver. /* * If the current process has received a signal (should be caught or cause * termination, should interrupt current syscall), return the signal number. * Stop signals with default action are processed immediately, then cleared; * they aren't returned. This is checked after each entry to the system for * a syscall or trap (though this can usually be done without calling issignal * by checking the pending signal masks in the CURSIG macro.) The normal call * sequence is * * while (signum = CURSIG(curproc)) * postsig(signum); */ int issignal(p) register struct proc *p; { register int signum, mask, prop; for (;;) { mask = p->p_siglist & ~p->p_sigmask; if (p->p_flag & P_PPWAIT) mask &= ~stopsigmask; if (mask == 0) /* no signal to send */ return (0); signum = ffs((long)mask); mask = sigmask(signum); prop = sigprop[signum]; /* * We should see pending but ignored signals * only if P_TRACED was on when they were posted. */ if (mask & p->p_sigignore && (p->p_flag & P_TRACED) == 0) { p->p_siglist &= ~mask; continue; } if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) == 0) { /* * If traced, always stop, and stay * stopped until released by the parent. * * Note that we must clear the pending signal * before we call trace_req since that routine * might cause a fault, calling tsleep and * leading us back here again with the same signal. * Then we would be deadlocked because the tracer * would still be blocked on the ipc struct from * the initial request. */ p->p_xstat = signum; p->p_siglist &= ~mask; psignal(p->p_pptr, SIGCHLD); do { stop(p); mi_switch(); } while (!trace_req(p) && p->p_flag & P_TRACED); /* * If parent wants us to take the signal, * then it will leave it in p->p_xstat; * otherwise we just look for signals again. */ signum = p->p_xstat; if (signum == 0) continue; /* * Put the new signal into p_siglist. If the * signal is being masked, look for other signals. */ mask = sigmask(signum); p->p_siglist |= mask; if (p->p_sigmask & mask) continue; /* * If the traced bit got turned off, go back up * to the top to rescan signals. This ensures * that p_sig* and ps_sigact are consistent. */ if ((p->p_flag & P_TRACED) == 0) continue; } /* * Decide whether the signal should be returned. * Return the signal's number, or fall through * to clear it from the pending mask. */ switch ((long)p->p_sigacts->ps_sigact[signum]) { case (long)SIG_DFL: /* * Don't take default actions on system processes. */ if (p->p_pid <= 1) { #ifdef DIAGNOSTIC /* * Are you sure you want to ignore SIGSEGV * in init? XXX */ printf("Process (pid %d) got signal %d\n", p->p_pid, signum); #endif break; /* == ignore */ } /* * If there is a pending stop signal to process * with default action, stop here, * then clear the signal. However, * if process is member of an orphaned * process group, ignore tty stop signals. */ if (prop & SA_STOP) { if (p->p_flag & P_TRACED || (p->p_pgrp->pg_jobc == 0 && prop & SA_TTYSTOP)) break; /* == ignore */ p->p_xstat = signum; stop(p); if ((p->p_pptr->p_flag & P_NOCLDSTOP) == 0) psignal(p->p_pptr, SIGCHLD); mi_switch(); break; } else if (prop & SA_IGNORE) { /* * Except for SIGCONT, shouldn't get here. * Default action is to ignore; drop it. */ break; /* == ignore */ } else return (signum); /*NOTREACHED*/ case (long)SIG_IGN: /* * Masking above should prevent us ever trying * to take action on an ignored signal other * than SIGCONT, unless process is traced. */ if ((prop & SA_CONT) == 0 && (p->p_flag & P_TRACED) == 0) printf("issignal\n"); break; /* == ignore */ default: /* * This signal has an action, let * postsig() process it. */ return (signum); } p->p_siglist &= ~mask; /* take the signal! */ } That is it. In that the important lines are signum = ffs((long)mask); where "mask" is the list of signals that are pending, with any that are blocked removed. ffs() is "find first set" - which finds the lowest bit set, and returns its number (so if SIGHUP (#1) is set, and not masked, that one comes first). P_TRACED related to ptrace() - ie: running under a debugger, we can forget that case here (code that requires P_TRACED to be set should just be skipped). The code then (if not ptrace'd) simply ignores the signal if it should be ignored, and goes and finds another (after removing this one from the list of pending signals). That would happen (for example) if a signal was pending, but the current system call had just instructed it to be ignored. If it had already been ignored, it wouldn't have been pending (ptraced'd processed excepted). The switch() looks at the user's intended disposition of the signal. Here, the only one we care about is "default" (ie: not SIG_DFL or SIG_IGN) where we just do return (signum); and we're done. The signal returned is the lowest numbered pending one. It is completely irrelevant whether this is one of the ones that was just unblocked (nothing in issignal() - which makes the choice) has any idea what those might have been, and the system calls don't save that anywhere, other than by the effect that they have on p_sigmask (the list of blocked signals - manipulated by the system calls, then used here). When I said in Note: 0006292 As to how implementations work - it has been a while since I got down and dirty at the syscall interface this vintage code is what I was referring to. I already knew (without looking) that it worked like this. I would find it surprising if any implementation was significantly different. The basic strategy of signal delivery on syscall exit, except without masked signals, dates back to the early Bell Labs releases (at least 5th edition, the earliest I ever worked with). The system call function for sigprocmask in NetBSD is: /* * Manipulate signal mask. * Note that we receive new mask, not pointer, * and return old mask as return value; * the library stub does the rest. */ int sigprocmask(p, uap, retval) register struct proc *p; struct sigprocmask_args /* { syscallarg(int) how; syscallarg(sigset_t) mask; } */ *uap; register_t *retval; { int error = 0; *retval = p->p_sigmask; (void) splhigh(); switch (SCARG(uap, how)) { case SIG_BLOCK: p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask; break; case SIG_UNBLOCK: p->p_sigmask &= ~SCARG(uap, mask); break; case SIG_SETMASK: p->p_sigmask = SCARG(uap, mask) &~ sigcantmask; break; default: error = EINVAL; break; } (void) spl0(); return (error); } which is rather similar to what was done in the CSRG BSD code. Nothing at all there about arranging to deliver one of the unblocked signals in the SIG_UNBLOCK case - just remove bits from sigmask, then let issignal() do its thing. The same general thing also happens after traps - but those can't be altering the signal mask, but are also far more likely to have generated a signal to deliver). Very similar code exists in NetBSD today, issignal() is still there, though ffs() has been renamed to firstsig() (the code now needs to be able to deal with more signals than fit as bits in an int). Otherwise it all looks quite similar. |
(0006312) geoffclare (manager) 2023-06-06 15:50 |
Re Note: 0006296 I poked around in the Illumos code on github and I think I have found the equivalents of most of the code you examined for BSD 4.4_lite. There is a major difference that affects the behaviour we are discussing. I won't post the code here as I know some people make a point not to look at code with certain licences, but will give links and try to describe what I'm seeing. The sigprocmask() code is in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/syscall/sigprocmask.c [^] It calls lwp_sigmask() to do the signal mask change. The code in there is very similar to the BSD sigprocmask() code, but in the switch it has a difference in behaviour between the cases. For SIG_UNBLOCK and SIG_SETMASK it has some code which sets the t_sig_check flag for the thread if a call to sigcheck() returns true. It does not do this for SIG_BLOCK. The code for sigcheck() is in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/sig.c [^] but everything you need to know about it is in the comment before the function definition: "Return non-zero if curthread->t_sig_check should be set to 1, that is, if there are any signals the thread might take on return from the kernel." The t_sig_check flag is used in the post_syscall() function in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/sparc/os/syscall.c [^] The code there only checks for pending signals if either t_astflag or t_sig_check is set for the thread, and has this comment: "All code that sets signals and makes ISSIG_PENDING evaluate true must set t_sig_check afterwards." Since t_sig_check isn't set by sigprocmask() when called with SIG_BLOCK, it looks like Illumos might not conform to the requirement you are proposing. |
(0006313) kre (reporter) 2023-06-06 19:24 |
Re Note: 0006312 First, I am not proposing any requirement, just to leave the requirement in this function the same as it has always been, your change was to add a new requirement (that the signal delivered be one that was just unblocked, rather than simply any pending deliverable signal). If my wording seemed to be adding something new, then use some other wording which does not. I certainly did not intend to alter anything, except to avoid the before/after issue. I see nothing in the description of the Illuminos code that you included (I am one of those "some people" though I have no idea what kind of licence Illuminos has on its code) which is different in any material way from the BSD code fragments I posted (BSD code these days is licensed mostly with a "don't sue anyone" type licence, if it matters to anyone). This doesn't mean there isn't one - just that your note did not mention it. The t_sig_check field looks like a different (and perhaps better) optimisation than what the CURSIG() macro in BSD provides; as you describe it, it merely encompasses the idea that there cannot be a signal pending unless someone has delivered one (or unblocked one) - and hence it is pointless to test whether one might now be pending (whereas the BSD code tests every time, but tries to make that test as cheap as possible). [ While it seems "obvious" it is a better way, it isn't necessarily - setting the field needs to be protected by a lock to guard against it simultaneously being cleared - that might be more costly than is warranted - or might be zero if the relevant code is already running holding a suitable lock for the purpose. And that is an extra field which makes the structure bigger, which also has costs - on fork, there is more to copy, for example, and more memory is consumed. So, this way might be improvement, or might not be, much benchmarking would be required for a definitive answer - and that might vary from system to system. ] t_sig_check doesn't need to be set in the SIG_BLOCK case, as that cannot be allowing any signals that were pending to be delivered, and it is not sending a signal of its own. I would assume that all code which sends a signal, in addition to that code which unblocks pending signals, sets t_sig_check. Check their code for the kill() system call and verify that also sets that field (on the process/thread being sent the signal of course, not the one sending it, unless they are the same), and if you feel inclined, the code that sends SIGCHLD when a child exits, which should do the same (if the signal is actually sent. or SIGALRM if a timer expires). Traps which send signals should do the same - though one of those should not normally happen to a process running inside a system call. Nothing you described has any bearing upon what signal gets delivered to the process when the sigprocmask() system all is done - whether it is one that was just unblocked, or another that happened to arrive while that system call was running. You would need to look deeper into what the post_syscall() function you mention does in that case that t_sig_check is set, to discover if it somehow makes a distinction between a signal that has been pending, and is now unblocked, and one which was never blocked, but just arrived while the system call was running (a child exiting that was running on a different CPU would, I suspect, be the most likely way that might happen - but since this is a very fast system call (only likely to block if it needs to wait on some internal data structure lock, or if is pre-empted) the chances of that happening are very small. Since you (seem to) want to change the requirement from what it has been, I'd expect it is incumbent upon you to show that at least most systems behave the way that you believe they should behave. I actually doubt that there are any - but that's just based upon the way the code was developed, and intuition, not investigation. The requirement that if a pending signal was unblocked, that it be delivered before the system call returns (which in reality means before any other application code runs after the system call returns - though it will be before the library function that made the system call returns) doesn't really need to be there at all - if there is a pending signal when any system call completes, that signal (or one of them if there is more than one) is taken before the application runs any more code (for all system calls). I suspect that the requirement is there just to make it clear that unblocking a pending signal is equivalent to sending that signal, and implementations aren't permitted to only deliver signals if the system call being run actually generated one, which is probably reasonable (someone implementing an OS from the standard, without any experience, could make that mistake). I'd suggest simply fixing the time paradox, and avoid attempting to "fix" the wording otherwise than that, as what is there now, is almost certainly exactly what it should say (things occurring after things they precede excepted, of course). Alternatively, we could just leave the time paradox there as a "quaint oddity" as it doesn't really affect anything (but no, I am not suggesting doing that, just that that would be better than the proposed change). While (re-)considering the change you are proposing, consider this case. Let's assume that SIGINT SIGQUIT and SIGTSTP are blocked, and there is a pending SIGINT waiting to be delivered, but neither of the others. The application calls pthread_sigmask() to unblock those 3 signals. In the normal course of events, the SIGINT would be delivered to the application as soon as it starts to run again - before the pthread_sigmask() call returns, and that is what you are trying to say is required. But let's assume that before the application gets put back on a CPU to run (perhaps its priority has degraded, and higher priority processes are now running on all CPUs that are available) the user types the terminal's stop character (usually ^Z) which the terminal driver turns into a progress group SIGTSTP to the process group of the terminal - one member of which, we will assume, is our application. Are you contending that the system may not deliver that SIGTSTP (instead of the SIGINT) if it so desires? Because that is what the wording you proposed would suggest. The SIGTSTP signal was not a pending signal which became unblocked by the pthread_sigmask() call, it had not occurred yet when the unblocking happened, hence it is not the case that "the change made to the currently blocked set of signals causes any pending signals to become unblocked," for that signal, but it was for SIGINT, so according to your wording, only the SIGINT can be delivered, and not the SIGTSTP. The previous (current) wording simply says that if there are any unblocked signals when the call returns (which there are now, 2 of them, SIGINT and SIGTSTP) then one of them must be delivered immediately - it does not say which, and nor should it. |
(0006314) geoffclare (manager) 2023-06-07 09:44 |
Re Note: 0006313 I should have waited before posting my previous comment, as I woke up in the night thinking much the same as what you have pointed out. The t_sig_check flag must be getting set somehow when a signal is generated during any one of the many system calls that don't change the mask, and that mechanism would take care of the SIG_BLOCK case for pthread_sigmask/sigprocmask. You suggested looking at the kill() system call, but I couldn't find that (there is no kill.c in the same directory as sigprocmask.c). However, there are various places in the sig.c file (that contains sigcheck() - see link in my previous note) which could be doing it. E.g. eat_signal() sets it, and that is called from sigtoproc() which has the comment "Post a signal. If a non-null thread pointer is passed, then post the signal to the thread/lwp, otherwise post the signal to the process." Anyway, you have convinced me that my alternative wording suggestion in the desired action is wrong and the standard should stick with saying "any pending unblocked signals". Unfortunately, your suggested wording has a different time-related problem: Before the pthread_sigmask() call returns, if there are any pending unblocked signals, at least one of those shall be delivered. Taking your example of there being a SIGINT and SIGTSTP that both qualify as a "pending unblocked signal", suppose pthread_sigmask() identifies those two and chooses to deliver the SIGTSTP before returning. Let's examine the new situation immediately after this delivery: Is "before the pthread_sigmask() call returns" true? Yes. Is "there are any pending unblocked signals" true? Yes (SIGINT is). Your wording now requires that the SIGINT is delivered before pthread_sigmask() returns. Extrapolating, it effectively would require that all pending unblocked signals are delivered before pthread_sigmask() returns. I think finding some wording that doesn't have a timing issue is not going to be simple. |
(0006316) kre (reporter) 2023-06-09 11:24 |
Re Note: 0006314 First, I don't think I've ever had a problem with the suggestions I have made for wording being improved, in fact, I usually expect that, as wordsmithing isn't really my thing - and certainly if anything I have suggested adds some unwanted requirement, that should certainly be removed. In this case, I don't see much real difference between what is in the standard now, and my suggestion (except the time paradox is gone), but as we're trying to improve that language, by all means, improve it. However wrt: Extrapolating, it effectively would require that all pending unblocked signals are delivered before pthread_sigmask() returns. That's actually what happens, in most systems anyway, though the mechanism (and precise details of how) may tend to vary. In general an application should never be running user code with a pending unblocked signal. Any signals that arrive should be processed (either cause the process to exit, or call an application signal handler (or do nothing of course)) as soon as possible. That means now (when the signal is generated) for signals that are not blocked, or as soon as the signal is unblocked for those which are. In the case of the example, when there are two pending signals when a system call returns, typically implementations behave in one of two ways. First they pick a signal that is pending to deliver (this is usually done in a deterministic way, but need not be) and set up the application state to reflect the first instruction of the user's handler routine is the next to execute, and the return address is that of the current location in the pthread_setmask() routine (immediately after the system call invoking instruction, or however the architecture makes that happen). Part of that process is applying the mask from the struct sigaction for the signal to the current signal mask - once that's done there might be no more pending unblocked signals, control returns to the application, and the handler is executed. When it completes, it notifies the system it has done so, any signals that were blocked by its sigaction become unblocked, and if there now are any pending unblocked signals, the whole process repeats, with the next pending signal (which at this stage could even be the same one for which the handler was just invoked - which can happen if the handler changes the disposition of the signal (usually) and then raises the signal again). In the case that the handler's mask did not block all the signals, and another signal remained pending and unblocked, then before resuming user mode, that signal is usually taken - which will result in things appearing to the application, if it were to look, as if the handler for the first signal invoked was interrupted immediately by the second signal - the handler for the second signal delivered runs first, then when it completes, the handler for the first signal is still waiting to executed. That might look like a strange thing to have happen, but is really no different than if the handler for the first signal selected had just started running in application mode (perhaps executed just a single machine instruction) and then the second signal arrives. Since that second signal is not blocked (that is an assumption being made here) its handler runs immediately. The effect of all of this is that yes, if there are multiple pending unblocked signals, the handlers for all of them will have been run before the pthread_sigmask() function which unblocked them (or some of them) returns. However, I don't think this function needs to require that, and it is possible that there are systems which behave differently. I assume there was some reason that the current text says "at least one of" rather than "all" (at least one doesn't preclude all, so is safe for both variations). To the unimportant parts of the note: I wouldn't expect to find a kill.c file in the kernel sources - the kill sys call is too small to warrant a file of its own - the code for it will be buried elsewhere, though I can't guess where in Illuminos (its name might also be something unexpected - that is, unexpected until you understand why it is called that - that happens sometimes when several semi-related user level functions all channel through one kernel interface - the libc code arranges for the correct thing to happen when a generic routine is called). But having it t_sig_check set by a generic signal delivery function (like that eat_signal() that you found) is how I would do it, though I doubt I would call the function by that name (but who knows). There is however nothing that needs taking care of wrt SIG_BLOCK and t_sig_check - the latter gets set when a pending signal is posted. SIG_BLOCK never causes that to happen, so doesn't need to do anything. And last, just in case it ever seems to be an issue - there is also never a problem (or shouldn't be) if t_sig_check is set, and there is no pending unblocked signal - as its name implies "check" it informs the syscall (and trap, if they're different) exit routine(s) to look for a pending signal. The only effect if it is set, when it "shouldn't be" is for the code to do a little extra work determining that - if nothing is found, it should act just the same, if fractionally slower, than if t_sig_check had not been set. |
(0006319) geoffclare (manager) 2023-06-12 08:55 edited on: 2023-06-12 14:38 |
Here's an attempt at wording that uses "any pending unblocked signals" and has no timing issues. I have kept the "set is not a null pointer" condition from the desired action in order to be able to refer to the signal mask change as a timing point. A side-effect of this is that if pthread_sigmask() is used just to obtain the current mask, there is no requirement on delivery of pending unblocked signals. This seems okay to me as an implementation might be able to satisfy an enquiry for the current mask without making a system call. After applying bug 1636, change: If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.to: If the argument set is not a null pointer, after pthread_sigmask() changes the currently blocked set of signals it shall determine whether there are any pending unblocked signals; if there are any, then at least one of those signals shall be delivered before the call to pthread_sigmask() returns. On page 1736 line 56316 section pthread_sigmask(), change APPLICATION USAGE from: None.to: Although pthread_sigmask() has to deliver at least one of any pending unblocked signals that exist after it has changed the currently blocked set of signals, there is no requirement that the delivered signal(s) include any that were unblocked by the change. In particular, when a signal is generated it can become pending for reasons other than being blocked (see [xref to 2.4.1]) and therefore the delivered signal(s) could be ones that were already unblocked but became pending during the pthread_sigmask() call. |
(0006320) kre (reporter) 2023-06-12 13:08 edited on: 2023-06-12 13:13 |
wrt Note: 0006319 The substantive change looks fine to me - I have no problem with the "If the argument set is not a null pointer" qualification, as in that case (whether or not this function is implemented as a system call in that case) no pending signals can have been unblocked - if anything unblocked had been posted earlier, it would have already been delivered. Anything that arrives just after the call will be delivered when it arrives. Anything that arrives at the same time as the call is being processed is just in a race to see which of those two cases it is considered equivalent to. Beyond that, I'd have no objection if this function were to have this whole paragraph deleted, and not say anything about signal delivery - though it is probably better that it remain. [Note: we do not say in write() that if there are any pending unblocked signals when it is about to return, that they be delivered before write returns - but if the write caused a SIGPIPE that is what happens. It just doesn't need explicit mention there. It isn't really required here either - just this is a somewhat odd case, where it probably helps.] However, assuming that we really need that APPLICATION USAGE text at all, which I doubt, it needs a wording change. In particular, signals can become pending for reasons other than being blocked makes no sense, being blocked never makes a signal become pending. Being unblocked is what does that... I know what you intended to say (or at least I can guess), which would have been fine, but those words are not it. I suspect something more like In particular, signals may have become pending for reasons other than having previously been posted, but blocked, and unblocked by this call. The delivered signal(s) could be ones that just became pending during the pthread_sigmask() call. (something like that, perhaps a little briefer). Put in the xref wherever it seems to fit best. |
(0006325) kre (reporter) 2023-06-12 22:18 edited on: 2023-06-12 22:21 |
wrt the updated Note: 0006319 I'd simply delete the whole sentence starting "In particular..." and replace it with something more like It is possible that an unrelated signal (or signals) become pending [xref] during the period the pthread_setmask() call is executing. One (or more) of those signals may be delivered to the application instead of, or in addition to, any previously blocked pending signals that were unblocked by the pthread_setmask() call. Or something along those lines, but ideally briefer. The new wording "when a signal is generated it can become pending for reasons other than being blocked" is no better than the previous attempt, all signals become pending when generated (for a short while at least, sometimes very short) all being blocked does is prevent them from being delivered for longer (until they are unblocked) - but they are always pending from the instant they are generated, until the system has an opportunity to resume execution of the process (or thread) next - in the simplest case (a system call results in a signal being posted to the invoking process) that is at least for whatever time remains during the processing/cleanup of the system call in question. In other cases (a signal delivered to a process that is swapped out) the delay before it can be resumed can be considerable - that could even happen during a pthread_sigmask() call, if the scheduler decides to run a higher priority process instead of the caller, which is then blocked waiting upon a suitable CPU, and could get swapped out while it is waiting. Unlikely perhaps on modern systems, but on some ancient systems, that kind of thing happened all the time, and is still possible. |
(0006326) geoffclare (manager) 2023-06-13 09:29 |
Re Note: 0006325 > all signals become pending when generated (for a short while at least, > sometimes very short) all being blocked does is prevent them from being > delivered for longer (until they are unblocked) - but they are always > pending from the instant they are generated, until the system has an > opportunity to resume execution of the process (or thread) next You don't seem to have considered the case where the thread that receives the signal is currently running, and is executing user code. Then there is nothing to delay the delivery - it can happen immediately after generation, if the signal is not being blocked; it only becomes pending if it is being blocked. Having said that, it would make sense to reword to avoid any subtle distinctions about exactly when a signal becomes pending. I will try to come up with something that merges parts of your suggestion with parts of my previous attempt. |
(0006327) geoffclare (manager) 2023-06-13 09:47 |
New suggested resolution ... After applying bug 1636, change: If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.to: If the argument set is not a null pointer, after pthread_sigmask() changes the currently blocked set of signals it shall determine whether there are any pending unblocked signals; if there are any, then at least one of those signals shall be delivered before the call to pthread_sigmask() returns. On page 1736 line 56316 section pthread_sigmask(), change APPLICATION USAGE from: None.to: Although pthread_sigmask() has to deliver at least one of any pending unblocked signals that exist after it has changed the currently blocked set of signals, there is no requirement that the delivered signal(s) include any that were unblocked by the change. If one or more signals that were already unblocked become pending (see [xref to 2.4.1]) during the period the pthread_setmask() call is executing, the signal(s) delivered before the call returns might include only those signals. |
Issue History | |||
Date Modified | Username | Field | Change |
2023-05-23 09:43 | geoffclare | New Issue | |
2023-05-23 09:43 | geoffclare | Name | => Geoff Clare |
2023-05-23 09:43 | geoffclare | Organization | => The Open Group |
2023-05-23 09:43 | geoffclare | Section | => pthread_sigmask() |
2023-05-23 09:43 | geoffclare | Page Number | => 1734 |
2023-05-23 09:43 | geoffclare | Line Number | => 56243 |
2023-05-23 09:43 | geoffclare | Interp Status | => --- |
2023-05-23 21:08 | kre | Note Added: 0006287 | |
2023-05-25 08:40 | geoffclare | Note Added: 0006288 | |
2023-05-25 18:26 | kre | Note Added: 0006292 | |
2023-05-25 18:32 | kre | Note Edited: 0006292 | |
2023-05-25 18:44 | kre | Note Added: 0006293 | |
2023-05-30 11:14 | geoffclare | Note Added: 0006294 | |
2023-05-31 19:50 | kre | Note Added: 0006296 | |
2023-06-06 15:50 | geoffclare | Note Added: 0006312 | |
2023-06-06 19:24 | kre | Note Added: 0006313 | |
2023-06-07 09:44 | geoffclare | Note Added: 0006314 | |
2023-06-09 11:24 | kre | Note Added: 0006316 | |
2023-06-12 08:55 | geoffclare | Note Added: 0006319 | |
2023-06-12 09:08 | geoffclare | Note Edited: 0006319 | |
2023-06-12 13:08 | kre | Note Added: 0006320 | |
2023-06-12 13:13 | kre | Note Edited: 0006320 | |
2023-06-12 14:38 | geoffclare | Note Edited: 0006319 | |
2023-06-12 22:18 | kre | Note Added: 0006325 | |
2023-06-12 22:21 | kre | Note Edited: 0006325 | |
2023-06-13 09:09 | dannyniu | Issue Monitored: dannyniu | |
2023-06-13 09:29 | geoffclare | Note Added: 0006326 | |
2023-06-13 09:47 | geoffclare | Note Added: 0006327 | |
2023-07-31 15:38 | Don Cragun | Final Accepted Text | => See Note: 0006327. |
2023-07-31 15:38 | Don Cragun | Status | New => Resolved |
2023-07-31 15:38 | Don Cragun | Resolution | Open => Accepted As Marked |
2023-07-31 15:39 | Don Cragun | Tag Attached: tc3-2008 | |
2023-08-17 11:01 | geoffclare | Status | Resolved => Applied |
2023-08-17 11:01 | geoffclare | Tag Attached: applied_after_i8d3 | |
2024-06-11 09:07 | agadmin | Status | Applied => Closed |
Mantis 1.1.6[^] Copyright © 2000 - 2008 Mantis Group |