0001731: pthread_sigmask() pending signal requirement time paradox

(0006287)
kre (reporter)
2023-05-23 21:08

I believe the proposed wording in the desired action adds an
unintended subtle difference to how this is intended to work.

Previously, setting aside the temporal distortion, which I agree
should be fixed, the intent was that if there are *any* unblocked
signals, one of them gets delivered.

The proposed wording requires that one of the signals that
became unblocked by the call be delivered.

This makes a difference if some other (unblocked) signal just
happened to arrive during the system call, which unblocks some
other signal, the system is not permitted to deliver the newly
arrived signal, but must deliver one of the ones that had been
blocked. I don't think that change is really intended.

I'd keep it simpler, something like

    Before the pthread_sigmask() call returns, if there are any
    pending unblocked signals, at least one of those shall be
    delivered.

(0006288)
geoffclare (manager)
2023-05-25 08:40

Yes, there is a subtle difference in the new wording, but I believe it better matches how people expect implementations to behave and I would be very surprised if any implementation exists that does not behave that way.

Your alternative suggestion would allow an implementation to get away with not delivering any of the pending signals that the mask change unblocks if a signal that is not blocked happens to be delivered before pthread_sigmask() returns and is generated a second time during execution of a signal catching function for that signal. Why should that be allowed?

(0006292)
kre (reporter)
2023-05-25 18:26
edited on: 2023-05-25 18:32

Re Note: 0006288

There are two important general comments relevant here, before getting
to the details of the specific issue:

First:

how people expect implementations to behave

is not, or should not be, relevant here. What matters is how
implementations do behave, not how someone would like them to.

And even more important:

Yes, there is a subtle difference in the new wording

If that was intentional, and not just a poor choice of replacement
text, then we have real problems.

Attempting to insert changes in the way that the specification is
written, via underhand mechanisms like that, is totally unacceptable.

If the real reason for this defect report, is to make a change in the
way that implementations are required to behave, then the description
of the problem should have been clear about that - not just hiding things
as if it were just some minor grammatical error that should be fixed, which
was how it was presented.

While the grammar of the current standard could be improved, to avoid the
issue mentioned in the description, there is no real need to do so from the
point of view of the standard itself. What is required is clear, only
someone looking at it from a peculiar language perspective would even notice
the problem.

My "alternative suggestion" is simply restating what the standard currently
says, in a way that avoids the problematic language. I'm sure there are
other ways that would work as well, if you don't like my suggested wording.
That is, which would work without changing the standard.

As to how implementations work - it has been a while since I got down and
dirty at the syscall interface, but the way all this used to work (and probably
still does, since I cannot imagine much reason for changing it) is that
signals cannot be delivered to an application while the kernel is currently
executing code on behalf of the application (ie: when it is running the code
of a system call). But signals can arrive during that period - either as
a result of the system call itself, or simply by chance (sent from some other
process, perhaps a child exiting causing SIGCHLD, or from another thread of
this process perhaps).

So the general method was, just as the system call is ending, and control
is about to be returned to the user, the kernel checks to see if there are
any pending unblocked signals - anything that would already have been
delivered, had we not been in a system call. At this point, the code
often barely knows (perhaps doesn't know at all) what system call is
ending, and certainly has no idea what signals that are pending might have
been generated or unmasked, by this system call. All it knows is that there
is a pending signal, and that signal can now be delivered, so it is made to
happen.

Do you have some evidence that there are systems that behave differently
than that, or do you just think there ought to be?

(0006293)
kre (reporter)
2023-05-25 18:44

I should also say, that testing what happens in this scenario is
very hard - it is not easy to arrange for some other signal to arrive
right at the precise time that it matters (not easy to prevent either).

What that means is that 99.9% (nb: that is not a measured statistic)
of the time, the only pending unblocked signals after this system call
will be ones that it unblocked (since it does not otherwise generate
signals). Some other signal, if it arrived just before the system
call would have already been delivered. If it arrives just after, then
one of the newly unblocked signals would have already been delivered.

Arranging for things to happen so that the difference in behaviour
being discussed is apparent is not at all easy. The only really sane
way to investigate this is to read the code for various systems, and
ask "what if?" several times.

(0006294)
geoffclare (manager)
2023-05-30 11:14

When I submitted this bug I had no expectation that it would be in any way controversial. I thought it was just another "everyone knows" type of bug, i.e. a class of bug that comes up from time to time whereby readers' prior knowledge of what a function does causes them not to notice that the wording in the standard does not correctly describe its behaviour, even (as in this case) if there is an error that's glaringly obvious once it has been pointed out. The time paradox in the wording here dates back to the sigprocmask() description in the original POSIX.1-1988 standard.

The wording I suggested in the desired action was, I thought, a good way to describe how "everyone knows" pthread_sigmask() behaves. The subtle difference in required behaviour that it would introduce had not occurred to me until kre pointed it out.

I agree with kre that we need to find out how implementations behave, and that it would be impractical to try and test it, so the best way to find out is to examine their source code. For historical perspective, since signal masks originated in BSD, we should probably also look at a version of BSD from before it started to diverge into different variants.

(0006296)
kre (reporter)
2023-05-31 19:50

Re Note: 0006294

  we should probably also look at a version of BSD from before it started to
  diverge into different variants.

That's easy, this is the implementation of the system call in 4.4_lite
(or something around that time - from the final CSRG SCCS files).

/*
* Manipulate signal mask.
* Note that we receive new mask, not pointer,
* and return old mask as return value;
* the library stub does the rest.
*/
int
sigprocmask(p, uap, retval)
        register struct proc *p;
        struct sigprocmask_args /* {
                syscallarg(int) how;
                syscallarg(sigset_t) mask;
        } */ *uap;
        register_t *retval;
{
        int error = 0;

        *retval = p->p_sigmask;
        (void) splhigh();

        switch (SCARG(uap, how)) {
        case SIG_BLOCK:
                p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask;
                break;

        case SIG_UNBLOCK:
                p->p_sigmask &= ~SCARG(uap, mask);
                break;

        case SIG_SETMASK:
                p->p_sigmask = SCARG(uap, mask) &~ sigcantmask;
                break;

        default:
                error = EINVAL;
                break;
        }
        (void) spl0();
        return (error);
}

That's it, There are also compat sys calls, for the earlier
implementation where there were separate sys calls for the functions,
rather than one with an arg saying what to do (as above).

int
compat_43_sigblock(p, uap, retval)
        register struct proc *p;
        struct compat_43_sigblock_args /* {
                syscallarg(int) mask;
        } */ *uap;
        register_t *retval;
{

        (void) splhigh();
        *retval = p->p_sigmask;
        p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask;
        (void) spl0();
        return (0);
}

int
compat_43_sigsetmask(p, uap, retval)
        struct proc *p;
        struct compat_43_sigsetmask_args /* {
                syscallarg(int) mask;
        } */ *uap;
        register_t *retval;
{

        (void) splhigh();
        *retval = p->p_sigmask;
        p->p_sigmask = SCARG(uap, mask) &~ sigcantmask;
        (void) spl0();
        return (0);
}

The latter was used to accomplish unblock (generally by setting the
mask to what was returned by sigblock()).

These are called from machine depenendent code (since the way sys calls
are encoded and their args passed, varies from architecture to architecture).

Taking the vax as the original, and oldest, sys calls ar handled (big surprise)
by the syscall() function ... I'm not going to include all of the code for that,
it is fairly long, and most of what it is doing isn't relevant, but ...

It starts with code that checks the syscall() call orjginated in user mode
(if already in the kernel, it panics), then it works out what the PC was
(the length of the instr) which varies depending upon the sys call number
(how many bytes it takes to represent that) - deals with syscall(0) (the
indirect sys call) and then checks the sys call number is in range. Once
that is known, it can copy in the expected args (from the sys call table,
which says how many there should be) after which (ignoring some ktrace noise)
it does:

        rval[0] = 0;
        rval[1] = locr0[R1];
        error = (*callp->sy_call)(u.u_procp, &args, rval);
        if (error == ERESTART)
                pc = opc;
        else if (error != EJUSTRETURN) {
                if (error) {
                        locr0[R0] = error;
                        locr0[PS] |= PSL_C; /* carry bit */
                } else {
                        locr0[R0] = rval[0];
                        locr0[R1] = rval[1];
                        locr0[PS] &= ~PSL_C;
                }
        }

rval is where the results get passed back, and is init'd t what will be
returned in the normal case (if a syscall has no particular result to
return, and succeeds). The indirect function call calls one of the
functions above in the cases of interest. If the system call indicates
it should be restarted, the pc is pucked backwards (opc is the calculated
pc of the system call instruction from earlier). Otherwise if error
isn' EJUSTRETURN (which a sys call function can return to indicate it has
already set up the result) then if there was an error, R0 (the result) gets
set to the error number (which userland code will store in errno) and the
carry bit is set to indicate an error occurred, otherwise the results are
moved back to the user's register save area, and the carry bit is cleared.

That's followed by the most relevant part for this discussion:

        /*
         * Reinitialize proc pointer `p' as it may be different
         * if this is a child returning from fork syscall.
         */
        p = u.u_procp;
        if (i = CURSIG(p))
                postsig(i);

(no, not the update of 'p') - CURSIG is a macro:

/*
* Determine signal that should be delivered to process p, the current
* process, 0 if none. If there is a pending stop signal with default
* action, the process stops in issignal().
*/
#define CURSIG(p) \
        (((p)->p_siglist == 0 || \
            ((p)->p_flag & P_TRACED) == 0 && \
            ((p)->p_siglist & ~(p)->p_sigmask) == 0) ? \
            0 : issignal(p))

It could just be issignal(p) but that would mean calling the function
(function calls were relatively expensive on the vax) in cases where it is
obvious that all the function will do is return 0 ... so the macro extracts
the most likely tests to cause that to happen, and has them evaluated in
line, so we can avoid calling issignal() in a relativelu fast path piece of
code (executed for every sys call made) if it obviously won't be needed.

In the cases we're interested in, that won't work, and we have to call
issignal() anyway. If the return from all of this is != 0, it is a
signal number to deliver to the application (and postsig() does that).

I'll get to issignal() soon. The rest of syscall() is just finishing up,
first we check whether this process should give up the CPU (there was only
one of them) or not - if we do we repeat the

                if (i = CURSIG(p))
                        postsig(i);

after the context switch was made - when we get back to here in that case,
there might have been a signal delivered while we weren't on the CPU, and
we want that to be delivered as well).

Then there's some profiling related code (adding the time spent in the sys call
to the time spent at the sys call instruction for the purposes of measuring
where the application is spending its time) and more ktrace stuff.

Then we're done, and the application runs again.

postsig() does the machine independent part of delivering a signal, handling
it if the action is just to kill the process (if it was to be ignored, it
would have been dropped when the signal was first generated) plus dealing
with sigaction and such - to block signals while the handler is running, if
needed, then calls the machine dependent sendsig() if a user signal handler
needs to be called.

sendsig() just establishes the application stack state, and pc, so it looks
as if a call to the signal handler was just made, and arranges for when that
returns (if it does) the correct actions get taken. None of that is relevant
right now.

All that is left is issignal() which does the work of selecting which signal
to deliver.

/*
* If the current process has received a signal (should be caught or cause
* termination, should interrupt current syscall), return the signal number.
* Stop signals with default action are processed immediately, then cleared;
* they aren't returned. This is checked after each entry to the system for
* a syscall or trap (though this can usually be done without calling issignal
* by checking the pending signal masks in the CURSIG macro.) The normal call
* sequence is
*
* while (signum = CURSIG(curproc))
* postsig(signum);
*/
int
issignal(p)
        register struct proc *p;
{
        register int signum, mask, prop;

        for (;;) {
                mask = p->p_siglist & ~p->p_sigmask;
                if (p->p_flag & P_PPWAIT)
                        mask &= ~stopsigmask;
                if (mask == 0) /* no signal to send */
                        return (0);
                signum = ffs((long)mask);
                mask = sigmask(signum);
                prop = sigprop[signum];
                /*
                 * We should see pending but ignored signals
                 * only if P_TRACED was on when they were posted.
                 */
                if (mask & p->p_sigignore && (p->p_flag & P_TRACED) == 0) {
                        p->p_siglist &= ~mask;
                        continue;
                }
                if (p->p_flag & P_TRACED && (p->p_flag & P_PPWAIT) == 0) {
                        /*
                         * If traced, always stop, and stay
                         * stopped until released by the parent.
                         *
                         * Note that we must clear the pending signal
                         * before we call trace_req since that routine
                         * might cause a fault, calling tsleep and
                         * leading us back here again with the same signal.
                         * Then we would be deadlocked because the tracer
                         * would still be blocked on the ipc struct from
                         * the initial request.
                         */
                        p->p_xstat = signum;
                        p->p_siglist &= ~mask;
                        psignal(p->p_pptr, SIGCHLD);
                        do {
                                stop(p);
                                mi_switch();
                        } while (!trace_req(p) && p->p_flag & P_TRACED);

                        /*
                         * If parent wants us to take the signal,
                         * then it will leave it in p->p_xstat;
                         * otherwise we just look for signals again.
                         */
                        signum = p->p_xstat;
                        if (signum == 0)
                                continue;

                        /*
                         * Put the new signal into p_siglist. If the
                         * signal is being masked, look for other signals.
                         */
                        mask = sigmask(signum);
                        p->p_siglist |= mask;
                        if (p->p_sigmask & mask)
                                continue;

                        /*
                         * If the traced bit got turned off, go back up
                         * to the top to rescan signals. This ensures
                         * that p_sig* and ps_sigact are consistent.
                         */
                        if ((p->p_flag & P_TRACED) == 0)
                                continue;
                }

                /*
                 * Decide whether the signal should be returned.
                 * Return the signal's number, or fall through
                 * to clear it from the pending mask.
                 */
                switch ((long)p->p_sigacts->ps_sigact[signum]) {

                case (long)SIG_DFL:
                        /*
                         * Don't take default actions on system processes.
                         */
                        if (p->p_pid <= 1) {
#ifdef DIAGNOSTIC
                                /*
                                 * Are you sure you want to ignore SIGSEGV
                                 * in init? XXX
                                 */
                                printf("Process (pid %d) got signal %d\n",
                                        p->p_pid, signum);
#endif
                                break; /* == ignore */
                        }
                        /*
                         * If there is a pending stop signal to process
                         * with default action, stop here,
                         * then clear the signal. However,
                         * if process is member of an orphaned
                         * process group, ignore tty stop signals.
                         */
                        if (prop & SA_STOP) {
                                if (p->p_flag & P_TRACED ||
                                    (p->p_pgrp->pg_jobc == 0 &&
                                    prop & SA_TTYSTOP))
                                        break; /* == ignore */
                                p->p_xstat = signum;
                                stop(p);
                                if ((p->p_pptr->p_flag & P_NOCLDSTOP) == 0)
                                        psignal(p->p_pptr, SIGCHLD);
                                mi_switch();
                                break;
                        } else if (prop & SA_IGNORE) {
                                /*
                                 * Except for SIGCONT, shouldn't get here.
                                 * Default action is to ignore; drop it.
                                 */
                                break; /* == ignore */
                        } else
                                return (signum);
                        /*NOTREACHED*/

                case (long)SIG_IGN:
                        /*
                         * Masking above should prevent us ever trying
                         * to take action on an ignored signal other
                         * than SIGCONT, unless process is traced.
                         */
                        if ((prop & SA_CONT) == 0 &&
                            (p->p_flag & P_TRACED) == 0)
                                printf("issignal\n");
                        break; /* == ignore */

                default:
                        /*
                         * This signal has an action, let
                         * postsig() process it.
                         */
                        return (signum);
                }
                p->p_siglist &= ~mask; /* take the signal! */
        }

That is it. In that the important lines are

                signum = ffs((long)mask);

where "mask" is the list of signals that are pending, with any that are
blocked removed. ffs() is "find first set" - which finds the lowest bit
set, and returns its number (so if SIGHUP (#1) is set, and not masked, that
one comes first).

P_TRACED related to ptrace() - ie: running under a debugger, we can forget
that case here (code that requires P_TRACED to be set should just be skipped).

The code then (if not ptrace'd) simply ignores the signal if it should be
ignored, and goes and finds another (after removing this one from the list
of pending signals). That would happen (for example) if a signal was
pending, but the current system call had just instructed it to be ignored.
If it had already been ignored, it wouldn't have been pending (ptraced'd
processed excepted).

The switch() looks at the user's intended disposition of the signal.

Here, the only one we care about is "default" (ie: not SIG_DFL or SIG_IGN)
where we just do

                        return (signum);

and we're done. The signal returned is the lowest numbered pending one.

It is completely irrelevant whether this is one of the ones that was just
unblocked (nothing in issignal() - which makes the choice) has any idea
what those might have been, and the system calls don't save that anywhere,
other than by the effect that they have on p_sigmask (the list of blocked
signals - manipulated by the system calls, then used here).

When I said in Note: 0006292
    As to how implementations work - it has been a while since I got down
    and dirty at the syscall interface

this vintage code is what I was referring to. I already knew (without
looking) that it worked like this.

I would find it surprising if any implementation was significantly different.
The basic strategy of signal delivery on syscall exit, except without masked
signals, dates back to the early Bell Labs releases (at least 5th edition,
the earliest I ever worked with).

The system call function for sigprocmask in NetBSD is:

/*
* Manipulate signal mask.
* Note that we receive new mask, not pointer,
* and return old mask as return value;
* the library stub does the rest.
*/
int
sigprocmask(p, uap, retval)
        register struct proc *p;
        struct sigprocmask_args /* {
                syscallarg(int) how;
                syscallarg(sigset_t) mask;
        } */ *uap;
        register_t *retval;
{
        int error = 0;

        *retval = p->p_sigmask;
        (void) splhigh();

        switch (SCARG(uap, how)) {
        case SIG_BLOCK:
                p->p_sigmask |= SCARG(uap, mask) &~ sigcantmask;
                break;

        case SIG_UNBLOCK:
                p->p_sigmask &= ~SCARG(uap, mask);
                break;

        case SIG_SETMASK:
                p->p_sigmask = SCARG(uap, mask) &~ sigcantmask;
                break;

        default:
                error = EINVAL;
                break;
        }
        (void) spl0();
        return (error);
}

which is rather similar to what was done in the CSRG BSD code.
Nothing at all there about arranging to deliver one of the unblocked
signals in the SIG_UNBLOCK case - just remove bits from sigmask, then
let issignal() do its thing.

The same general thing also happens after traps - but those can't be altering
the signal mask, but are also far more likely to have generated a signal to
deliver).

Very similar code exists in NetBSD today, issignal() is still there, though
ffs() has been renamed to firstsig() (the code now needs to be able to deal
with more signals than fit as bits in an int). Otherwise it all looks quite
similar.

(0006312)
geoffclare (manager)
2023-06-06 15:50

Re Note: 0006296 I poked around in the Illumos code on github and I think I have found the equivalents of most of the code you examined for BSD 4.4_lite. There is a major difference that affects the behaviour we are discussing. I won't post the code here as I know some people make a point not to look at code with certain licences, but will give links and try to describe what I'm seeing.

The sigprocmask() code is in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/syscall/sigprocmask.c [^]

It calls lwp_sigmask() to do the signal mask change. The code in there is very similar to the BSD sigprocmask() code, but in the switch it has a difference in behaviour between the cases. For SIG_UNBLOCK and SIG_SETMASK it has some code which sets the t_sig_check flag for the thread if a call to sigcheck() returns true. It does not do this for SIG_BLOCK. The code for sigcheck() is in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/os/sig.c [^] but everything you need to know about it is in the comment before the function definition: "Return non-zero if curthread->t_sig_check should be set to 1, that is, if there are any signals the thread might take on return from the kernel."

The t_sig_check flag is used in the post_syscall() function in https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/sparc/os/syscall.c [^]

The code there only checks for pending signals if either t_astflag or t_sig_check is set for the thread, and has this comment: "All code that sets signals and makes ISSIG_PENDING evaluate true must set t_sig_check afterwards."

Since t_sig_check isn't set by sigprocmask() when called with SIG_BLOCK, it looks like Illumos might not conform to the requirement you are proposing.

(0006313)
kre (reporter)
2023-06-06 19:24

Re Note: 0006312

First, I am not proposing any requirement, just to leave the requirement
in this function the same as it has always been, your change was to add
a new requirement (that the signal delivered be one that was just unblocked,
rather than simply any pending deliverable signal). If my wording seemed
to be adding something new, then use some other wording which does not.
I certainly did not intend to alter anything, except to avoid the before/after
issue.

I see nothing in the description of the Illuminos code that you included
(I am one of those "some people" though I have no idea what kind of licence
Illuminos has on its code) which is different in any material way from the
BSD code fragments I posted (BSD code these days is licensed mostly with a
"don't sue anyone" type licence, if it matters to anyone). This doesn't mean
there isn't one - just that your note did not mention it.

The t_sig_check field looks like a different (and perhaps better) optimisation
than what the CURSIG() macro in BSD provides; as you describe it, it merely
encompasses the idea that there cannot be a signal pending unless someone has
delivered one (or unblocked one) - and hence it is pointless to test whether
one might now be pending (whereas the BSD code tests every time, but tries to
make that test as cheap as possible). [ While it seems "obvious" it is
a better way, it isn't necessarily - setting the field needs to be protected
by a lock to guard against it simultaneously being cleared - that might be
more costly than is warranted - or might be zero if the relevant code is
already running holding a suitable lock for the purpose. And that is an
extra field which makes the structure bigger, which also has costs - on fork,
there is more to copy, for example, and more memory is consumed. So, this
way might be improvement, or might not be, much benchmarking would be required
for a definitive answer - and that might vary from system to system. ]

t_sig_check doesn't need to be set in the SIG_BLOCK case, as that cannot be
allowing any signals that were pending to be delivered, and it is not sending
a signal of its own. I would assume that all code which sends a signal,
in addition to that code which unblocks pending signals, sets t_sig_check.
Check their code for the kill() system call and verify that also sets that
field (on the process/thread being sent the signal of course, not the one
sending it, unless they are the same), and if you feel inclined, the code
that sends SIGCHLD when a child exits, which should do the same (if the
signal is actually sent. or SIGALRM if a timer expires).

Traps which send signals should do the same - though one of those should not
normally happen to a process running inside a system call.

Nothing you described has any bearing upon what signal gets delivered to the
process when the sigprocmask() system all is done - whether it is one that
was just unblocked, or another that happened to arrive while that system call
was running. You would need to look deeper into what the post_syscall()
function you mention does in that case that t_sig_check is set, to discover
if it somehow makes a distinction between a signal that has been pending, and
is now unblocked, and one which was never blocked, but just arrived while the
system call was running (a child exiting that was running on a different CPU
would, I suspect, be the most likely way that might happen - but since
this is a very fast system call (only likely to block if it needs to wait on
some internal data structure lock, or if is pre-empted) the chances of that
happening are very small.

Since you (seem to) want to change the requirement from what it has been,
I'd expect it is incumbent upon you to show that at least most systems behave
the way that you believe they should behave. I actually doubt that there
are any - but that's just based upon the way the code was developed, and
intuition, not investigation.

The requirement that if a pending signal was unblocked, that it be delivered
before the system call returns (which in reality means before any other
application code runs after the system call returns - though it will be
before the library function that made the system call returns) doesn't
really need to be there at all - if there is a pending signal when any
system call completes, that signal (or one of them if there is more than one)
is taken before the application runs any more code (for all system calls).
I suspect that the requirement is there just to make it clear that unblocking
a pending signal is equivalent to sending that signal, and implementations
aren't permitted to only deliver signals if the system call being run
actually generated one, which is probably reasonable (someone implementing
an OS from the standard, without any experience, could make that mistake).

I'd suggest simply fixing the time paradox, and avoid attempting to "fix"
the wording otherwise than that, as what is there now, is almost certainly
exactly what it should say (things occurring after things they precede
excepted, of course). Alternatively, we could just leave the time paradox
there as a "quaint oddity" as it doesn't really affect anything (but no, I
am not suggesting doing that, just that that would be better than the
proposed change).

While (re-)considering the change you are proposing, consider this case.

Let's assume that SIGINT SIGQUIT and SIGTSTP are blocked, and there is
a pending SIGINT waiting to be delivered, but neither of the others.

The application calls pthread_sigmask() to unblock those 3 signals. In
the normal course of events, the SIGINT would be delivered to the application
as soon as it starts to run again - before the pthread_sigmask() call returns,
and that is what you are trying to say is required. But let's assume that
before the application gets put back on a CPU to run (perhaps its priority
has degraded, and higher priority processes are now running on all CPUs that
are available) the user types the terminal's stop character (usually ^Z)
which the terminal driver turns into a progress group SIGTSTP to the process
group of the terminal - one member of which, we will assume, is our application.

Are you contending that the system may not deliver that SIGTSTP (instead of
the SIGINT) if it so desires? Because that is what the wording you proposed
would suggest. The SIGTSTP signal was not a pending signal which became
unblocked by the pthread_sigmask() call, it had not occurred yet when the
unblocking happened, hence it is not the case that "the change made to the
currently blocked set of signals causes any pending signals to become
unblocked," for that signal, but it was for SIGINT, so according to your
wording, only the SIGINT can be delivered, and not the SIGTSTP.

The previous (current) wording simply says that if there are any unblocked
signals when the call returns (which there are now, 2 of them, SIGINT and
SIGTSTP) then one of them must be delivered immediately - it does not say
which, and nor should it.

(0006314)
geoffclare (manager)
2023-06-07 09:44

Re Note: 0006313 I should have waited before posting my previous comment, as I woke up in the night thinking much the same as what you have pointed out. The t_sig_check flag must be getting set somehow when a signal is generated during any one of the many system calls that don't change the mask, and that mechanism would take care of the SIG_BLOCK case for pthread_sigmask/sigprocmask. You suggested looking at the kill() system call, but I couldn't find that (there is no kill.c in the same directory as sigprocmask.c). However, there are various places in the sig.c file (that contains sigcheck() - see link in my previous note) which could be doing it. E.g. eat_signal() sets it, and that is called from sigtoproc() which has the comment "Post a signal. If a non-null thread pointer is passed, then post the signal to the thread/lwp, otherwise post the signal to the process."

Anyway, you have convinced me that my alternative wording suggestion in the desired action is wrong and the standard should stick with saying "any pending unblocked signals". Unfortunately, your suggested wording has a different time-related problem:

    Before the pthread_sigmask() call returns, if there are any
    pending unblocked signals, at least one of those shall be
    delivered.

Taking your example of there being a SIGINT and SIGTSTP that both qualify as a "pending unblocked signal", suppose pthread_sigmask() identifies those two and chooses to deliver the SIGTSTP before returning. Let's examine the new situation immediately after this delivery:

Is "before the pthread_sigmask() call returns" true? Yes.
Is "there are any pending unblocked signals" true? Yes (SIGINT is).

Your wording now requires that the SIGINT is delivered before pthread_sigmask() returns. Extrapolating, it effectively would require that all pending unblocked signals are delivered before pthread_sigmask() returns.

I think finding some wording that doesn't have a timing issue is not going to be simple.

(0006316)
kre (reporter)
2023-06-09 11:24

Re Note: 0006314

First, I don't think I've ever had a problem with the suggestions I have
made for wording being improved, in fact, I usually expect that, as
wordsmithing isn't really my thing - and certainly if anything I have
suggested adds some unwanted requirement, that should certainly be removed.

In this case, I don't see much real difference between what is in the standard
now, and my suggestion (except the time paradox is gone), but as we're trying
to improve that language, by all means, improve it.

However wrt:

Extrapolating, it effectively would require that all pending unblocked
signals are delivered before pthread_sigmask() returns.

That's actually what happens, in most systems anyway, though the
mechanism (and precise details of how) may tend to vary. In general
an application should never be running user code with a pending unblocked
signal. Any signals that arrive should be processed (either cause the process
to exit, or call an application signal handler (or do nothing of course))
as soon as possible. That means now (when the signal is generated) for
signals that are not blocked, or as soon as the signal is unblocked for
those which are.

In the case of the example, when there are two pending signals when a
system call returns, typically implementations behave in one of two ways.
First they pick a signal that is pending to deliver (this is usually done
in a deterministic way, but need not be) and set up the application state
to reflect the first instruction of the user's handler routine is the next
to execute, and the return address is that of the current location in the
pthread_setmask() routine (immediately after the system call invoking
instruction, or however the architecture makes that happen). Part of that
process is applying the mask from the struct sigaction for the signal to
the current signal mask - once that's done there might be no more pending
unblocked signals, control returns to the application, and the handler is
executed. When it completes, it notifies the system it has done so, any
signals that were blocked by its sigaction become unblocked, and if there
now are any pending unblocked signals, the whole process repeats, with the
next pending signal (which at this stage could even be the same one for
which the handler was just invoked - which can happen if the handler
changes the disposition of the signal (usually) and then raises the signal
again).

In the case that the handler's mask did not block all the signals, and
another signal remained pending and unblocked, then before resuming user
mode, that signal is usually taken - which will result in things appearing
to the application, if it were to look, as if the handler for the first
signal invoked was interrupted immediately by the second signal - the handler
for the second signal delivered runs first, then when it completes, the
handler for the first signal is still waiting to executed. That might
look like a strange thing to have happen, but is really no different than
if the handler for the first signal selected had just started running in
application mode (perhaps executed just a single machine instruction) and
then the second signal arrives. Since that second signal is not blocked
(that is an assumption being made here) its handler runs immediately.

The effect of all of this is that yes, if there are multiple pending
unblocked signals, the handlers for all of them will have been run before
the pthread_sigmask() function which unblocked them (or some of them)
returns.

However, I don't think this function needs to require that, and it is
possible that there are systems which behave differently. I assume there
was some reason that the current text says "at least one of" rather than
"all" (at least one doesn't preclude all, so is safe for both variations).

To the unimportant parts of the note:

I wouldn't expect to find a kill.c file in the kernel sources - the kill
sys call is too small to warrant a file of its own - the code for it will
be buried elsewhere, though I can't guess where in Illuminos (its name might
also be something unexpected - that is, unexpected until you understand
why it is called that - that happens sometimes when several semi-related
user level functions all channel through one kernel interface - the libc
code arranges for the correct thing to happen when a generic routine is
called). But having it t_sig_check set by a generic signal delivery
function (like that eat_signal() that you found) is how I would do it, though
I doubt I would call the function by that name (but who knows).

There is however nothing that needs taking care of wrt SIG_BLOCK and
t_sig_check - the latter gets set when a pending signal is posted.
SIG_BLOCK never causes that to happen, so doesn't need to do anything.

And last, just in case it ever seems to be an issue - there is also never
a problem (or shouldn't be) if t_sig_check is set, and there is no pending
unblocked signal - as its name implies "check" it informs the syscall (and
trap, if they're different) exit routine(s) to look for a pending signal.
The only effect if it is set, when it "shouldn't be" is for the code to do
a little extra work determining that - if nothing is found, it should act
just the same, if fractionally slower, than if t_sig_check had not been set.

(0006319)
geoffclare (manager)
2023-06-12 08:55
edited on: 2023-06-12 14:38

Here's an attempt at wording that uses "any pending unblocked signals" and has no timing issues. I have kept the "set is not a null pointer" condition from the desired action in order to be able to refer to the signal mask change as a timing point. A side-effect of this is that if pthread_sigmask() is used just to obtain the current mask, there is no requirement on delivery of pending unblocked signals. This seems okay to me as an implementation might be able to satisfy an enquiry for the current mask without making a system call.

After applying bug 1636, change:

If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

to:

If the argument set is not a null pointer, after pthread_sigmask() changes the currently blocked set of signals it shall determine whether there are any pending unblocked signals; if there are any, then at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

On page 1736 line 56316 section pthread_sigmask(), change APPLICATION USAGE from:

None.

to:

Although pthread_sigmask() has to deliver at least one of any pending unblocked signals that exist after it has changed the currently blocked set of signals, there is no requirement that the delivered signal(s) include any that were unblocked by the change. In particular, when a signal is generated it can become pending for reasons other than being blocked (see [xref to 2.4.1]) and therefore the delivered signal(s) could be ones that were already unblocked but became pending during the pthread_sigmask() call.

(0006320)
kre (reporter)
2023-06-12 13:08
edited on: 2023-06-12 13:13

wrt Note: 0006319

The substantive change looks fine to me - I have no problem with the
"If the argument set is not a null pointer" qualification, as in that
case (whether or not this function is implemented as a system call in
that case) no pending signals can have been unblocked - if anything unblocked
had been posted earlier, it would have already been delivered. Anything that
arrives just after the call will be delivered when it arrives. Anything
that arrives at the same time as the call is being processed is just in a
race to see which of those two cases it is considered equivalent to.
Beyond that, I'd have no objection if this function were to have this whole
paragraph deleted, and not say anything about signal delivery - though it
is probably better that it remain. [Note: we do not say in write() that
if there are any pending unblocked signals when it is about to return, that
they be delivered before write returns - but if the write caused a SIGPIPE
that is what happens. It just doesn't need explicit mention there. It isn't
really required here either - just this is a somewhat odd case, where it
probably helps.]

However, assuming that we really need that APPLICATION USAGE text at all,
which I doubt, it needs a wording change.

    In particular, signals can become pending for reasons other than
    being blocked

makes no sense, being blocked never makes a signal become pending.
Being unblocked is what does that... I know what you intended to
say (or at least I can guess), which would have been fine, but those
words are not it. I suspect something more like

    In particular, signals may have become pending for reasons other
    than having previously been posted, but blocked, and unblocked
    by this call. The delivered signal(s) could be ones that just
    became pending during the pthread_sigmask() call.

(something like that, perhaps a little briefer). Put in the xref
wherever it seems to fit best.

(0006325)
kre (reporter)
2023-06-12 22:18
edited on: 2023-06-12 22:21

wrt the updated Note: 0006319

I'd simply delete the whole sentence starting "In particular..." and replace
it with something more like

    It is possible that an unrelated signal (or signals) become pending
    [xref] during the period the pthread_setmask() call is executing.
    One (or more) of those signals may be delivered to the application
    instead of, or in addition to, any previously blocked pending signals
    that were unblocked by the pthread_setmask() call.

Or something along those lines, but ideally briefer.

The new wording "when a signal is generated it can become pending for reasons
other than being blocked" is no better than the previous attempt, all signals
become pending when generated (for a short while at least, sometimes very
short) all being blocked does is prevent them from being delivered for longer
(until they are unblocked) - but they are always pending from the instant
they are generated, until the system has an opportunity to resume execution of
the process (or thread) next - in the simplest case (a system call results in
a signal being posted to the invoking process) that is at least for whatever
time remains during the processing/cleanup of the system call in question.
In other cases (a signal delivered to a process that is swapped out) the
delay before it can be resumed can be considerable - that could even happen
during a pthread_sigmask() call, if the scheduler decides to run a higher
priority process instead of the caller, which is then blocked waiting upon
a suitable CPU, and could get swapped out while it is waiting. Unlikely
perhaps on modern systems, but on some ancient systems, that kind of thing
happened all the time, and is still possible.

(0006326)
geoffclare (manager)
2023-06-13 09:29

Re Note: 0006325

> all signals become pending when generated (for a short while at least,
> sometimes very short) all being blocked does is prevent them from being
> delivered for longer (until they are unblocked) - but they are always
> pending from the instant they are generated, until the system has an
> opportunity to resume execution of the process (or thread) next

You don't seem to have considered the case where the thread that receives the signal is currently running, and is executing user code. Then there is nothing to delay the delivery - it can happen immediately after generation, if the signal is not being blocked; it only becomes pending if it is being blocked.

Having said that, it would make sense to reword to avoid any subtle distinctions about exactly when a signal becomes pending. I will try to come up with something that merges parts of your suggestion with parts of my previous attempt.

(0006327)
geoffclare (manager)
2023-06-13 09:47

New suggested resolution ...

After applying bug 1636, change:

If there are any pending unblocked signals after the call to pthread_sigmask(), at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

to:

If the argument set is not a null pointer, after pthread_sigmask() changes the currently blocked set of signals it shall determine whether there are any pending unblocked signals; if there are any, then at least one of those signals shall be delivered before the call to pthread_sigmask() returns.

On page 1736 line 56316 section pthread_sigmask(), change APPLICATION USAGE from:

None.

to:

Although pthread_sigmask() has to deliver at least one of any pending unblocked signals that exist after it has changed the currently blocked set of signals, there is no requirement that the delivered signal(s) include any that were unblocked by the change. If one or more signals that were already unblocked become pending (see [xref to 2.4.1]) during the period the pthread_setmask() call is executing, the signal(s) delivered before the call returns might include only those signals.

Issue History
Date Modified	Username	Field	Change
2023-05-23 09:43	geoffclare	New Issue
2023-05-23 09:43	geoffclare	Name	=> Geoff Clare
2023-05-23 09:43	geoffclare	Organization	=> The Open Group
2023-05-23 09:43	geoffclare	Section	=> pthread_sigmask()
2023-05-23 09:43	geoffclare	Page Number	=> 1734
2023-05-23 09:43	geoffclare	Line Number	=> 56243
2023-05-23 09:43	geoffclare	Interp Status	=> ---
2023-05-23 21:08	kre	Note Added: 0006287
2023-05-25 08:40	geoffclare	Note Added: 0006288
2023-05-25 18:26	kre	Note Added: 0006292
2023-05-25 18:32	kre	Note Edited: 0006292
2023-05-25 18:44	kre	Note Added: 0006293
2023-05-30 11:14	geoffclare	Note Added: 0006294
2023-05-31 19:50	kre	Note Added: 0006296
2023-06-06 15:50	geoffclare	Note Added: 0006312
2023-06-06 19:24	kre	Note Added: 0006313
2023-06-07 09:44	geoffclare	Note Added: 0006314
2023-06-09 11:24	kre	Note Added: 0006316
2023-06-12 08:55	geoffclare	Note Added: 0006319
2023-06-12 09:08	geoffclare	Note Edited: 0006319
2023-06-12 13:08	kre	Note Added: 0006320
2023-06-12 13:13	kre	Note Edited: 0006320
2023-06-12 14:38	geoffclare	Note Edited: 0006319
2023-06-12 22:18	kre	Note Added: 0006325
2023-06-12 22:21	kre	Note Edited: 0006325
2023-06-13 09:09	dannyniu	Issue Monitored: dannyniu
2023-06-13 09:29	geoffclare	Note Added: 0006326
2023-06-13 09:47	geoffclare	Note Added: 0006327
2023-07-31 15:38	Don Cragun	Final Accepted Text	=> See Note: 0006327.
2023-07-31 15:38	Don Cragun	Status	New => Resolved
2023-07-31 15:38	Don Cragun	Resolution	Open => Accepted As Marked
2023-07-31 15:39	Don Cragun	Tag Attached: tc3-2008
2023-08-17 11:01	geoffclare	Status	Resolved => Applied
2023-08-17 11:01	geoffclare	Tag Attached: applied_after_i8d3

Relationships

Aardvark Mark IV