Bug #4861

libnsl: The timeout implementation using alarm()/longjmp() is dangerous

Added by Marcel Telka almost 6 years ago.

Status: In Progress
Priority: Normal
Assignee:
Category: lib - userland libraries
Start date: 2014-05-12
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

There are several places in libnsl where the following (or similar) construction is used:

if (setjmp(Sjbuf)) {
        ... /* Sometimes t_close() is called here */
        return (FAIL);
}
(void) signal(SIGALRM, alarmtr);
(void) alarm(...);

/* some syscall, like open(), or read(), or a TLI/XTI function call */

alarm(0);

Where alarmtr() basically calls longjmp(Sjbuf, 1).
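
For reference, the construction can be filled out into a compilable unit roughly as follows. This is only a sketch: the helper name timed_read(), the FAIL value, and the use of plain read() as the guarded call are assumptions made for illustration and are not taken from libnsl.

```c
#include <setjmp.h>
#include <signal.h>
#include <unistd.h>

#define FAIL (-1)       /* assumed failure value, for illustration only */

static jmp_buf Sjbuf;

/* SIGALRM handler: abandon the current call and jump back to setjmp(). */
static void
alarmtr(int sig)
{
        (void) sig;
        longjmp(Sjbuf, 1);
}

/* Hypothetical helper: read from fd, guarded by an alarm of `secs` seconds. */
static int
timed_read(int fd, char *buf, size_t len, unsigned int secs)
{
        ssize_t n;

        if (setjmp(Sjbuf)) {
                /* Reached from the signal handler via longjmp(). */
                return (FAIL);
        }
        (void) signal(SIGALRM, alarmtr);
        (void) alarm(secs);

        n = read(fd, buf, len);         /* the guarded call */

        (void) alarm(0);                /* cancel the pending alarm */
        return ((int)n);
}
```

Any cleanup placed in the setjmp() branch runs as a continuation of the signal handler, which is what the third point below is about.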

Such code has several problems. First, it does not work as intended. The apparent intent is to cancel the outstanding call (either a direct syscall, or a syscall made via a TLI/XTI function). The problem is that the signal handler (alarmtr) is called only after the outstanding syscall returns on its own. There is no way to interrupt the syscall and fire the alarm signal (and its signal handler) sooner. As a result, there is no guarantee on how much time is spent in the syscall before the actual cancellation happens.

Second, if a TLI/XTI function is interrupted in the middle by the alarm, we might leak some internal TLI/XTI structures (memory), or other resources. See bug #4850 for an example.

Third, the TLI/XTI functions are not declared Async-Signal-Safe (see t_close(3nsl)), so it is incorrect to call them from a signal handler. We effectively do that when we call t_close() after setjmp() returns non-zero. See above for an example of such a call.
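
One common way to bound a wait without signals is to use poll() with a timeout, so that all cleanup runs in normal (non-handler) context. The sketch below is not a fix proposed in this bug; the helper name read_with_timeout() and the FAIL value are hypothetical, and it covers only a plain descriptor read. TLI/XTI calls would need their own handling.

```c
#include <poll.h>
#include <unistd.h>

#define FAIL (-1)       /* assumed failure value, for illustration only */

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable, then read from it. No signal handler is involved.
 */
static int
read_with_timeout(int fd, char *buf, size_t len, int timeout_ms)
{
        struct pollfd pfd;
        int rc;

        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;

        rc = poll(&pfd, 1, timeout_ms);
        if (rc <= 0)
                return (FAIL);  /* 0: timed out; -1: poll() failed */

        return ((int)read(fd, buf, len));
}
```

On FAIL the caller still owns the descriptor and can release it with an ordinary call, outside of any signal handler.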


Related issues

Related to illumos gate - Bug #4850: File descriptor leak in tlicall() (In Progress, 2014-05-05)
