Hang after restarting an interrupted call to connect()?
- Subject: Hang after restarting an interrupted call to connect()?
- From: Malcolm Rowe <email@hidden>
- Date: Thu, 25 Aug 2005 22:43:50 +0100
[If this is not the correct list for this kind of message, many apologies.]
Hello list,
I'm trying to chase up a problem that was originally reported against
Subversion (http://svn.haxx.se/dev/archive-2005-08/1071.shtml). While the
original problem has now been worked around in Subversion itself, I'm trying
to understand if the generic problem is caused by the way Subversion was
working, or whether it's caused by a Darwin kernel bug.
[I'm using whatever ships with Mac OS X 10.4.2: Darwin 8.2.1, if I'm reading
it right.]
The situation is this: The client process sets up a signal handler for
SIGINT and then calls connect(). connect() waits for a connection [it will
eventually get a timeout if allowed to proceed], but the user becomes
bored and tries to cancel the attempt with ^C. SIGINT is delivered to the
signal handler, which sets a global variable to indicate that the process
should terminate and then calls signal(SIGINT, SIG_IGN) to ignore future
SIGINT signals.
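The handler described above can be sketched roughly as follows. The names `cancel_requested` and `install_cancel_handler` are hypothetical (the description says only that "a global variable" is set), and this is a minimal illustration of the pattern, not Subversion's actual code.

```c
#include <signal.h>

/* Hypothetical flag name: set by the handler, polled by the main program
 * to decide whether to terminate. sig_atomic_t is the type that may be
 * written safely from a signal handler. */
static volatile sig_atomic_t cancel_requested = 0;

/* On the first SIGINT, record the request and ignore any further SIGINTs. */
static void on_sigint(int signum)
{
    cancel_requested = 1;
    signal(signum, SIG_IGN);
}

/* Hypothetical helper: install the handler before the blocking call. */
static void install_cancel_handler(void)
{
    signal(SIGINT, on_sigint);
}
```

A caller would invoke install_cancel_handler() once before connect() and check cancel_requested after any call that fails with EINTR.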
Here's where it gets interesting. connect() returns EINTR, and APR (the
Apache Portable Runtime, which Subversion uses) automatically restarts it
by calling connect() again. Now, however, the connect() call hangs; no
further signals (even SIGKILL or SIGSEGV) will be delivered to the process
(as visible via ktrace); and, most interestingly, all further calls to
connect() from other processes hang as well!
Note that it's important that a signal handler is initially provided, so
that connect() will initially return EINTR: simply calling signal(SIGINT,
SIG_IGN) first doesn't cause the problem to occur.
An example program to demonstrate this problem is included below. It relies
on connections to port 3690 on svn.edgewall.com timing out, but any
non-responsive port should work.
Steps to replicate:
1. Run the program below.
2. Deliver a SIGINT by hitting ^C.
3. Witness the process become unkillable, and the entire network stack grind
to a halt.
4. Perform a hard power-off.
Anyone have any ideas?
Thanks,
Malcolm
#include <errno.h>
#include <signal.h>
#include <stdio.h>      /* perror() */
#include <strings.h>    /* bzero() */
#include <unistd.h>     /* close() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* On the first SIGINT, ignore all further SIGINTs. */
void sighandler(int signum)
{
    signal(signum, SIG_IGN);
}

int main(void)
{
    /* A real handler must be installed so that connect() returns EINTR. */
    signal(SIGINT, sighandler);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
    {
        perror("socket() failed");
        return 0;
    }

    struct sockaddr_in sin;
    bzero(&sin, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(3690); /* svn */
    sin.sin_addr.s_addr = htonl(1210360707); /* 72.36.163.131, svn.edgewall.com */

    /* Restart connect() whenever it is interrupted by a signal. */
    int rc;
    do {
        rc = connect(fd, (struct sockaddr*)&sin, sizeof(sin));
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        perror("connect() failed");

    close(fd);
    return 0;
}
Darwin-kernel mailing list (email@hidden)