Re: select() weirdness
- Subject: Re: select() weirdness
- From: Justin Walker <email@hidden>
- Date: Fri, 25 Feb 2005 18:06:11 -0800
On Feb 25, 2005, at 17:34, Steve Dekorte wrote:
On Feb 25, 2005, at 5:18 PM, Agent M wrote:
The funnel issue Quinn mentioned could be the cause, but that seems
unlikely for a 20-sec stall. Have you used a time counter within the
program or done profiling that confirms the issue? You should definitely
run Shark, as Quinn mentioned, to confirm that your program is blocking
on select and not on something else; you may simply have a starved
coroutine, if this is Io you're working on.
Also, count the number of calls you can make before the weird
blocking and see if it is consistent.
How many sockets are connecting simultaneously to this server?
Your issue does sound odd, so I wouldn't rule out an OS bug, but
double-check all the inputs to select; I usually forget that select
modifies most of its pointer arguments on return.
Also check for possible interfering signals: select could be
interrupted (EINTR), and you must handle that.
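The two select pitfalls above (arguments overwritten on return, and interruption by a signal) can be illustrated with a minimal sketch. The function name `wait_readable` and its millisecond timeout parameter are hypothetical, not taken from the original poster's code, and the descriptor is assumed to be below FD_SETSIZE.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on a real error (errno set).
 * Hypothetical helper for illustration only. */
int wait_readable(int fd, int timeout_ms)
{
    for (;;) {
        fd_set readfds;
        struct timeval tv;

        /* Rebuild both arguments on every pass: select() overwrites the
         * descriptor set, and some systems also modify the timeout. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int n = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n < 0)
            return -1;          /* genuine failure, errno is set */
        return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
    }
}
```

Note that this simple retry restarts the full timeout after each EINTR; code that needs a hard deadline would have to track the remaining time itself.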
Some further poking around revealed that select() was not the culprit
(though the fact that gdb always broke on select() made it suspicious).
The problem was that some socket connects take a mysteriously long
time or time out altogether. This occurs whether I'm hitting an Apache
server or a mini server I've written for testing.
The solution seems to be setting the Socket connect timeout lower and
just dealing with the failed connections. I'd really like to know why
a connect would ever fail in this situation though. Is > 6,000
connections per second too much to ask for?
Possibly. It depends on how much memory you have; how many ephemeral
ports you have for connections; and some random variables I've
forgotten.
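To make the ephemeral-port constraint concrete, here is a back-of-the-envelope sketch. The numbers are assumptions, not measurements from this thread: an ephemeral range of 49152 to 65535 and a TIME_WAIT of 30 seconds, with the client side closing first so that each finished connection ties up its local port. The actual values on a given system should be checked (on Darwin, via sysctl) before drawing conclusions. The function name is hypothetical.

```c
/* Upper bound on sustained new connections per second from one client
 * address to one server address and port, assuming every closed
 * connection holds its local port in TIME_WAIT for time_wait_secs.
 * Hypothetical helper; all inputs are assumptions supplied by the caller. */
double max_sustained_rate(unsigned first_port, unsigned last_port,
                          double time_wait_secs)
{
    unsigned ports = last_port - first_port + 1;   /* size of the range */
    return (double)ports / time_wait_secs;
}
```

Under those assumed values, `max_sustained_rate(49152, 65535, 30.0)` comes to roughly 546 connections per second, so a client pushing thousands of connects per second at a single server port could plausibly run out of local ports.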
To debug this, it might be of interest to know:
- is there a low-level error, or is "connect()" itself really hanging?
- what is happening on the wire when the stall occurs (tcpdump).
- what does "netstat" say when you are in this condition? If you have
  a lot of connections in close-wait or closed state, that might be
  the answer (although I'd be hard-pressed to explain the hang if
  that's the case).
If there is really a hang, I'd look elsewhere for an answer (e.g.,
off-host or in a library somewhere).
Regards,
Justin
--
Justin C. Walker, Curmudgeon-At-Large *
Institute for General Semantics | Men are from Earth.
| Women are from Earth.
| Deal with it.
*--------------------------------------*-------------------------------*