On Feb 25, 2005, at 17:34, Steve Dekorte wrote:

> On Feb 25, 2005, at 5:18 PM, Agent M wrote:
>
>> The funnel issue mentioned by Quinn could be the cause, but that is unlikely for 20 secs. Have you used a time counter within the program or done profiling which confirms the issue? You should definitely run Shark as Quinn mentioned to confirm that your program is blocking on select and not on something else; you may simply have a starved coroutine, if this is io you're working on. Also, count the number of calls you can make before the weird blocking and see if it is consistent.
>>
>> Your issue does sound odd, so I wouldn't rule out an OS bug, but double-check all the inputs into select. I usually forget that select changes most of its pointer arguments on return. Also check for possible interfering signals: select could be interrupted (EINTR) and you must catch that.
>
> Some further poking around revealed that select() was not the culprit (though the fact that gdb always broke on select() made it suspicious). The problem was that some socket connects take a mysteriously long time or time out altogether. This occurs whether I'm hitting an apache server or a mini server I've written for testing. The solution seems to be setting the Socket connect timeout lower and just dealing with the failed connections.
>
> I'd really like to know why a connect would ever fail in this situation though. Is > 6,000 connections per second too much to ask for?

Possibly. It depends on how much memory you have, how many ephemeral ports you have for connections, and some random variables I've forgotten.

To debug this, it might be of interest to know:

- is there a low-level error, or is "connect()" itself really hanging?
- what is happening on the wire when the stall occurs (tcpdump)?
- what does "netstat" say when you are in this condition?

If you have a lot of connections in close-wait or closed state, that might be the answer (although I'd be hard-pressed to explain the hang if that's the case). If there is really a hang, I'd look elsewhere for an answer (e.g., off-host or in a library somewhere).

How many sockets are connecting simultaneously to this server?

Regards,

Justin

--
Justin C. Walker, Curmudgeon-At-Large
Institute for General Semantics
Men are from Earth. Women are from Earth. Deal with it.
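One way to approach the first debugging question (is there a low-level error, or is connect() itself really hanging?) is to bound each connect with a short timeout and record what came back. The following is only a minimal sketch in Python, not the code discussed in this thread: the helper names `probe_connect` and `tally`, the 2-second timeout, and the attempt count are all illustrative assumptions.

```python
import collections
import errno
import socket
import time


def probe_connect(host, port, timeout=2.0):
    """Try one TCP connect; return (outcome, elapsed_seconds).

    outcome is "ok", "timeout", or a symbolic errno name such as
    "ECONNREFUSED". Hypothetical helper, for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # bound the connect rather than waiting on the OS default
    start = time.monotonic()
    try:
        sock.connect((host, port))
        outcome = "ok"
    except socket.timeout:
        # No answer within the timeout: this is the apparent "hang" case.
        outcome = "timeout"
    except OSError as exc:
        # The kernel reported a concrete low-level error.
        outcome = errno.errorcode.get(exc.errno, "errno %s" % exc.errno)
    finally:
        sock.close()
    return outcome, time.monotonic() - start


def tally(host, port, attempts=100, timeout=2.0):
    """Run several probes in sequence and count each kind of outcome."""
    counts = collections.Counter()
    for _ in range(attempts):
        outcome, _elapsed = probe_connect(host, port, timeout)
        counts[outcome] += 1
    return counts
```

Calling something like `tally("127.0.0.1", 8080)` against the test server (the address and port here are placeholders) would show whether the failures are mostly timeouts or a specific error code, which may help decide whether to look at tcpdump and netstat output next or at the client code itself.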