Re: Deaths = broken pipe?

Subject: Re: Deaths = broken pipe?
From: Chuck Hill <email@hidden>
Date: Mon, 26 Jan 2009 13:21:58 -0800


On Jan 23, 2009, at 11:11 AM, Kieran Kelleher wrote:

On one slow G4 Xserve, I have had a situation about once a month whereby I get an email (presumably from wotaskd) from the specific server reporting one or two "Deaths" on an instance running on that machine. When I look at the time of the email and compare it to the logs, I don't see the usual startup stuff in the log, so it makes me wonder if the app really restarted at all.

Possibly not. Is it set up to dispatch requests concurrently? If wotaskd failed to talk to the app before timing out and this happened two (or three?) times in a row, it will mark it as "dead" even though the process is still running.

Secondly, the last entry before the time of a "Death" is usually something like:

WARN 2009-01-23 13:25:49,844 [WorkerThread5] (NSLog: 43) - <WOWorkerThread id=5 socket=Socket[addr=/192.168.1.149,port=58070,localport=2001]> Exception while sending response: java.net.SocketException: Broken pipe

That probably means the result took so long to return that Apache gave up on the request, or that the user hit Stop or navigated elsewhere before getting a response. So, usually, this is something you can safely ignore. Usually. But it might also indicate your app is processing some requests too slowly.
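To make the "safely ignore, usually" point concrete, here is a minimal sketch of writing a response and treating a SocketException as "the client went away" rather than as an application failure. It is not WebObjects code: the class name ResponseSender, the method send, and the startedAtMillis parameter are all made up for illustration. Logging the elapsed time next to the exception is one way to tell an impatient user from a request that really was slow.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.SocketException;

// Hypothetical helper, not part of WebObjects.
class ResponseSender {

    /**
     * Writes the body to the client.
     * Returns true if the write succeeded, false if the client had
     * already closed the connection (for example "Broken pipe").
     */
    static boolean send(OutputStream out, byte[] body, long startedAtMillis) {
        try {
            out.write(body);
            out.flush();
            return true;
        } catch (SocketException e) {
            // The peer is gone. Record how long the request had been running,
            // so slow requests can be told apart from impatient users.
            long elapsedMillis = System.currentTimeMillis() - startedAtMillis;
            System.err.println("Client went away after " + elapsedMillis
                    + " ms: " + e.getMessage());
            return false;
        } catch (IOException e) {
            // Any other I/O problem is unexpected and is rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }
}
```

If the elapsed times logged this way are consistently large, that points at slow request processing rather than at users clicking away.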

If I go look at the WOStats page for the instance, it tells me that this instance has been up for nearly 7 days, so that would seem to indicate that the app really did not crash and restart. There is also no entry in /Library/Logs/CrashReporter for the time of the "Death".

Death means that wotaskd could not communicate with the instance, but not necessarily that the process stopped.
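As a rough illustration of that distinction, the sketch below counts a "death" after a run of consecutive missed check-ins, without ever looking at whether the process exists. This is not wotaskd's actual code. The class InstanceHealth, its methods, and the configurable threshold are assumptions made up for the example; the real threshold is the "two (or three?)" mentioned above.

```java
// Hypothetical sketch of "dead means unreachable", not wotaskd's implementation.
class InstanceHealth {
    private final int maxConsecutiveFailures; // assumed threshold
    private int consecutiveFailures = 0;
    private int deaths = 0;
    private boolean markedDead = false;

    InstanceHealth(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Records one check-in attempt; answeredInTime is false on a timeout. */
    void recordCheck(boolean answeredInTime) {
        if (answeredInTime) {
            // The instance answered, so it is reachable again.
            consecutiveFailures = 0;
            markedDead = false;
            return;
        }
        consecutiveFailures++;
        if (!markedDead && consecutiveFailures >= maxConsecutiveFailures) {
            // Count one death per run of failures. Nothing here checks
            // whether the process itself is still running.
            markedDead = true;
            deaths++;
        }
    }

    boolean isMarkedDead() {
        return markedDead;
    }

    int deaths() {
        return deaths;
    }
}
```

With a threshold of 2, two timeouts in a row report one death, and the next successful check-in clears the dead flag while the uptime of the process is unaffected, which matches an instance showing 7 days of uptime alongside a "Death" email.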

So my question is if this is a case where the response taking too long is being considered a "Death", but the app is not really dying and being restarted at all?


Yes, exactly.

I am guessing that a user is accessing a specific page that is loading a lot of data and they get a "No instance available" response, but this is not killing the app (BTW, concurrent request handling is on) ... is this the case?


Yes, that is my interpretation of what you are describing.

Secondly, I try to determine which page may have broken the pipe by looking at WOStats, but I find most actions have a max of ~2 seconds and perhaps 3 pages have a max of ~11 seconds although their average is less than 1 second, so there is no action near ~30 seconds ... how would I determine the page that caused the broken pipe? .... or should I just focus on ensuring that those pages that did have a high max of ~11 secs get reworked so their max never exceeds ~2 secs?

I am not sure that those stats are 100% trustworthy. I log my own dispatch times into the app log. I also log what is happening in each thread which makes tracking down these sorts of problems very easy.
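For anyone wanting to try the same approach, here is a minimal sketch of timing a unit of work and logging it with the thread name. It is not the logging code referred to above, which is not shown in this thread. The class DispatchTimer, its methods, and the 2000 ms threshold (borrowed from the ~2 second target in the question) are assumptions. In a WebObjects app the natural place to call something like this would presumably be around super.dispatchRequest(request) in the WOApplication subclass, but that placement is also an assumption.

```java
import java.util.function.Supplier;

// Hypothetical helper for logging how long each request takes.
class DispatchTimer {
    // Assumed threshold: anything at or above this is logged as a warning.
    static final long SLOW_MILLIS = 2000;

    /** Builds one log line; slow requests get WARN, the rest INFO. */
    static String format(String threadName, String label, long elapsedMillis) {
        String level = elapsedMillis >= SLOW_MILLIS ? "WARN" : "INFO";
        return level + " [" + threadName + "] " + label
                + " took " + elapsedMillis + " ms";
    }

    /** Runs the work, logs its duration, and returns its result. */
    static <T> T timed(String label, Supplier<T> work) {
        String threadName = Thread.currentThread().getName();
        long startNanos = System.nanoTime();
        try {
            return work.get();
        } finally {
            // Logged even if the work throws, so failed requests are timed too.
            long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
            System.out.println(format(threadName, label, elapsedMillis));
        }
    }
}
```

A call such as DispatchTimer.timed("MainPage", () -> renderPage()), where renderPage is a placeholder for the real work, leaves one line per request in the app log. A broken pipe warning can then be matched to the slow request on the same worker thread by timestamp.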


Chuck

--
Chuck Hill             Senior Consultant / VP Development

Practical WebObjects - for developers who want to increase their overall knowledge of WebObjects or who are trying to solve specific problems. http://www.global-village.net/products/practical_webobjects


Webobjects-dev mailing list      (email@hidden)