Re: WoMonitor "Failed to contact ..."
Re: WoMonitor "Failed to contact ..."
- Subject: Re: WoMonitor "Failed to contact ..."
- From: Kieran Kelleher <email@hidden>
- Date: Thu, 30 Apr 2009 22:12:30 -0400
On Apr 30, 2009, at 8:44 PM, Chuck Hill wrote:
On Apr 30, 2009, at 5:16 PM, Kieran Kelleher wrote:
Hi Chuck,
Great news ..... I have had 3 error emails already from Wonder's
SessionStore Deadlock detection!
"I would wager that I will not see any Sesion Store deadlocks with
concurrent request handling turned off!"
Looks like I won the bet! :-P
Yes, you did. I was wrong! I owe you an expensive Starbucks drink at
least :-P
Here they are .... session IDs modified to protect the innocent :-)
Not sure what anyone could tell from those. If you have restarted
the apps, they are useless now.
More good news ...... a side-effect of using ERX session store
deadlock detection and/or concurrent req handling=false is that there
are no CLOSE_WAITS and there are *no* threads blocked on
com.webobjects.appserver.WOSessionStore.checkOutSessionWithID(...),
not a single one in both of the instances that report these errors. So
no restart needed right now, albeit my poor user is getting the custom
error page and having to log in again to do their thing.
What do you make of this then?
You have a bug in your code.
I agree.
Not sure how much it tells me. What logging would you recommend now
that we have this detection happening? ... should I override
ERXApplication's SessionInfo to capture some more info? ... if so,
what?
Well, you need to ask yourself what in your code immediately
_preceeded_ this. I try to have enough logging that I can tell what
the user is doing just from looking at the log. Something like this:
Wonder normally gives me a detailed context information such as
previous page history and the exact component (actuallt nested
component tree) that was invoked to cause the error, but it seems
SessionInfo does not call that utility method
(ERXApplication#extraInformationForExceptionInContext(exception,
context)) .... I am going to try adding that to get the full picture
and redeploying.
[2009-04-30 16:23:19,627] <WorkerThread13>
WOApplication.dispatchRequest - =========== Cycle Start for URI /cgi-
bin/WebObjects/FooApp.woa/-3456/wo/3.0.9.3.1.3.1?
0.9.3.1.3.1.1.10.0
=
&0.9.3.1.3.1.1.12.0
=
S
&0.9.3.1.3.1.1.14.0
=
&0.9.3.1.3.1.1.16.0.0
=
&0.9.3.1.3.1.1.18
=
WONoSelectionString
&0.9.3.1.3.1.1.20=WONoSelectionString&0.9.3.1.3.1.1.29=Search
[2009-04-30 16:23:19,628] <WorkerThread13>
WOApplication.dispatchRequest - Request's headers: {remote_host =
("::1"); user-agent = ("Mozilla/5.0 (Macintosh; U; Intel Mac OS X
10_5_6; en-us) AppleWebKit/525.27.1 (KHTML, like Gecko) Version/
3.2.1 Safari/525.27.1"); document_root = ("/Library/WebServer/
Documents"); server_name = ("localhost"); accept = ("text/
xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/
plain;q=0.8,image/png,*/*;q=0.5"); remote_addr = ("::1");
server_admin = ("email@hidden"); remote_port = ("51682");
connection = ("close"); server_software = ("Apache/2.2.9 (Unix)
mod_ssl/2.2.9 OpenSSL/0.9.7l DAV/2"); cookie = ("woinst=-3456;
wosid=Udis2u5xusYyWaXgb8eRXg;"); host = ("localhost"); referer = ("http://localhost/cgi-bin/WebObjects/FooApp.woa/-3456/wo/2.0.9.1.1.23.0.1.3.1.1.1
"); accept-encoding = ("gzip, deflate"); x-webobjects-request-id =
("49ea0411000000200001755e"); script_uri = ("http://localhost/cgi-bin/WebObjects/FooApp.woa/-3456/wo/3.0.9.3.1.3.1
"); script_filename = ("/cgi-bin"); x-webobjects-request-method =
("GET"); x-webobjects-adaptor-version = ("Apache"); accept-language
= ("en-us"); script_url = ("/cgi-bin/WebObjects/FooApp.woa/-3456/wo/
3.0.9.3.1.3.1"); server_port = ("80"); }
[2009-04-30 16:23:19,628] <WorkerThread13>
WOApplication.dispatchRequest - cookies: woinst=-3456;
wosid=Udis2u5xusYyWaXgb8eRXg;
[2009-04-30 16:23:19,628] <WorkerThread13>
WOApplication.dispatchRequest - form values:
{0.9.3.1.3.1.1.10.0 = (""); 0.9.3.1.3.1.1.12.0 = ("S");
0.9.3.1.3.1.1.14.0 = (""); 0.9.3.1.3.1.1.18 =
("WONoSelectionString"); 0.9.3.1.3.1.1.16.0.0 = ("");
0.9.3.1.3.1.1.29 = ("Search"); 0.9.3.1.3.1.1.20 =
("WONoSelectionString"); }
[2009-04-30 16:23:19,629] <WorkerThread13>
ERXNSLogLog4jBridge.appendln - Awake session Udis2u5xusYyWaXgb8eRXg
[2009-04-30 16:23:19,629] <WorkerThread13>
ERXNSLogLog4jBridge.appendln - ====================================
[2009-04-30 16:23:19,629] <WorkerThread13> FooAppSession - User chill2
[2009-04-30 16:23:19,629] <WorkerThread13> FooAppThread - Attaching
chill2 to thread WorkerThread13
[2009-04-30 16:23:19,630] <chill2 WorkerThread13> FooAppSession -
Request being handled by net.foo.FooApp.pages.adcontrol.EditDisplayAds
[2009-04-30 16:23:19,645] <chill2 WorkerThread13>
FooAppEditingPageSearchComponent - Searching for: {adName =
(order.adName caseinsensitivelike '*S*'); excludeCompletedCycles =
(publications CONTAINS at least 1 match(es) for ((isCycleComplete =
(java.lang.Boolean)'false')); } {}
[2009-04-30 16:23:19,647] <chill2 WorkerThread13>
FooAppEditingPageSearchComponent - Qualifier ((order.adName
caseinsensitivelike '*S*') and (publications CONTAINS at least 1
match(es) for ((isCycleComplete = (java.lang.Boolean)'false')))
[2009-04-30 16:23:19,647] <chill2 WorkerThread13> FooAppSession -
Response is net.foo.FooApp.pages.adcontrol.EditDisplayAds
[2009-04-30 16:23:19,698] <chill2 WorkerThread13>
FooAppAjaxGridComponent - Search matched: 0
[2009-04-30 16:23:19,698] <chill2 WorkerThread13>
FooAppEditingPageComponent - Search Results: rendering
net.foo.FooApp.pages.adcontrol.DisplayAdResults
So, in the absence of that, what can we tell?
user-agent = (Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US)
Windows loser. :-P
script_uri = (http://www.smartleadsusa.net/app/20/wo/kueXGID3VjTiMRY*******/4.0.21.13.1.1.0.5.1.1.3.0.3
);
That tells us that it is a component action, not a direct action
causing this.
Interestingly I just noticed that
ERXApplication#handleActionRequestError probably prevents direct
action session store lockups as it calls
saveSessionForContext(...) .... good to know.
They were on their fourth click. You have something triggering this
that has an element ID of 0.21.13.1.1.0.5.1.1.3.0.3
I'd search the HTML source for that (e.g. the source in the
browser). That will be the link that causes the error
script_uri = (http://www.smartleadsusa.net/app/20/wo/kueXGID3VjTiM*******/
4.0.21.13.1.1.0.5.1.1.3.0.3);
script_uri = (http://www.smartleadsusa.net/app/22/wo/tfVVfuWHuybcGbF*******/3.0.21.13.1.1.0.5.1.1.3.0.3
);
It is always the same link / submit that causes this.
It is always the same IP: 67.61.91.110
Excellent observation Sherlock!
Perhaps someone has garbled data in their account? They are
persistent.
Since this app is a login-only app for a known user-base, I am going
to use ERX SessionInfo to capture the current user (their gid is
already in ERXThreadStorage during each request) .... and I can always
get their customer service rep to call them and ask exactly what they
did before the error happened aswell as checking their account data
for anomalies on a copy of the db which I will copy from mysql slave
to my dev machine tonight.
Thanks Chuck. This is an interesting one for sure. I will follow up
with more findings.
Chuck
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden