Re: Deadlock - restoreSessionWithID holding on to the lock ?
Re: Deadlock - restoreSessionWithID holding on to the lock ?
- Subject: Re: Deadlock - restoreSessionWithID holding on to the lock ?
- From: Chuck Hill <email@hidden>
- Date: Thu, 22 Nov 2007 10:12:59 -0800
On Nov 22, 2007, at 9:09 AM, Simon McLean wrote:
Hi All,
I'm still trying to debug a deadlock. 1 week in and i still can't
get a thread dump from jstack, jdb or -QUIT. It seems that when the
deadlock occurs the apps is just truly locked up.
Other than the app running out of memory, I can't think of what would
cause this. IIRC, you said you can get thread dumps with kill -QUIT
when the app is not deadlocked. Grep the application log for
OutOfMemory. Sometimes that is the only clue. If this is happening,
it could also explain the deadlocking. All bets are off when the JVM
runs out of memory.
However, I'm finally starting to get some information through using
the excellent ScreamingEditingContext. This one in particular looks
suspect, but it doesn't quite give me enough to figure out what is
holding on to the lock:
Stack trace for most recent lock: java.lang.Throwable at
com.clicktravel.plumbing.system.ScreamingEditingContext._trace
(ScreamingEditingContext.java:52) at
com.clicktravel.plumbing.system.ScreamingEditingContext.lock
(ScreamingEditingContext.java:113) at
com.webobjects.appserver.WOSession._awakeInContext(WOSession.java:
717) at com.webobjects.appserver.WOApplication.restoreSessionWithID
(WOApplication.java:1550) at
er.extensions.ERXApplication.restoreSessionWithID
(ERXApplication.java:1432) at
com.webobjects.appserver._private.WOComponentRequestHandler._dispatchW
ithPreparedApplication(WOComponentRequestHandler.java:314) at
com.webobjects.appserver._private.WOComponentRequestHandler._handleReq
uest(WOComponentRequestHandler.java:358) at
com.webobjects.appserver._private.WOComponentRequestHandler.handleRequ
est(WOComponentRequestHandler.java:435) at
com.webobjects.appserver.WOApplication.dispatchRequest
(WOApplication.java:1306) at
er.extensions.ERXApplication.dispatchRequest(ERXApplication.java:
1161) at com.webobjects.appserver._private.WOWorkerThread.runOnce
(WOWorkerThread.java:173) at
com.webobjects.appserver._private.WOWorkerThread.run
(WOWorkerThread.java:254) at java.lang.Thread.run(Thread.java:613)
Any ideas where I go from here ?
Assuming that the app is not running out of memory, I could just jump
to telling you where your bug is. :-) With about 90% confidence of
being correct, I predict:
1. You have a WOLongResponsePage that is using an editing context
that is managed (locked and unlocked by the session).
2. You have code in Session.sleep() that is throwing an exception not
caught by sleep().
3. You are committing Really Bad Locking in a component someplace. I
recently found this in an application:
public WOComponent checkout()
{
EOEditingContext ec; // Editing context from page, automatically locked
// do some save preparation
ec.lock();
// Make some changes
// Create some objects
ec.saveChanges();
ec.unlock();
return context().page();
}
This works all fine (aside from the unneccessary locks and unlocks)
until saveChanges() raises an exception. Then the ec has an extra
lock on it and the next request for the session produces the stack
trace you have above. "But, but, I know that saveChanges() will
never throw a validation exception in this case!" protests the
original developer. Yes, but what about one of those rare optimistic
locking failures from the database...?
If you must lock manually, use try...finally:
{
EOEditingContext ec; // Editing context from page, automatically locked
// do some save preparation
ec.lock();
try {
// Make some changes
// Create some objects
ec.saveChanges();
} finally {
ec.unlock();
}
return context().page();
}
I also have a lock screamer that is modified (IIRC) to provide some
better debug info. Here is the lock() method:
public void lock()
{
String nameOfCurrentThread = Thread.currentThread().getName();
// If we have not recorded any traces this editing context
is not currently
// locked and this results in a the lock being taken by a
new thread.
if (_stackTraces.count() == 0)
{
_stackTraces.addObject(_trace());
_nameOfLockingThread = nameOfCurrentThread;
NSLog.err.appendln("+++ Lock number (" +
_stackTraces.count() + ") for " + this +
" in " + nameOfCurrentThread);
}
else // This editing context has already been locked.
{
// If the thread is the same then this is a secondary
call that
// results in an increased recursionCount() for the
NSRecursiveLock.
if (nameOfCurrentThread.equals(_nameOfLockingThread))
{
_stackTraces.addObject(_trace());
NSLog.err.appendln("+++ Lock number (" +
_stackTraces.count() + ") for " + this +
" in " + nameOfCurrentThread);
}
// If the thread is not the same it will block. For
editing contexts
// in a session this results in deadlock so an error
message is output.
// The error includes the trace of the most recent lock
taken which is
// probably the offending unreleased lock. It might not
be if your
// lock and unlocks are not cleanly nested. In that
case you might
// need to capture and display traces of all the lock
and unlock calls
// to find what is not nested correctly.
else
{
NSLog.err.appendln("!!! Attempting to lock editing
context " + this + " from " +
nameOfCurrentThread + " that was previously
locked in " + _nameOfLockingThread);
NSLog.err.appendln("!!! Current stack trace: \n" +
_trace());
NSLog.err.appendln("!!! Stack trace for most recent
lock: \n" + _stackTraces.lastObject());
}
}
super.lock();
}
That should point you to the thread that still has the hanging lock.
If you are logging out the session ID and thread name in your app log
(and you should be!) you can work back in the log to see what the
user was doing that caused this.
Chuck
--
Practical WebObjects - for developers who want to increase their
overall knowledge of WebObjects or who are trying to solve specific
problems.
http://www.global-village.net/products/practical_webobjects
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden