Re: WOWorkerThread deadlocks
- Subject: Re: WOWorkerThread deadlocks
- From: Michael Gargano <email@hidden>
- Date: Tue, 15 Jan 2013 21:13:45 +0000
- Thread-topic: WOWorkerThread deadlocks
On Jan 15, 2013, at 2:54 PM, Chuck Hill wrote:
>
> On 2013-01-15, at 10:50 AM, Michael Gargano wrote:
>
>> Hi guys,
>>
>> I know I'm a little late on this, but I'm also seeing the same behavior. I don't think it's a long-running query, because I'm logging long queries in Postgres and nothing runs for more than 10 seconds. Can you explain why having a max of 256 worker threads is too high?
>
> http://osdir.com/ml/web.webobjects.admin/2005-02/msg00006.html
>
> Keep in mind that you have 256 threads all trying to do something that, sooner or later, usually needs a single-threaded EOF lock. That is just not going to make for happy users.
Thanks. I'll take a look at this.
>
>
>> Any other things I should look at? The customers are not happy!
>
> Cut down the number of worker threads and the listen queue size. It won't fix the problem but at least (a) you will see it sooner and (b) the app can recover.
>
>
>> My last problem did turn out to be a bunch of deadlocks, which all now seem to be resolved. It had to do with setting er.extensions.ERXObjectStoreCoordinatorPool.maxCoordinators=4, which should be seamless (you would think) but causes issues with fetch specs that have EOs crossing OSCs.
>
> Why on earth would an EO ever cross an OSC? They don't even cross ECs.
A page creates a new EC and gets an EO. That EO is passed around and ends up on another (or the same) page, where another EC is created. When a fetchSpec is run against the new EC and the earlier EO is used as part of that fetchSpec, the two ECs can be associated with two different OSCs: one for the EC just created, and one for the EC of the EO we already hold a reference to. Once I called localInstance on every EO used like that, all the deadlocks went away.
>
>
>> I had to pull all EOs local, which seems like something Wonder should handle automatically (so I consider it a bug, though whether it really is one could be argued, I guess). Anyway, after those all got fixed, I'm now running into this. It is much harder to figure out since I don't even know what the lock is held on.
>
> sudo jstack -F <process id>
> will show you if it is a deadlock. Otherwise it is likely bad exception handling that results in your code doing a lock() and never doing an unlock()
>
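The lock()-without-unlock() failure mode described above can be illustrated with plain java.util.concurrent. This is only a minimal sketch: GuardedWork and its methods are hypothetical names, and a ReentrantLock stands in for whatever lock the application really uses. It is not WebObjects or Wonder code.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for code that guards work with a lock; not a WebObjects class. */
class GuardedWork {
    private final ReentrantLock lock = new ReentrantLock();

    /** Leaky variant: if work throws, unlock() is never reached. */
    void runLeaky(Runnable work) {
        lock.lock();
        work.run();
        lock.unlock();
    }

    /** Safe variant: finally guarantees the unlock even when work throws. */
    void runSafe(Runnable work) {
        lock.lock();
        try {
            work.run();
        } finally {
            lock.unlock();
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    /** Runs a failing task and reports whether the lock is still held afterwards. */
    static boolean leaksAfterException(boolean safe) {
        GuardedWork guarded = new GuardedWork();
        Runnable failing = () -> { throw new IllegalStateException("simulated failure"); };
        try {
            if (safe) {
                guarded.runSafe(failing);
            } else {
                guarded.runLeaky(failing);
            }
        } catch (IllegalStateException expected) {
            // Swallowed on purpose, the way careless exception handling would.
        }
        return guarded.isLocked();
    }

    public static void main(String[] args) {
        System.out.println("leaky variant still locked: " + leaksAfterException(false));
        System.out.println("safe variant still locked:  " + leaksAfterException(true));
    }
}
```

With the leaky variant the lock stays held after the exception, so every later caller on another thread would queue behind it without any deadlock cycle for a tool to report.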
No deadlocks are being detected, and I don't see any either. I see the same thing Maik saw: all the worker threads are waiting on a lock held by one worker thread, which is in a runnable state, waiting in a socket accept. I searched all the code and there is no manual locking anywhere; everything goes through Wonder's autolocking.
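For reference, the kind of deadlock check that a thread dump performs can also be run from inside the JVM through the standard ThreadMXBean. The sketch below is self-contained and uses hypothetical names (DeadlockProbe, the demo locker threads); it deliberately provokes a two-thread monitor deadlock just to show what a positive result looks like. It assumes a JVM that supports findDeadlockedThreads(), which HotSpot does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical helper; only java.lang.management calls are real API. */
class DeadlockProbe {

    /** Number of threads the JVM currently reports as deadlocked (0 if none). */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    /** Daemon thread that takes `first`, waits for its peer, then wants `second`. */
    private static void startLocker(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // ensure both threads hold their first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the peer holds `second` and wants `first`.
                }
            }
        }, name);
        t.setDaemon(true); // do not keep the JVM alive
        t.start();
    }

    /** Provokes a lock-order deadlock and returns how many threads are reported. */
    static int provokeAndCount() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startLocker("DemoLockerAB", lockA, lockB, bothHoldFirst);
        startLocker("DemoLockerBA", lockB, lockA, bothHoldFirst);
        for (int attempt = 0; attempt < 100; attempt++) { // poll for up to ~5 seconds
            int count = deadlockedThreadCount();
            if (count > 0) {
                return count;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked before: " + deadlockedThreadCount());
        System.out.println("deadlocked after:  " + provokeAndCount());
    }
}
```

A pile of threads queued behind a single runnable lock holder, as in the dumps in this thread, is not a cycle, so a check like this reports nothing for it.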
>
> Chuck
>
>>
>> BTW Chuck and Quinton, I owe you guys a beer. Thanks for pointing me in the right direction on the last problem.
>>
>> Thanks for any help.
>> -Mike
>>
>> -----Original Message-----
>> From: webobjects-dev-bounces+mgargano=email@hidden [mailto:webobjects-dev-bounces+mgargano=email@hidden] On Behalf Of Chuck Hill
>> Sent: Monday, September 10, 2012 1:24 PM
>> To: Maik Musall
>> Cc: email@hidden WebObjects
>> Subject: Re: WOWorkerThread deadlocks
>>
>> Hi Maik,
>>
>> "WorkerThread207" that many worker threads indicates two things to me:
>> 1. Your app configuration is too high. I'd use a max of 6-10 worker threads and a listen queue size of around 4 (adjusted to suit your specific needs). A WO app is very, very unlikely to recover from a 200 worker thread backlog in any way that is useful to the users.
>>
>> 2. You have a thread that is taking a long time to return a result. If you are dispatching requests concurrently, then this is most likely stuck in EOControl/EOAccess (e.g. waiting for a slow query result) or connecting to some external process. You could also have a deadlock. If you are not dispatching requests concurrently, then this delay could be in other code.
>>
>> The traces below do not show the problem. If you want to send a full dump, I am willing to look at it. It is possible that the problem had resolved by the time you took this dump. What you show below is normal for a lot of worker threads. WorkerThread206 is waiting for a new request, WorkerThread207 is idle waiting for something to do in the future.
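The "normal" picture described above, with several idle workers parked around accept() on one shared server socket, can be reproduced with plain java.net. This is a hedged sketch, not WebObjects code: SharedAcceptDemo and its parameters are hypothetical, and treating the ServerSocket backlog argument as the counterpart of the listen queue size is an assumption made for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: a small pool of workers sharing one ServerSocket. */
class SharedAcceptDemo {

    /**
     * Starts `workers` threads that all call accept() on the same socket,
     * opens `requests` client connections, and returns how many were handled.
     */
    static int run(int workers, int backlog, int requests) {
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(requests);
        try (ServerSocket server =
                 new ServerSocket(0, backlog, InetAddress.getLoopbackAddress())) {
            for (int i = 0; i < workers; i++) {
                Thread worker = new Thread(() -> {
                    while (true) {
                        // Only one worker is inside the native accept at a time;
                        // the others wait their turn, which is harmless while idle.
                        try (Socket client = server.accept()) {
                            handled.incrementAndGet();
                            done.countDown();
                        } catch (IOException closedOrFailed) {
                            return; // server socket was closed: worker exits
                        }
                    }
                }, "DemoWorker" + i);
                worker.setDaemon(true);
                worker.start();
            }
            for (int i = 0; i < requests; i++) {
                // Connect and immediately close; the demo sends no payload.
                try (Socket client =
                         new Socket(server.getInetAddress(), server.getLocalPort())) {
                    // nothing to send
                }
            }
            done.await(5, TimeUnit.SECONDS);
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("handled: " + run(3, 8, 5));
    }
}
```

A thread dump taken while these workers are idle would show one of them inside accept and the rest waiting for it, which by itself does not indicate a problem. The dump only becomes interesting when requests are arriving and no worker ever returns to accept.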
>>
>> Chuck
>>
>>
>> On 2012-09-10, at 8:03 AM, Maik Musall wrote:
>>
>>> Hi,
>>>
>>> In an app with high concurrency, the app sometimes becomes unresponsive to everything but DirectActions at the busiest time of day. Users stop seeing responses altogether. In jstack I see hundreds of these:
>>>
>>>> "WorkerThread207" prio=5 tid=131e0a800 nid=0x151aa2000 waiting for monitor entry [151aa1000]
>>>> java.lang.Thread.State: BLOCKED (on object monitor)
>>>> at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:406)
>>>> - waiting to lock <20d3da450> (a java.net.SocksSocketImpl)
>>>> at java.net.ServerSocket.implAccept(ServerSocket.java:462)
>>>> at java.net.ServerSocket.accept(ServerSocket.java:430)
>>>> at com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
>>>> at java.lang.Thread.run(Thread.java:680)
>>>
>>> all waiting on the same lock 20d3da450, and one thread holding that lock:
>>>
>>>> "WorkerThread206" prio=5 tid=131d79800 nid=0x15199f000 runnable [15199e000]
>>>> java.lang.Thread.State: RUNNABLE
>>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
>>>> at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
>>>> - locked <20d3da450> (a java.net.SocksSocketImpl)
>>>> at java.net.ServerSocket.implAccept(ServerSocket.java:462)
>>>> at java.net.ServerSocket.accept(ServerSocket.java:430)
>>>> at com.webobjects.appserver._private.WOWorkerThread.run(WOWorkerThread.java:210)
>>>> at java.lang.Thread.run(Thread.java:680)
>>>
>>> Anyone familiar with this problem?
>>>
>>> Maik
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list (email@hidden)
>>> Help/Unsubscribe/Update your Subscription:
>>>
>>> This email sent to email@hidden
>>
>> --
>> Chuck Hill Senior Consultant / VP Development
>>
>> Practical WebObjects - for developers who want to increase their overall knowledge of WebObjects or who are trying to solve specific problems.
>> http://www.global-village.net/gvc/practical_webobjects
>