Re: Newbie query re multithreading
- Subject: Re: Newbie query re multithreading
- From: Bill Bumgarner <email@hidden>
- Date: Tue, 18 Aug 2009 10:46:21 -0700
On Aug 18, 2009, at 10:19 AM, McLaughlin, Michael P. wrote:
On 8/18/09 11:34 AM, "Bill Bumgarner" <email@hidden> wrote:
On Aug 18, 2009, at 7:25 AM, McLaughlin, Michael P. wrote:
1) With 1 CPU (NSOperation not used), I get 50% of CPU-time devoted to
auto_collection_thread which I presume means Garbage Collection. Is this
normal? It seems excessive.
It sounds like your algorithm is doing a tremendous amount of memory
thrashing as it runs; allocating and deallocating tons of objects.
Under GC on Leopard, the collector will spend a ton of CPU cycles
trying to keep up.
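For a rough illustration (not your actual code -- just the shape of the
problem), a hot loop that churns out short-lived collected objects will
keep auto_collection_thread busy under Leopard GC:

for (NSUInteger i = 0; i < 1000000; i++) {
    // Each iteration allocates scratch objects and drops them; the
    // collector has to reclaim all of it on its own thread.
    NSMutableArray *scratch = [NSMutableArray arrayWithCapacity: 16];
    [scratch addObject: [NSNumber numberWithUnsignedInteger: i]];
}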
I suspect that this much is true. This code contains a huge number of
matrices and vectors, allocated locally and allowed to go out of scope
without explicit freeing (given GC). I am not sure of the internals of
the Eigen library but that seems to be nearly all templated headers.
(Forgive me if I'm rehashing some basics here -- I just want to make
sure we're on the same page.)
The only way that C++ objects would be visible to the collector is if
you overrode the class's allocation (operator new) to get memory from
the collector's zone via NSAllocateCollectable().
Otherwise, the C++ objects aren't visible to the collector.
Are your matrices and vectors allocated as Objective-C objects?
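For illustration, making a C++ class's storage visible to the collector
would look roughly like this (a sketch only -- the class name is made
up, and this has to be compiled as Objective-C++):

#import <Foundation/Foundation.h>

class CollectableMatrix {
public:
    // Allocate from the collected zone; NSScannedOption tells the
    // collector to scan the block for references it may contain.
    void *operator new(size_t size) {
        return NSAllocateCollectable(size, NSScannedOption);
    }
    void operator delete(void *) {
        // Nothing to do -- the collector reclaims the block.
    }
    // matrix storage and methods would go here
};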
2) With 2 CPUs, I get 26% of CPU-time for auto_collection_thread and
another 26% for mach_msg_trap --> CFRunLoopSpecific (apparently back to
Carbon again).
CF is CoreFoundation, not Carbon.
Yes, but CFRunLoopSpecific leads to RunCurrentEventLoopInMode which is
HIToolbox (supposedly Carbon).
The internals of CFRunLoop are an implementation detail subject to
change at any time. :)
Do you have something in your concurrent code that is sending messages
back to the main event loop or posting events to other threads' run
loops? This can be problematic, too, in that you can end up consuming
tons of CPU cycles in message passing.
At the end of each NSOperation, there is a single enqueueNotification
just before the thread main() exits [for higher-level bookkeeping]. I
use enqueue instead of post because it is asynchronous. Otherwise, the
calls would pile up and the caller stack frame would not terminate until
the whole program was finished.
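Roughly, that asynchronous enqueue looks like this (a sketch -- the
notification name is a placeholder, and notificationQueue stands for the
queue handed to the operation at init time):

// NSPostASAP queues the notification for delivery on the next pass of
// the run loop rather than posting it synchronously.
NSNotification *done = [NSNotification notificationWithName: @"CompOperationFinished"
                                                     object: self];
[notificationQueue enqueueNotification: done postingStyle: NSPostASAP];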
OK -- that sounds reasonable.
How about data access? I.e. do multiple threads update some subset of
objects simultaneously? Do you have lock contention along those code
paths?
If your compute threads are synchronously messaging between threads --
waitUntilDone: YES'ing -- you can quite easily see a "reverse scaling"
performance profile like the one you are seeing now.
Not really sure what this means. The caller, at each timestep, looks
like this:
A common -- and often fatally non-performant -- pattern is to use:

[someView performSelectorOnMainThread: @selector(updateNowPlease:)
                           withObject: self
                        waitUntilDone: YES];
If the main event loop is processing lots of these or is otherwise
blocked, this pattern very quickly becomes a massive bottleneck.
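If the worker doesn't need an answer back, the non-blocking variant
avoids stalling the compute thread while the main thread catches up:

[someView performSelectorOnMainThread: @selector(updateNowPlease:)
                           withObject: self
                        waitUntilDone: NO];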
Nwaiting = numProcessors;
for (int op = 0; op < numProcessors; op++) {
    compOperation *thisOp = [[compOperation alloc] initWithCompData: &compData
                                                                 ID: op
                                                               step: timeNdx
                                                              user0: firstUser[op]
                                                              userN: lastUser[op]
                                                              queue: [NSNotificationQueue defaultQueue]];
    [opQueue addOperation: thisOp];
}
More-or-less copied from Apple's NSOperation sample code. As noted, each
operation enqueues an "I'm finished" message to this caller. Nwaiting is
decremented upon receipt. compData is a read-only structure containing
global data.
That looks generally reasonable.
The one red flag is numProcessors. Generally, you shouldn't design
concurrency around the # of processors on Mac OS X. The operation queue
(and, in Snow Leopard, Grand Central Dispatch --
http://www.apple.com/macosx/technology/#grandcentral) should throttle
appropriately to maximize throughput and do so in consideration of other
applications on the system.
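For example, you could enqueue one operation per natural chunk of work
(however many that turns out to be) and let the queue decide how many
run at once -- a sketch, with numberOfChunks standing in for whatever
your problem dictates:

// The default lets NSOperationQueue throttle concurrency on its own;
// spelled out here only for emphasis.
[opQueue setMaxConcurrentOperationCount: NSOperationQueueDefaultMaxConcurrentOperationCount];

for (int chunk = 0; chunk < numberOfChunks; chunk++) {
    [opQueue addOperation: [[compOperation alloc] initWithCompData: &compData
                                                                ID: chunk
                                                              step: timeNdx
                                                             user0: firstUser[chunk]
                                                             userN: lastUser[chunk]
                                                             queue: [NSNotificationQueue defaultQueue]]];
}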
Do you have access to Snow Leopard? If so, use it. The tool chain
for analyzing and improving concurrent applications is vastly
improved. Even if you are going to continue to target Leopard, the
analysis and debugging improvements will make your job easier.
Yes and no. I do have it, at home, but have not installed it because I
have only a single Intel Mac and I was unsure of the feasibility of
installing two Developer folders on a single hard-drive partition.
Sounded like trouble to me ;-)
To run the SL dev tools, you need to have SL installed.
Mac OS X will boot quite happily from an external hard drive, a second
partition, or -- even -- an SDHC card shoved in a USB-based SD reader (I
occasionally boot my MacBook Pro from an SD card shoved in an
ExpressCard/34 reader).
Here, at work where I develop the app under discussion, I have only a
G5 and am unlikely to improve on that in the foreseeable future.
You wouldn't happen to be writing a 64-bit application, would you?
b.bum