Re: How to support larger NSView hierarchy?
- Subject: Re: How to support larger NSView hierarchy?
- From: Keith Knauber <email@hidden>
- Date: Wed, 14 Mar 2012 20:15:15 +0000
- Thread-topic: How to support larger NSView hierarchy?
Long time no response... had to fight other fires for a while.
On Feb 27, 2012, at 10:46 PM, Graham Cox wrote:
Perhaps it would be better to explain your goals, rather than fragments of an implementation that appears to be, on the face of it, pointless.
In my case, there are 3 pieces involved:
1) Cocoa based GUI app main thread
2) GUI app real-time thread, which sends commands to
3) lightning fast CPU/GPU intensive OpenGL app (definitely not Cocoa)
The goal is to update this Cocoa GUI app at up to 30 fps, with as little impact as possible on my real-time thread and on the separate app's video frame rate.
Of course, end-users know that there will be some impact, but it's often convenient to run both apps on the same machine.
So, to your point, I took a step back and measured performance of all three together.
The performance of Cocoa GUI drawing does have an impact on other threads/apps, but it's hard to measure.
So, all that follows is not *quite* as important to me as I originally thought it was, but it's still important.
Here's more data on Cocoa drawing performance of large view hierarchies.
Test case redraws my GUI app hierarchy of ~120 views at 30fps, with ~10 dirty NSControls, ~10 dirty custom views, & 1 dirty NSOpenGLView
"worst" is worst case real-time thread execution
"average" is average real-time thread execution time
"use DirtyRectHelper" YES means use offscreen drawing improvements, NO means use standard Cocoa
"% CPU separate app" means run a separate video app which I send commands to, which is very CPU/GPU intensive,
and which also provides a low bandwidth video stream for my app to draw.
"throttled" means skip GUI app draw cycles if machine is heavily loaded.
worst  | average | use DirtyRectHelper | % CPU for my app (Activity Monitor) | scenario                     | % CPU separate app
1.0 ms | 0.4 ms  | YES                 | 3%                                  | window minimized, no drawing | 0%
4.8 ms | 0.4 ms  | YES                 | 16%                                 | throttled                    | 0%
5.2 ms | 0.6 ms  | YES                 | 18%                                 | not throttled                | 0%
6.3 ms | 1.0 ms  | YES                 | 18%                                 | throttled                    | 150%
6.7 ms | 1.0 ms  | YES                 | 25%                                 | not throttled                | 150%
0.8 ms | 0.4 ms  | NO                  | 3%                                  | window minimized, no drawing | 0%
4.5 ms | 0.4 ms  | NO                  | 22%                                 | throttled                    | 0%
4.0 ms | 0.5 ms  | NO                  | 27%                                 | not throttled                | 0%
6.0 ms | 1.0 ms  | NO                  | 22%                                 | throttled                    | 150%
7.0 ms | 1.0 ms  | NO                  | 34%                                 | not throttled                | 150%
Notes:
- It's clear that heavy machine activity affects my real-time thread...
- Real-time thread delays translate to latency/jitter in my case.
- Mac OS X performs well as a real-time system. I'm not complaining about the scheduler.
- messing with thread priorities creates a can of worms, and always makes the worst case even worse. Same with 'yield', usleep, nice, or anything of the sort.
If anyone can point me to a method of 'yielding' that's actually working for them, I would love to hear it.
- throttling my UI thread (by inserting usleep into displayIfNeeded) reduces average latency ~15% (not much)
- worst case real-time thread latency varies wildly, but I've never seen it go above 12ms even with several beachballing apps.
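For anyone who wants to experiment with the "throttled" idea, here is a rough, language-neutral sketch in Python of skipping draw cycles under load. It is not my actual code: the class name `DrawThrottle`, the 5 ms load threshold, and the cap on consecutive skipped frames are all assumptions made up for illustration, and the load signal (the last measured real-time thread execution time) is just one possible choice.

```python
class DrawThrottle:
    """Decide per frame whether the GUI should draw or skip.

    Hypothetical sketch: a frame is skipped when the last measured
    real-time thread execution time exceeds a threshold, but never more
    than `max_skipped` frames in a row, so the UI cannot freeze entirely.
    """

    def __init__(self, load_threshold_ms: float = 5.0, max_skipped: int = 3):
        self.load_threshold_ms = load_threshold_ms  # assumed value, tune per app
        self.max_skipped = max_skipped              # assumed value, tune per app
        self._skipped = 0

    def should_draw(self, realtime_exec_ms: float) -> bool:
        loaded = realtime_exec_ms > self.load_threshold_ms
        if loaded and self._skipped < self.max_skipped:
            # Machine looks busy: skip this draw cycle.
            self._skipped += 1
            return False
        # Either the machine is idle enough, or too many frames were skipped.
        self._skipped = 0
        return True
```

The cap on consecutive skips is a design choice of this sketch: without it, a permanently loaded machine would never redraw the GUI at all.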
Performance stats, collected over a 10 second period (worst case, unthrottled scenario):
- Cocoa draws for 4.0/10 seconds vs. DirtyRectHelper 2.7/10 seconds
- Cocoa calls visibleRect & convertRect 450k times vs. DirtyRectHelper 95k times
- total # pixels drawn: Cocoa 29.5 million vs. DirtyRectHelper 30 million pixels
(Pixels drawn is sum of width x height of each rect in getRectsBeingDrawn)
- interesting metric "CPU instructions per pixel". Cocoa averages 350 vs. DirtyRectHelper 200 for this test case.
For comparison, in a test case where an NSOpenGLView draws at 30fps, it takes ~40 instructions per pixel, with 93% of that spent in Cocoa convertRect and such, 5% in glClear, and 2% in glCallList(displaylist).
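To make the two metrics above concrete, here is a minimal sketch in Python of how "pixels drawn" and "CPU instructions per pixel" can be computed. The `(x, y, width, height)` tuple layout and both function names are assumptions for illustration; in the real app the rects come from getRectsBeingDrawn and the instruction counts from a profiler.

```python
from typing import Iterable, Tuple

# Hypothetical rect layout: (x, y, width, height)
Rect = Tuple[float, float, float, float]


def pixels_drawn(rects: Iterable[Rect]) -> float:
    """Sum of width x height over the dirty rects of one draw pass.

    Overlapping rects are counted once per rect, matching the simple
    definition used for the stats above.
    """
    return sum(w * h for (_x, _y, w, h) in rects)


def instructions_per_pixel(total_instructions: float, total_pixels: float) -> float:
    """Average CPU instructions spent per pixel drawn."""
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    return total_instructions / total_pixels
```

For example, `pixels_drawn([(0, 0, 10, 20), (5, 5, 3, 4)])` gives 212, since 10 x 20 + 3 x 4 = 212.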
On Feb 28, 2012, at 12:21 AM, Quincey Morris wrote:
Indeed. Especially as it's usually taken as a given that large numbers of NSViews aren't going to perform well. Trying to make the scenario work might be quixotic at best.
However, two thoughts did spring to mind:
1. It might be easier (and faster) to just hide the subviews as they scroll out of the visibleRect, and show them as they scroll in. I'd *imagine* that lots of hidden views wouldn't be a big problem for drawing.
OK, Quincey, these are the kind of suggestions I'm looking for. Worth exploring.
2. If the views are arranged linearly, which seems like one possible interpretation of the OP's scenario description, then it might be possible to use the new view-based NSTableView, since that configuration basically caches a small number of actual "cell" views even if the total number of cells is huge.
Ooh. I hadn't explored the entire 10.7 SDK yet because I still need to support our installed base, much of which still runs 10.6. Can a single NSTableCellView cell have a view hierarchy?
Even if not, this approach will certainly make NSTableViews more useful and customizable.
On Feb 27, 2012, at 10:46 PM, Graham Cox wrote:
In a scroll view, views that are fully clipped out by the clip view, and those which do not intersect the dirty rects, are simply not drawn. At all. No drawing is faster than any other mechanism for doing drawing.
-getRectsBeingDrawn: isn't slow itself, but what you're doing with those rects probably is. Drawing tens of thousands of objects efficiently is possible (though if those objects are actual NSViews, all bets may be off). Basically, once you're over a thousand or so, you'll need to do some sort of spatial hashing to efficiently determine which of a set of objects needs to be drawn. If you are iterating a list linearly, you'll get killed by that at around the thousand or so mark.
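One common form of the spatial hashing mentioned above is a uniform grid: each object is registered in every grid cell its bounds touch, so a dirty rect only has to examine the objects in the cells it covers instead of walking the whole list. The Python sketch below is an illustration only, not code from anyone in this thread; the `SpatialGrid` class, the 64-point cell size, and the `(x, y, width, height)` rect layout are all assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, Set, Tuple

Rect = Tuple[float, float, float, float]  # hypothetical: (x, y, width, height)


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rects overlap with non-zero area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


class SpatialGrid:
    """Uniform-grid spatial hash mapping grid cells to object keys."""

    def __init__(self, cell_size: float = 64.0):
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        self._cells: Dict[Tuple[int, int], Set[Hashable]] = defaultdict(set)
        self._rects: Dict[Hashable, Rect] = {}

    def _cells_for(self, rect: Rect) -> Iterator[Tuple[int, int]]:
        # Enumerate every grid cell the rect touches.
        x, y, w, h = rect
        c = self.cell_size
        for cx in range(int(x // c), int((x + w) // c) + 1):
            for cy in range(int(y // c), int((y + h) // c) + 1):
                yield (cx, cy)

    def insert(self, key: Hashable, rect: Rect) -> None:
        self._rects[key] = rect
        for cell in self._cells_for(rect):
            self._cells[cell].add(key)

    def query(self, dirty: Rect) -> Set[Hashable]:
        """Keys of all objects whose bounds intersect the dirty rect."""
        candidates: Set[Hashable] = set()
        for cell in self._cells_for(dirty):
            candidates |= self._cells.get(cell, set())
        # Cells are coarse, so confirm with an exact intersection test.
        return {k for k in candidates if intersects(self._rects[k], dirty)}
```

The cell size is a trade-off: cells much smaller than the objects make insertion expensive, while cells much larger than the typical dirty rect return too many candidates.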
Like Cocoa, my implementation only draws the visible dirty rects given to me in viewWillDraw, and it eliminates most of the overhead of clipping and lockFocus.
// This equates to ~ 20% CPU usage when redrawing 30 times per second.
// unavoidable 1.5 ms spent in drawRect: (images/strings/NSFillRect)
// unavoidable 1.2 ms spent in viewWillDraw
// avoidable! 4.3 ms spent clipping, lockFocus, supporting wacky features
Have you simply tried adding all the controls to the document view and testing to see if, in fact, drawing is too slow?
Ok, I did try this and noticed that Cocoa convertRect calls did not increase exponentially, which is promising.
Again, drawing is fast. convertRect/visibleRect/lockFocus is too slow.
All my custom NSControls are now drawn with CGLayerRefs, except for the gorgeous Quartz NSAttributedStrings on top.
creds to bghudappkit for providing a jumpstart in this direction (if interested in high performance mods to bghudappkit, let me know)
The slowest parts, drawing-wise, are now the ~30fps of image and NSAttributedString drawing.
Another approach to get more speed with many subviews is to use Core Animation. You still have all the views there for hit-testing and event handling, but layers are used to draw them, which should be composited more quickly than classic view drawing.
Tried that. Core Animation layer-backed views made things run slower.
So, one question I have (which may not be Cocoa related?) is whether there's a way to minimize the impact my app has on a separate CPU->GPU intensive app.
In other words, instead of being CPU bound, my pair of apps is bus bound, or video-card bound.
OpenGL Driver Monitor shows a significant increase in 'CPU wait for GPU' even when my app is drawing only a small amount.
Also, when one OpenGL window partially obscures another OpenGL window, performance *really* drops. Not surprising, but bad for me.
~Keith