Mailing Lists: Apple Mailing Lists


Poor performances when using hierarchies of renderers



Hi,

We have set up a rendering framework based on Quartz Composer in which the image outputs generated by some compositions are used as inputs to other compositions, and so on.
We have one object, let's call it ImageProducer, that maintains a list of QCRenderers and asks them to render their compositions in a specific OpenGL context. Then we attach a CVOpenGLBufferRef (taken from a pool, as in the Performer example) to the GL context, call glFlush and get our rendered image. This image, along with some others calculated in parallel, is then sent as input to the QCRenderers of other ImageProducers, and so on.
We were hoping that working with CVOpenGLBufferRefs would keep all images in graphics card memory and so give optimal performance.
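
The data flow described above can be modelled schematically as follows. This is a plain-Python sketch, not the real code: it makes no Quartz Composer, Core Video or OpenGL calls, and `BufferPool`, `render`, `render_frame` and the three example producers are hypothetical names chosen for illustration. It only shows the shape of the hierarchy: each producer owns a pool, renders its upstream producers first, and hands its output buffer on as an input.

```python
class BufferPool:
    """Stand-in for a CVOpenGLBufferPool: hands out and recycles buffers."""

    def __init__(self):
        self.free = []
        self.created = 0  # how many buffers were ever allocated

    def acquire(self):
        if self.free:
            return self.free.pop()
        self.created += 1
        return {"id": self.created, "pixels": None}

    def release(self, buf):
        buf["pixels"] = None
        self.free.append(buf)


class ImageProducer:
    """Owns a list of renderers and one pool (one pool per producer)."""

    def __init__(self, name, renderers, inputs=()):
        self.name = name
        self.renderers = renderers  # callables: (time, input_images) -> image
        self.inputs = list(inputs)  # upstream ImageProducers
        self.pool = BufferPool()

    def render(self, time, cache):
        # Render each producer at most once per frame.
        if self.name not in cache:
            upstream = [p.render(time, cache) for p in self.inputs]
            images = [buf["pixels"] for buf in upstream]
            buf = self.pool.acquire()
            buf["pixels"] = [r(time, images) for r in self.renderers]
            cache[self.name] = (self, buf)
        return cache[self.name][1]


def render_frame(root, time):
    """Render the whole hierarchy, then return every buffer to its pool."""
    cache = {}
    root.render(time, cache)
    result = list(cache[root.name][1]["pixels"])
    for producer, buf in cache.values():
        producer.pool.release(buf)
    return result


# Two leaf producers feeding one downstream producer.
a = ImageProducer("a", [lambda t, ins: f"noise@{t}"])
b = ImageProducer("b", [lambda t, ins: f"gradient@{t}"])
c = ImageProducer(
    "c",
    [lambda t, ins: "blend(" + ", ".join(p[0] for p in ins) + ")"],
    inputs=[a, b],
)

print(render_frame(c, 0))
```

In this model each pool allocates once and then recycles its buffer on later frames; whether the real pools and contexts behave that way as the number of producers grows is exactly what is in question here.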


First results are quite promising with only a few producers, but when we increase their number, we get a sudden drop in performance from 60 fps to 5 fps.

I am asking for some hints on how to approach performance measurement in such a situation. Namely:
- how would you monitor what gets copied between main memory and the graphics card?
- how would you interpret the initial result given by Shark, listed below?


47.4% 47.4% mach_kernel ml_set_interrupts_enabled
0.0% 46.5% mach_kernel thread_block_reason
0.0% 46.5% mach_kernel thread_block
0.0% 46.5% mach_kernel clock_delay_until
0.0% 46.5% mach_kernel delay_for_interval
0.0% 46.5% mach_kernel IODelay
0.0% 46.5% com.apple.ATIRadeonX1000 ATIRadeonX1000::waitForTimeStamp (unsigned long)
0.0% 46.5% com.apple.ATIRadeonX1000 IOATIR500GLContext::clientMemoryForType(unsigned long, unsigned long*, IOMemoryDescriptor**)
0.0% 46.5% mach_kernel IOUserClient::mapClientMemory(unsigned long, task*, unsigned long, unsigned)
0.0% 46.5% mach_kernel is_io_connect_map_memory
0.0% 46.5% mach_kernel iokit_server_routine
0.0% 46.5% mach_kernel ipc_kobject_server
0.0% 46.5% mach_kernel mach_msg_overwrite_trap
0.0% 46.5% mach_kernel thread_set_user_ldt


It looks like a lot of time is spent waiting for a timestamp from the card driver. What does that mean?
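
That reading of the trace can be checked mechanically. The sketch below is a small Python helper, not part of Shark: it assumes the listing is ordered leaf-first (callers further down) and that the columns are self time, total time, library and symbol. The names `TRACE`, `parse_trace` and `first_driver_frame` are made up for this example.

```python
TRACE = """\
47.4% 47.4% mach_kernel ml_set_interrupts_enabled
0.0% 46.5% mach_kernel thread_block_reason
0.0% 46.5% mach_kernel thread_block
0.0% 46.5% mach_kernel clock_delay_until
0.0% 46.5% mach_kernel delay_for_interval
0.0% 46.5% mach_kernel IODelay
0.0% 46.5% com.apple.ATIRadeonX1000 ATIRadeonX1000::waitForTimeStamp (unsigned long)
0.0% 46.5% com.apple.ATIRadeonX1000 IOATIR500GLContext::clientMemoryForType(unsigned long, unsigned long*, IOMemoryDescriptor**)
0.0% 46.5% mach_kernel IOUserClient::mapClientMemory(unsigned long, task*, unsigned long, unsigned)
0.0% 46.5% mach_kernel is_io_connect_map_memory
0.0% 46.5% mach_kernel iokit_server_routine
0.0% 46.5% mach_kernel ipc_kobject_server
0.0% 46.5% mach_kernel mach_msg_overwrite_trap
0.0% 46.5% mach_kernel thread_set_user_ldt
"""


def parse_trace(text):
    """Split each line into self %, total %, library and symbol."""
    frames = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The symbol may contain spaces, so split at most three times.
        self_pct, total_pct, library, symbol = line.split(None, 3)
        frames.append({
            "self": float(self_pct.rstrip("%")),
            "total": float(total_pct.rstrip("%")),
            "library": library,
            "symbol": symbol,
        })
    return frames


def first_driver_frame(frames, kernel="mach_kernel"):
    """Walking from the leaf, return the first frame outside the kernel."""
    for frame in frames:
        if frame["library"] != kernel:
            return frame
    return None


frames = parse_trace(TRACE)
hit = first_driver_frame(frames)
print(hit["library"], hit["symbol"], hit["total"])
```

Under those assumptions, the first frame outside mach_kernel is the driver's waitForTimeStamp, reached from clientMemoryForType via is_io_connect_map_memory, and it accounts for 46.5% of the samples.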

Of course, I am also interested in any thoughts you might have regarding the problem.
- each ImageProducer has its own OpenGL context in which it renders its child compositions. I can't really see how to avoid that. Is there a problem with having many offline contexts in use at the same time? (Please note that our rendering is single-threaded.)
- each ImageProducer has its own CVOpenGLBufferPool. Maybe that's bad. Is it?
- anything else.


We know that we should get much better performance, because what we display using multiple compositions in a hierarchy can be done with a single composition, and that is very fast. So there is something wrong with the way we use OpenGL.

Any help and guidance would be much appreciated.


Best regards,

Matthieu
_______________________________________________
Quartzcomposer-dev mailing list      (email@hidden)


Copyright © 2007 Apple Inc. All rights reserved.