Re: Poor performance on software rendering generator
- Subject: Re: Poor performance on software rendering generator
- From: Pierre Jasmin <email@hidden>
- Date: Mon, 08 Aug 2011 16:07:53 -0700
- Organization: RE:Vision Effects
So here's what happens. If your plug-in says it can render
using hardware and Motion is rendering in hardware, everything
stays on the GPU. Likewise, if your plug-in says it can render
in software and Motion is rendering in software (this only
happens when it's inside a Motion template that's background
rendering for FCPX, I believe), everything stays in software.
If your plug-in says it can render only one way, and the app is
using the opposite method, we have to move the frame between
the GPU and CPU. This should all be working correctly for the
simple case of a single generator in the timeline.
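In other words, the host transfers only when the plug-in's declared
capabilities and the current pipeline disagree. A minimal sketch of
that decision, with made-up names (this is not the actual
Motion/FxPlug interface):

/* Hypothetical sketch of the host-side decision described above.
 * None of these names come from the real Motion/FxPlug internals. */
typedef enum { RENDER_CPU, RENDER_GPU } RenderPath;

typedef struct {
    int supports_cpu;   /* plug-in declared a software render path */
    int supports_gpu;   /* plug-in declared a hardware render path */
} PluginCaps;

RenderPath choose_path(PluginCaps caps, RenderPath host_path)
{
    /* If the plug-in can render the way the host is already
     * rendering, the frame never leaves that side. */
    if (host_path == RENDER_GPU && caps.supports_gpu) return RENDER_GPU;
    if (host_path == RENDER_CPU && caps.supports_cpu) return RENDER_CPU;

    /* Otherwise the host must transfer the frame to the side the
     * plug-in supports, then transfer the result back afterwards. */
    return caps.supports_gpu ? RENDER_GPU : RENDER_CPU;
}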
* It's good that Motion tools can do both CPU and GPU. In theory that
means FCP-X could ask the Motion engine (or eventually a native
implementation) to execute a render on the CPU, or let one flip what
counts as foreground, perhaps with a UI option for it. The FCP-X
scheme of rendering on the CPU in the background and on the GPU in
the foreground has some value, but co-processing could be more
powerful and useful. For example, if asking to get a Bitmap could
force execution on a full CPU render thread (all the way down to
fetching the frame from disk), we would already be closer to
something.

There are probably different ways to skin this cat, and we would be
happy to review them with whoever implements support for this on the
FCP-X side. I understand the first priority for a dot release is to
get this template indirection working (nothing works for us even as a
technical delivery, since we don't currently have any FxPlug tool
that uses only the main input).

For now, consider a tool like our DE:Noise, which reads three frames
(-1, current, +1) for every frame it renders. If that forces three
full-float buffer-to-RAM conversions per frame, that can already be
3 GB/s of transfer at 1080 30P just to get the images (checked in the
sketch below), assuming it doesn't also trigger a useless "we do no
caching here" re-fetch of the frames from disk on top.

Alternatively, if it were even just possible to execute in the FCP-X
background render process and lie to Motion so that only the rendered
output has to be copied back to the graphics card (and, if need be,
declare ourselves generators or something to avoid any GPU action
prior to us), that could help. All this assumes other parts don't
want to play as well, for example AVFoundation or something ending up
without the image on the GPU after decoding, etc.
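The ~3 GB/s figure checks out for full-float RGBA at 1080p and
30 fps; a quick back-of-the-envelope check in plain C:

#include <stdio.h>

/* Back-of-the-envelope check of the ~3 GB/s figure: three full-float
 * RGBA 1080p frames transferred per rendered frame at 30 fps. */
int main(void)
{
    const double width  = 1920, height = 1080;
    const double bytes_per_pixel = 4 /* channels */ * 4 /* bytes per 32-bit float */;
    const double frames_fetched  = 3;   /* -1, current, +1 */
    const double fps = 30;

    double per_frame  = width * height * bytes_per_pixel;   /* bytes per frame */
    double per_second = per_frame * frames_fetched * fps;   /* bytes per second */

    printf("%.2f MB per frame, %.2f GB/s total\n",
           per_frame / 1e6, per_second / 1e9);   /* ~33.18 MB, ~2.99 GB/s */
    return 0;
}

Half-float buffers would halve that, but it is still a lot of bus
traffic just to feed one effect.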
For the more complex case where there are multiple effects
some of which are CPU-only and some of which are GPU-only, we
very likely do the transfer more often than we need to. For
example, if the app is rendering on the GPU and you have a
GPU-only generator with 2 CPU-only filters, we'll render the
generator on the GPU, then download the output to the CPU,
apply the first filter, upload the result to the GPU, then
turn right around and download the result back to the CPU to
apply the next filter, and then upload the result to the GPU.
This is obviously not ideal. I hope we can fix it in the
future. Please file a bug if you'd like to see it fixed, too.
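In OpenGL terms, each CPU-only stage in a GPU pipeline costs a round
trip like the following (an illustration only; apply_cpu_filter is a
hypothetical stand-in for the plug-in's software render entry point,
and the real host plumbing is not public):

#include <OpenGL/gl.h>   /* macOS OpenGL header */

/* One GPU->CPU->GPU round trip per CPU-only filter, as described
 * above. With two CPU-only filters in a row, this pair of transfers
 * currently happens twice instead of once. */
void run_cpu_filter_in_gpu_pipeline(GLuint tex, int w, int h, float *scratch,
                                    void (*apply_cpu_filter)(float *, int, int))
{
    /* Download: read the current GPU result into host memory. */
    glBindTexture(GL_TEXTURE_2D, tex);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, scratch);

    /* Render in software. */
    apply_cpu_filter(scratch, w, h);

    /* Upload: push the filtered pixels back to the GPU so the next
     * GPU stage can consume them. */
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGBA, GL_FLOAT, scratch);
}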
The obvious thing you can do to help your own product is to
make your filter or generator work in both scenarios. It's
more work for you, but it's optimal for your users regardless
of what changes we make in the future.
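A sketch of what "work in both scenarios" amounts to, using an
invented C-style interface (the real FxPlug API is Objective-C
protocols, so take the shape rather than the names):

/* Invented interface for illustration; not the FxPlug API. A plug-in
 * that fills in both entry points lets the host run it on whichever
 * side the frame already lives, so no transfer happens on its
 * behalf. */
typedef struct {
    void (*render_cpu)(float *rgba, int width, int height);       /* software path */
    void (*render_gpu)(unsigned int src_tex, unsigned int dst_tex,
                       int width, int height);                     /* hardware path */
} EffectRenderers;

static void my_render_cpu(float *rgba, int width, int height)
{
    /* ... process the host-memory buffer directly ... */
}

static void my_render_gpu(unsigned int src_tex, unsigned int dst_tex,
                          int width, int height)
{
    /* ... draw from src_tex into dst_tex, staying on the GPU ... */
}

/* Declaring both paths means Motion's GPU pipeline and FCPX's
 * background (software) pipeline can each use the effect natively. */
static const EffectRenderers kMyEffect = { my_render_cpu, my_render_gpu };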
Thanks!
Darrin