Re: Preroll semantics & clarification of kAudioUnitProperty_OfflineRender
- Subject: Re: Preroll semantics & clarification of kAudioUnitProperty_OfflineRender
- From: Heinrich Fink <email@hidden>
- Date: Fri, 25 Nov 2011 16:33:17 +0100
Hi Brian, Hi Paul,
> On Nov 24, 2011, at 01:48 , Brian Willoughby wrote:
> B) You assume that a non-real-time system can be turned into a real-time system with a mere 2-second pre-roll.
> On Nov 23, 2011, at 14:39 , Paul Davis wrote:
> if you have a design that can't run in realtime,
> then it can't run in realtime no matter how much preroll or other
> buffering you do. If you have a system that can run in realtime, then
> the amount of buffering you need depends on a variety of things.
Of course you are right: my statement that a preroll buffer can make a system incapable of running in real time work under real-time conditions was incorrect. Thank you both for the correction.
But let me explain what my actual point was:
> T = A*nframes + B + C
>
> where C is non-constant and not entirely under application control.
Using the above formulation, I simply meant that a larger preroll buffer could help a system operate in real time when its "C" varies more strongly over time. As I understand it, AudioUnits usually try to keep their "C" within tight limits (avoiding memory allocation, locking, etc.). Therefore, I concluded in my previous email that an additional preroll buffer at the output might be an unnecessary burden in my design. Either way, from a global point of view I assume that my system is capable of satisfying the general real-time constraint.
In other words, I would expect AudioUnits to operate under "short-term real-time" conditions: each render call must not take longer than the real-time duration of the requested samples. Other, less stable systems might only meet the real-time constraint over a longer period of time (e.g. over the span of a 2 sec preroll buffer).
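To make that "short-term" constraint concrete, here is a minimal sketch that times a single render call against its real-time budget. This is just an illustration, not code from my engine; MyMeasureRenderCall and myRenderBlock are hypothetical placeholders, and a real render proc would of course be driven by the output unit rather than called like this:

    /* Illustrative only: does one render call finish within the
     * real-time duration of the frames it produced? */
    #include <mach/mach_time.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static double HostTicksToSeconds(uint64_t ticks)
    {
        mach_timebase_info_data_t tb;
        mach_timebase_info(&tb);           /* nanoseconds per host tick */
        return (double)ticks * tb.numer / tb.denom / 1e9;
    }

    /* Returns true if rendering nframes took less wall-clock time than
     * the real-time duration of those frames (nframes / sampleRate). */
    static bool MyMeasureRenderCall(void (*myRenderBlock)(uint32_t),
                                    uint32_t nframes, double sampleRate)
    {
        uint64_t start = mach_absolute_time();
        myRenderBlock(nframes);            /* the actual graph render */
        uint64_t elapsed = mach_absolute_time() - start;

        double renderSeconds = HostTicksToSeconds(elapsed);
        double budgetSeconds = (double)nframes / sampleRate; /* 512/44100 ~ 11.6 ms */

        printf("render: %.3f ms, budget: %.3f ms\n",
               renderSeconds * 1e3, budgetSeconds * 1e3);
        return renderSeconds < budgetSeconds;
    }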
> On Nov 24, 2011, at 01:48 , Brian Willoughby wrote:
>
> Even a system capable of faster-than-real-time playback might briefly experience a hitch in data flow, and that's where the pre-roll helps. But every time a part of the 2-second pre-roll is used to make up for samples that aren't ready yet, there is the requirement that the system run faster than real-time long enough to re-fill the pre-roll buffer, otherwise things eventually fall apart.
This is exactly how the preroll strategy is currently realized for the video output of our playout software (prefill & catch up if necessary). As a side note: my job is to add an audio engine to a TV broadcasting application with an existing video processing pipeline.
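For clarity, this is roughly what I mean by "prefill & catch up", sketched for audio. Everything below is an assumption-laden toy (mono, single-threaded, a plain array standing in for a proper lock-free FIFO, silence standing in for the graph output); a real producer would call AudioUnitRender and share the buffer safely with the output thread:

    #include <stdint.h>
    #include <string.h>

    #define SAMPLE_RATE     48000               /* assumed output rate */
    #define PREROLL_FRAMES  (2 * SAMPLE_RATE)   /* 2 s preroll target */
    #define RENDER_QUANTUM  512                 /* frames per render call */

    /* Toy single-producer FIFO standing in for the real preroll buffer. */
    typedef struct {
        float    frames[PREROLL_FRAMES];
        uint32_t fill;                          /* frames currently buffered */
    } MyPrerollBuffer;

    /* Placeholder for pulling frames out of the audio unit graph;
     * a real implementation would call AudioUnitRender here. */
    static void my_render_frames(float *dst, uint32_t n)
    {
        memset(dst, 0, n * sizeof(float));      /* silence, for illustration */
    }

    /* Producer side: render faster than real time until the preroll
     * level is restored, then go idle until the consumer drains it. */
    static void my_top_up_preroll(MyPrerollBuffer *pb)
    {
        float scratch[RENDER_QUANTUM];
        while (pb->fill + RENDER_QUANTUM <= PREROLL_FRAMES) {
            my_render_frames(scratch, RENDER_QUANTUM);
            memcpy(&pb->frames[pb->fill], scratch, sizeof(scratch));
            pb->fill += RENDER_QUANTUM;
        }
    }

The catch-up loop is exactly where the faster-than-real-time requirement Brian mentioned comes in: if my_render_frames can't outrun the consumer, the buffer never refills.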
My concern is to basically decide between the following two approaches:
A] Couple audio processing and output with the existing video processing pipeline.
B] Use a separate path for audio processing and output.
Both scenarios should use audio unit graphs as the underlying audio rendering path.
The first scenario (A) has the advantage of using the same API for video+audio output (for example the native SDK of a video broadcasting device). That makes synchronization and scheduling a bit easier. On the other hand, the current strategy of filling the preroll buffer with audio data results in all the awkward cases we have discussed in the previous posts of this thread: strongly varying intervals between render calls, possible audio glitches while catching up with the preroll buffer, and so forth. Of course one could add yet another buffer to further decouple audio rendering from filling the preroll buffer, but this would further increase latency and make things even more complicated.
The second scenario (B) would use the AUHal directly as the audio output path. This approach would definitely require more careful thinking about audio/video synchronization, since the two output paths would then go through different APIs. However, it would avoid switching between faster-than-real-time and real-time rendering on the output side.
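For reference, scenario B boils down to the standard output-unit setup; a minimal sketch (error handling omitted, MyRenderCallback is a placeholder for the application's render proc, and I'm using the default output unit here rather than configuring the raw HAL unit):

    #include <AudioUnit/AudioUnit.h>

    extern OSStatus MyRenderCallback(void *inRefCon,
                                     AudioUnitRenderActionFlags *ioActionFlags,
                                     const AudioTimeStamp *inTimeStamp,
                                     UInt32 inBusNumber, UInt32 inNumberFrames,
                                     AudioBufferList *ioData);

    static AudioUnit MyCreateOutputUnit(void)
    {
        AudioComponentDescription desc = {
            .componentType         = kAudioUnitType_Output,
            .componentSubType      = kAudioUnitSubType_DefaultOutput,
            .componentManufacturer = kAudioUnitManufacturer_Apple
        };
        AudioComponent comp = AudioComponentFindNext(NULL, &desc);

        AudioUnit outputUnit;
        AudioComponentInstanceNew(comp, &outputUnit);

        AURenderCallbackStruct cb = { MyRenderCallback, NULL };
        AudioUnitSetProperty(outputUnit, kAudioUnitProperty_SetRenderCallback,
                             kAudioUnitScope_Input, 0, &cb, sizeof(cb));

        AudioUnitInitialize(outputUnit);
        AudioOutputUnitStart(outputUnit);  /* the HAL now pulls in real time */
        return outputUnit;
    }

The point is that once the HAL is pulling, the graph is only ever asked to render in real time, so the faster-than-real-time catch-up mode disappears from the audio path entirely.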
Both approaches meet the real-time constraint in general. From a more general point of view, I guess the question is just at which end of the pipeline you implement safety buffers. Audio graphs seem to follow the strategy of buffering as little as possible at the output side and installing additional safety buffers only where necessary (e.g. any audio unit that reads from disk needs its own buffering as well). The other strategy is to put a larger buffer at the output of your pipeline. In this case you can't really isolate the parts of the pipeline that present a risk to the (short-term) real-time constraint, so it's safer to just buffer everything at the output. One could argue that the latter approach is symptomatic of a less predictable or less carefully designed system. In theory, of course, you should only buffer when and where necessary.
> Why not just instruct your DJs to start the pre-roll process more than 2 seconds before air time?
I thought of that, too. But some user actions should have almost instantaneous consequences in the output signal. Think, for example, of assembling the program for a TV channel in a live situation, e.g. modifying a parameter in an overlay graphic. With a not-faster-than-real-time preroll buffer, the consequence of this parameter change would not be visible until 2 seconds later. On the other hand, when live input is necessary, a faster-than-real-time feed simply isn't possible; then you would have no choice other than delaying the output by the real-time duration of the preroll buffer.
> I believe that if you adjust your expectations for reality, then you might have an easier time implementing your code.
Yes, I think that my current conclusion of decoupling audio rendering (and, in consequence, audio output as well) from the current video path is a realistic and reasonable approach. Adapting audio rendering to the video buffer strategy just isn't worth the additional risk of bad behavior within the audio graph. I am confident that a well-behaved audio graph that plays through the AUHal, with its slimmer buffering, provides a sufficient quality of service for our scenario. Would you agree with my conclusions?
best regards,
Heinrich Fink
p.s.: I really enjoy this discussion! Your comments make me think about many important aspects I might have missed without your help.