Howdy,
This topic has come up on a few occasions before, but I have yet to find a reliable method for determining where my output signal will appear in a recording using CoreAudio on OS X.
Here's what I'm doing:
1. I start an AUHAL unit for INPUT, recording to an audio file.
2. In the first call to the INPUT IOProc for the AUHAL, I store the inTimestamp value.
3. I start an AUHAL unit for OUTPUT, sending a pulse train over the wire with the first pulse at sample index 0.
4. In the first call to the OUTPUT AUHAL, I store the inTimestamp value.
5. The signal reaches the end of playback, and I calculate the delay before shutting the two AUHAL units down.
What I need to figure out is where my first pulse will appear in the INPUT audio file (which happens to be a CAF audio file with PCM data in the native data format).
Below is how I attempted to determine this, where recordStartTimeStamp and playbackStartTimeStamp are what you'd expect, and I take some liberties in syntax to get my point across:
// Offset the start timestamp by the presentation latency for the playback
// hardware, and convert the sample time to host time using the playback device.
playbackStartTimeStamp.mSampleTime += playbackDevice.kAudioDevicePropertyLatency.outputDirection;
playbackStartTimeStamp.mFlags = kAudioTimeStampSampleTimeValid | kAudioTimeStampRateScalarValid;
playbackStartTimeStamp = translate_to_host_time( playbackDevice, playbackStartTimeStamp );

// Convert the host timestamp to sample time on the recording device
playbackStartTimeStamp.mFlags = kAudioTimeStampHostTimeValid | kAudioTimeStampRateScalarValid;
playbackStartTimeStamp = translate_to_sample_time( recordingDevice, playbackStartTimeStamp ); // Convert the host time to sample time, using the record device's clock

// Offset the start timestamp again by the latency for the recording hardware
playbackStartTimeStamp.mSampleTime += recordingDevice.kAudioDevicePropertyLatency.inputDirection;
Here are some notes on the above:
* I do factor in stream latencies in the calculations in my code. They are all zero, however.
* The AUHALs are vending HAL timestamps by setting kAudioOutputUnitProperty_StartTimestampsAtZero to false.
* I assert the HostTimeValid | RateScalarValid above for illustration's sake. This is asserted in the translation methods.
* I have previously tried also using kAudioDevicePropertySafetyOffset in my calculations, but the calculated delay was far greater than reality. I presume AUHAL's stamps are already accounting for this.
* The audio devices are still running at the time of conversion.
Now the first pulse should be ( playbackStartTimeStamp.mSampleTime - recordStartTimeStamp.mSampleTime ) samples into the recording, correct?
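For the single-device case, the arithmetic above can be written out as a small, self-contained C sketch. It deliberately avoids the CoreAudio headers: the struct LoopbackTiming and the function predicted_pulse_index are hypothetical names used only for illustration. It assumes that with one device the round trip through host time is a no-op, and it leaves the safety offset out, as in the calculation above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values gathered in the steps above.
 * All latencies are in sample frames. */
typedef struct {
    double   playbackStartSampleTime;   /* mSampleTime from the first OUTPUT callback */
    double   recordStartSampleTime;     /* mSampleTime from the first INPUT callback  */
    uint32_t outputLatencyFrames;       /* kAudioDevicePropertyLatency, output side   */
    uint32_t inputLatencyFrames;        /* kAudioDevicePropertyLatency, input side    */
    uint32_t outputStreamLatencyFrames; /* stream latency, output side (zero here)    */
    uint32_t inputStreamLatencyFrames;  /* stream latency, input side (zero here)     */
} LoopbackTiming;

/* Predicted index of the first pulse in the recording.
 * Assumption: one device, so no clock translation is needed and the
 * playback and record sample times share the same timeline. */
static double predicted_pulse_index(const LoopbackTiming *t)
{
    assert(t != 0);
    double arrival = t->playbackStartSampleTime
                   + t->outputLatencyFrames + t->outputStreamLatencyFrames
                   + t->inputLatencyFrames  + t->inputStreamLatencyFrames;
    return arrival - t->recordStartSampleTime;
}
```

With made-up values of 48512 for the playback start, 44032 for the record start, and latencies of 32 (output) and 40 (input) frames, the function returns 4552.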
Unfortunately, I am not finding this to be the case when using a single device for both input and output. With one device (a 1st gen MOTU Traveler) the pulse is another 2 samples away. With another device (RME Fireface UFX) I am noticing the pulse is quite a bit earlier (39 samples) than the calculated delay as above.
So here are my questions:
1. In the past, it was suggested that the kAudioDevicePropertySafetyOffset was to also be included in latency calculations, but that was for direct HAL coding, and not AUHAL. Is the safety offset to be ignored in AUHAL timestamps? One message seems to indicate this is the case:
But another message indicates otherwise:
So what's the final word on AUHAL timestamp calculations/meanings? I understand the timestamp parameter in the AUHAL IOProc to mean "the time a sample will hit the device" in the output context, and "the time a sample arrived at the device" in the input context. Thus, presentation latency is representing the transit time to get that sample out to (or in from) the wire.
2. When converting to/from host time using AudioDeviceTranslateTime, is it better to use a cleared-out AudioTimeStamp struct, or is it OK that I am simply setting the flags on a pre-filled struct?
3. Is my math correct on the delays? I'll admit that it's quite an exercise to wrap your head around these values when timestamps appear to occur in the past on the record side, and in the future on the playback side...
4. Is my goal even attainable? Is it reliable to take the reported values from the drivers and use them to determine the transit time of a signal from the output of one device to the input side of another device? I'm hopeful this is the case, because my own loopback testing is off by a constant number of samples using my calculation above, and I do recall a message where someone (Bill or Jeff -- can't remember) indicated that the system is deterministic in this sense.
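One way to check the "constant number of samples" observation is to compare predicted and measured pulse positions over several loopback runs. The sketch below is plain C with a hypothetical helper name (constant_residual); any positions fed to it in an example are made up, and only the size of the discrepancy (39 samples early on the Fireface) comes from the measurements described above.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 and stores the residual (measured - predicted) if it is the
 * same for every run; returns 0 if the residual varies between runs. */
static int constant_residual(const long *measured, const long *predicted,
                             size_t runs, long *residual)
{
    assert(measured && predicted && residual && runs > 0);
    long first = measured[0] - predicted[0];
    for (size_t i = 1; i < runs; ++i) {
        if (measured[i] - predicted[i] != first) {
            return 0; /* residual is not constant across runs */
        }
    }
    *residual = first;
    return 1;
}
```

If three runs predicted positions 4552, 9120 and 13001 and the pulse was found at 4513, 9081 and 12962, the helper reports a constant residual of -39, which could then be treated as a per-device correction.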
Thanks in advance,
Chris Liscio