Re: How does Core Audio handle heterogeneous sample rates?
- Subject: Re: How does Core Audio handle heterogeneous sample rates?
- From: Brian Willoughby <email@hidden>
- Date: Sun, 5 Nov 2017 18:25:42 -0800
Most applications, including iTunes, only send audio to a single device.
Therefore, only one sample rate is handled. However, if that application mixes
multiple audio files at different sample rates, then there will be sample rate
conversion. It is up to the application developer to decide on the type of
sample rate conversion and, while CoreAudio provides many options, there’s no
rule that prevents a developer from using non-CoreAudio solutions. In addition,
some applications have problems when a single audio file’s sample rate does not
match the device sample rate, even in those situations where there isn’t a
second sample rate on input.
The number of applications which can open multiple audio devices is small
compared to the overall macOS market, but if you’re in a modern recording
studio then you might actually see quite a few applications that support this.
Companies like Mark of the Unicorn (MOTU) have developed their own features to
support multiple audio devices. Their Digital Performer (DP) product allows
users to connect input and output from multiple pieces of audio hardware. MOTU
DP handles all of the sample rate conversion within the application, although
it is unclear (to me, at least) to what degree they leverage CoreAudio SRC
versus their own SRC. It is also unclear whether they support audio interfaces
besides their own brand (they probably do, thanks to CoreAudio drivers).
Independent of applications, macOS supports aggregate devices within the
CoreAudio subsystem. These Aggregate Devices are created and configured by the
user and then presented transparently to the application as if they were a
single, physical device. In other words, the application code is unaware that
the aggregate device is not a single hardware device, and thus there is no
special code within the app. The answer to your questions in this case is that
a CoreAudio aggregate device has a master audio device which does not incur
sample rate conversion, while all other audio devices that belong to the
aggregate have some amount of automatic SRC, as needed. There are sometimes
issues when the user incorrectly or incompletely configures an aggregate
device, so it’s important to be aware that it’s not a fool-proof option.
CoreAudio itself offers a number of building blocks to application developers,
many of which have automatic SRC. The default output audio device AudioUnit and
the HAL output device AudioUnit both have optional SRC support. There is an
AudioStreamBasicDescription (ASBD) for both input and output scopes of those
AudioUnits, and if the sample rate does not match then SRC will be performed.
The ExtAudioFile API allows nearly all audio file formats to be opened and
decoded, and again there is both an input and output ASBD such that SRC may be
performed, as needed. In addition to these handy tools for audio device I/O and
audio file I/O, CoreAudio also offers the same SRC feature as AUConverter, an
independent AudioUnit that can convert between sample rates and/or sample
formats. This AudioUnit allows control of the SRC quality, including some of
the best SRC in the industry (see http://src.infinitewave.ca for
details). When directly hosting an AUConverter, application developers have the
ability to control the level of quality, and thus control the tradeoff between
CPU usage and audio quality. Although AUConverter is used by CoreAudio in many
places (default output device, ExtAudioFile, aggregate devices), it is not
always possible for the user to control the quality in an immediately
accessible fashion. In most of those situations, the SRC quality is a bit lower
than best-in-the-industry, probably because Apple's assumption is that most
macOS users care more about CPU availability than ultimate audio quality. In
response, there is a small market for carefully-coded macOS applications that
specifically set the SRC quality to maximum, but it is rather difficult to
discern what is happening behind the scenes for any given app.
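
As a rough illustration of the ASBD mismatch described above: when the input and output sample rates differ, the converter has to work at the ratio of the two rates. The sketch below is plain Python arithmetic, not CoreAudio code. The function names are hypothetical, and a real converter's frame counts also depend on filter latency and priming, which this ignores.

```python
from fractions import Fraction


def src_ratio(input_rate, output_rate):
    """Exact output/input sample rate ratio, reduced to lowest terms."""
    return Fraction(output_rate, input_rate)


def output_frames(input_frames, input_rate, output_rate):
    """Nominal number of whole output frames for a given number of input frames."""
    return (input_frames * output_rate) // input_rate


print(src_ratio(44100, 48000))              # 160/147
print(output_frames(44100, 44100, 48000))   # 48000 (one second in, one second out)
print(output_frames(512, 44100, 48000))     # 557
```

For 44.1 kHz to 48 kHz the ratio is 160/147, so buffer sizes never map one to one: a 512-frame input buffer corresponds to roughly 557 output frames, with a fractional remainder carried into the next buffer.
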
The short answer is that CoreAudio does not directly handle producers with
different sample rates. Instead, the application developer and/or the user must
configure various options available in CoreAudio to control this. As far as I
know, there is no completely automatic handling, because someone (either
developer or user) must choose the mechanism that controls this.
As for your challenge to pick a rate that won’t incur resampling, it’s up to
you as a developer to design the graph within your application that handles
audio. Whether you use the CoreAudio AUGraph API or build your own audio flow,
it’s still up to your code to query the outside sample rates (such as the
output device or the input file) and decide how to manage mismatches. You could
query everything and then choose the highest sample rate to preserve maximum
quality, but that would use more CPU. You could also choose the lowest sample
rate and suffer the loss of quality. When the sample rates are fairly close,
such as 44.1 and 48 kHz, there might not be much quality difference, but your
app will still need a reasonable amount of intelligence to automatically choose
the overall sample rate that involves the least amount of SRC (if your goal is
to avoid SRC as much as possible). CoreAudio does not make any of these
decisions automatically for you. There are default sample rates, but they are
just defaults, not optimizations based on all of your application’s resources.
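
The "least amount of SRC" choice described above can be sketched as a small selection function. This is only an illustration in plain Python under stated assumptions: the endpoint names and rates are made up, the rates are assumed to have been queried already (from the device and files) by whatever API the application uses, and "least SRC" is taken to mean "fewest endpoints that need conversion". Ties go to the higher rate, which is one possible policy, not the only one.

```python
from collections import Counter


def choose_graph_rate(endpoint_rates):
    """Pick the rate shared by the most endpoints; ties go to the higher rate.

    endpoint_rates maps a (hypothetical) endpoint name to its sample rate in Hz.
    """
    if not endpoint_rates:
        raise ValueError("need at least one endpoint rate")
    counts = Counter(endpoint_rates.values())
    return max(counts, key=lambda rate: (counts[rate], rate))


def endpoints_needing_src(endpoint_rates, graph_rate):
    """Names of the endpoints whose rate differs from the chosen graph rate."""
    return [name for name, rate in endpoint_rates.items() if rate != graph_rate]


# Made-up example: two 44.1 kHz files feeding a 48 kHz output device.
endpoints = {"output device": 48000, "file A": 44100, "file B": 44100}
rate = choose_graph_rate(endpoints)
print(rate)                                    # 44100
print(endpoints_needing_src(endpoints, rate))  # ['output device']
```

Other policies mentioned above, such as always taking the highest or lowest rate, would replace the `max` key with `max(rates)` or `min(rates)`; which one is right depends on whether CPU usage or quality matters more to the application.
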
As for your linear phase question, it appears that all of Apple’s SRC options
are linear phase. Look at the src.infinitewave.ca site for Afconvert, Apple
CoreAudio, and Logic. You can examine many aspects of the SRC performance. Note
that Afconvert (bats) is the highest quality available from Apple, although the
test results on that site may represent older macOS releases than the one
you’re working on.
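
For context on what "linear phase" means in practice: an FIR filter has linear phase when its coefficients are symmetric (or antisymmetric) about the center, and it then delays every frequency by the same (N - 1) / 2 samples. The sketch below checks that property for a list of taps. The tap values are made up for illustration and are not Apple's coefficients, and the function names are hypothetical.

```python
def is_linear_phase(taps, tol=1e-12):
    """True if the FIR taps are symmetric or antisymmetric about the center."""
    n = len(taps)
    if n == 0:
        return False
    symmetric = all(
        abs(taps[i] - taps[n - 1 - i]) <= tol for i in range(n // 2)
    )
    # For odd lengths the range includes the middle tap, which must be zero.
    antisymmetric = all(
        abs(taps[i] + taps[n - 1 - i]) <= tol for i in range((n + 1) // 2)
    )
    return symmetric or antisymmetric


def group_delay_samples(taps):
    """Constant delay, in samples, of a linear-phase FIR with these taps."""
    return (len(taps) - 1) / 2


print(is_linear_phase([0.25, 0.5, 0.25]))      # True  (symmetric)
print(is_linear_phase([1.0, 0.0, -1.0]))       # True  (antisymmetric)
print(is_linear_phase([0.5, 0.3, 0.2]))        # False
print(group_delay_samples([0.25, 0.5, 0.25]))  # 1.0
```

A constant group delay shifts the signal in time without smearing its shape, which is usually the property that matters when a modem signal passes through a resampler.
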
A little more information on what you’re trying to build, and whether you’re on
macOS or iOS, would help narrow down the many possibilities.
Finally, I don’t think you’re rambling. I’ve put a lot of time into developing
tools that allow me to verify the bit-perfect accuracy of various macOS
software. I do music mastering, and prefer to ensure that there is no SRC in my
monitoring path unless it is part of the mastering decision process. It can be
difficult to be certain, but it is still quite possible to confirm that SRC is
not occurring with specific software and hardware setups.
Brian Willoughby
Sound Consulting
On Nov 5, 2017, at 5:18 PM, Brian Armstrong <email@hidden> wrote:
> I was curious how Core Audio handles producers with different sample rates,
> let's say 44,100 Hz and 48,000 Hz. I guess this may also be a question about
> sound cards. My suspicion is that one consumer must have to face resampling
> on the way out. If that's the case, I'm curious what the resampling looks
> like, and if it has linear phase.
>
> I'm trying to nail down some quirks that might be sample rate related in a
> modem library I work on. If there is resampling that occurs, seems my best
> bet is to pick a rate that won't incur resampling, whatever that rate might
> be? But I'm not sure how I would get that rate.
>
> Sorry for the rambling here :)
> -Brian