Voice Processing echo cancellation operates at 22kHz?
- Subject: Voice Processing echo cancellation operates at 22kHz?
- From: Miriam Zimmerman via Coreaudio-api <email@hidden>
- Date: Thu, 14 Nov 2024 10:20:53 -0800
(Apologies if this posts twice; I sent it yesterday from the wrong address
and I'm not sure if it went through -- I don't see it on the list archives
yet, anyway.)
I have been chasing a bug that affects, at a minimum, both Firefox (and
its audio processing library) and Safari.
Namely, when using a voice processing I/O unit with bypass off, the echo
canceller appears to operate at 24 kHz (despite a requested rate of
44.1 kHz or 48 kHz), which means that any audio above roughly 12 kHz, the
Nyquist frequency at that rate, must be filtered out to avoid aliasing.
The result is distinctly "thin"-sounding audio, because the upper
harmonics are missing.
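As a quick sanity check on those numbers, here is a minimal Python sketch of the arithmetic. The function names are my own, purely for illustration; nothing here touches Core Audio.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2.0


def bandwidth_kept(requested_rate_hz, internal_rate_hz):
    """Fraction of the requested bandwidth that survives a pass through a
    stage running at internal_rate_hz (assumes ideal resampling)."""
    kept = min(nyquist_hz(internal_rate_hz), nyquist_hz(requested_rate_hz))
    return kept / nyquist_hz(requested_rate_hz)


for requested in (44_100, 48_000):
    print(
        f"requested {requested} Hz: Nyquist {nyquist_hz(requested):.0f} Hz, "
        f"kept after a 24 kHz stage: {bandwidth_kept(requested, 24_000):.0%}"
    )
```

In other words, roughly half of the requested bandwidth is lost at either requested rate if the processing really does run at 24 kHz.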
To reproduce:
1. Install a virtual audio device such as BlackHole.
2. Open Safari to
https://webrtc.github.io/samples/src/content/devices/input-output/
3. Grant microphone permissions, and set the input device to either the
built-in microphone or a 3.5mm headset.
4. Set the system default output device to BlackHole in System Settings.
5. Start recording from the BlackHole device in Audacity (or similar).
6. Speak into the microphone long enough to generate a meaningful
sample, or play a prerecorded voice clip.
7. Stop recording.
8. Analyze the spectrum.
In my tests, this results in a spectrum like this:
[image: cut off spectrum.jpg]
Recording the same input track directly from the input device in
Audacity, by contrast, results in a much fuller spectrum:
[image: full spectrum.jpg]
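For anyone who wants to compare the two recordings numerically rather than by eye, the following is a rough Python sketch using only the standard library. It assumes the recordings were exported as 16-bit PCM WAV files; all function names are hypothetical helpers, and the Goertzel probe is my own choice of technique here, not what Audacity does internally. The demo at the bottom runs on synthetic tones rather than real recordings.

```python
import math
import struct
import wave


def read_wav_first_channel(path):
    """Read a 16-bit PCM WAV file; return (samples in [-1, 1), sample_rate_hz)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [v / 32768.0 for v in values[::channels]], rate


def goertzel_power(samples, sample_rate_hz, freq_hz):
    """Squared DFT magnitude at the bin nearest freq_hz (Goertzel algorithm)."""
    n = len(samples)
    if n == 0:
        return 0.0
    k = round(n * freq_hz / sample_rate_hz)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s_prev, s_prev2 = x + coeff * s_prev - s_prev2, s_prev
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def high_band_ratio(samples, sample_rate_hz, cutoff_hz=12_000.0, step_hz=500.0):
    """Fraction of the probed power that lies above cutoff_hz."""
    low = high = 0.0
    freq = step_hz
    while freq < sample_rate_hz / 2.0:
        power = goertzel_power(samples, sample_rate_hz, freq)
        if freq > cutoff_hz:
            high += power
        else:
            low += power
        freq += step_hz
    total = low + high
    return high / total if total else 0.0


def tone_mix(freqs_hz, sample_rate_hz=48_000, count=4_800):
    """Synthetic signal: equal-amplitude sine tones summed together."""
    return [
        sum(math.sin(2.0 * math.pi * f * i / sample_rate_hz) for f in freqs_hz)
        for i in range(count)
    ]


# Synthetic stand-ins for the two recordings: one with content above
# 12 kHz, one without.
print(f"full-band:    {high_band_ratio(tone_mix([1_000, 15_000]), 48_000):.3f}")
print(f"band-limited: {high_band_ratio(tone_mix([1_000]), 48_000):.3f}")
```

With real recordings, one would pass a short slice (a few thousand samples) of the output of read_wav_first_channel to high_band_ratio, since a pure-Python loop over a long file is slow. A ratio near zero for the voice-processed capture, and a clearly nonzero one for the direct capture, would match the spectra above.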
The reason I suspect echo cancellation specifically is that the system logs
in Console.app show the following when I run a minimal test app to
reproduce the issue:
default 14:46:56.143925-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) content input 'mic' format is 1 ch, 44100 Hz, Float32
default 14:46:56.143963-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo I/O block size is 240
default 14:46:56.144001-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo I/O sample rate is 24000.000000
default 14:46:56.144043-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo configuration is {".austrip":["Generic/V1/VPVX/unkn-unkn-misc-ulnk.austrip"],".dspg":"DSP/UL/uplink_echo.dspg",".propstrip":["DSP/UL/uplink_echo.propstrip"],".propstrip (override)":null}
default 14:46:56.144078-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo input 'mic' format is 3 ch, 24000 Hz, Float32, deinterleaved
default 14:46:56.144112-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo input 'mic_clip_data' format is 3 ch, 24000 Hz, Float32, deinterleaved
default 14:46:56.144145-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo input 'ref' format is 2 ch, 24000 Hz, Float32, deinterleaved
default 14:46:56.144179-0800 test-coreaudio-rs [vp::vx::Voice_Processor:0x13d009600] (UL) echo output 'mic' format is 1 ch, 24000 Hz, Float32
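To pull the relevant numbers out of a log like the one above without reading it line by line, something along these lines may help. It is only a sketch: the regular expressions are inferred from the handful of messages quoted in this email, not from any documented log format, and the helper name is made up.

```python
import re

# Message bodies copied from the log excerpt above (prefixes dropped).
LOG = """\
(UL) content input 'mic' format is 1 ch, 44100 Hz, Float32
(UL) echo I/O block size is 240
(UL) echo I/O sample rate is 24000.000000
(UL) echo input 'mic' format is 3 ch, 24000 Hz, Float32, deinterleaved
(UL) echo input 'ref' format is 2 ch, 24000 Hz, Float32, deinterleaved
(UL) echo output 'mic' format is 1 ch, 24000 Hz, Float32
"""

FORMAT_RE = re.compile(
    r"(?P<stage>content|echo) (?P<direction>input|output) '(?P<name>[^']+)' "
    r"format is (?P<channels>\d+) ch, (?P<rate>\d+) Hz"
)
RATE_RE = re.compile(r"echo I/O sample rate is (?P<rate>\d+(?:\.\d+)?)")
BLOCK_RE = re.compile(r"echo I/O block size is (?P<size>\d+)")


def summarize(log_text):
    """Collect stream formats plus the echo stage's sample rate and block size."""
    summary = {"formats": {}, "echo_rate_hz": None, "echo_block_size": None}
    for line in log_text.splitlines():
        match = FORMAT_RE.search(line)
        if match:
            key = (match["stage"], match["direction"], match["name"])
            summary["formats"][key] = (int(match["channels"]), int(match["rate"]))
            continue
        match = RATE_RE.search(line)
        if match:
            summary["echo_rate_hz"] = float(match["rate"])
            continue
        match = BLOCK_RE.search(line)
        if match:
            summary["echo_block_size"] = int(match["size"])
    return summary


summary = summarize(LOG)
block_ms = 1000.0 * summary["echo_block_size"] / summary["echo_rate_hz"]
print("content input rate:", summary["formats"][("content", "input", "mic")][1])
print("echo stage rate:   ", summary["echo_rate_hz"])
print("echo block length: ", block_ms, "ms")
```

The block size of 240 frames at 24000 Hz works out to 10 ms per block, which at least looks self-consistent with the echo stage really running at 24 kHz while the content input is at 44.1 kHz.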
Further discussion here:
https://github.com/mozilla/cubeb-coreaudio-rs/issues/242
Is there any programmatic way to configure the AEC sample rate? Failing
that, is this a bug?
If there are any Apple audio engineers on the list, I would appreciate your
thoughts here, too.
Thanks!
Miriam Zimmerman