Reduced storage for speech quality audio
- Subject: Reduced storage for speech quality audio
- From: Brian Willoughby <email@hidden>
- Date: Fri, 1 Feb 2008 06:29:45 -0800
Hi Paul,
I took the liberty of changing the subject, since your question
deviates somewhat from the original questions. You deserve your own
thread here. ;-)
You could save a great deal of storage by using the AudioConverter
API to reduce the audio to as little as 16-bit / 8 kHz, if you are
happy with telephone-quality speech. You would also cut your memory
requirements in half by storing monophonic speech instead of stereo.
During playback, you would simply allow the Default Output Unit to
automatically handle the format conversion from your reduced-quality
storage format to whatever the device is running at. In this design,
your storage format is completely independent of the hardware sample
rate.
To further reduce storage requirements, you could even use the
compressed format options of AudioConverter if psychoacoustic
techniques are not objectionable. In this case, you might even be
able to maintain better overall frequency response than telephone
quality speech while taking less memory.
So long as you correctly specify the 16-bit / 44.1 kHz source and
your intermediate storage format when creating your AudioConverter
and the Default Output Unit, CoreAudio will handle this translation
on the fly.
I have a feeling that your users will not be concerned with the
quality of SRC, even with two conversions.
Another advantage of this design is that you do not have to take over
the hardware to minimize your storage space, and so your application
plays nicely with others. Think about it: iTunes regularly reduces
the storage requirements of audio CDs and plays them back without
taking over the hardware settings, so why should your application
need to be any more heavy-handed, especially considering that your
quality requirements are lower than those of iTunes?
Brian Willoughby
Sound Consulting
On Feb 1, 2008, at 04:38, Paul Fredlein wrote:
My app is designed for foreign language learning at high school. As
most schools prefer to run the software from the CD rather than
installing it on the hard disk, I read all the audio, including that
recorded by the student, for each 'page' into RAM (because the CD
spins down).
As it is speech, there is no point in the audio being 44100 Hz @
24-bit, so 22050 Hz @ 16-bit is fine; but if the hardware defaults
to a high-quality format, such as 'Built-in Audio', then I'm filling
up buffers unnecessarily. I would much prefer to set the hardware to
what I want at runtime, as I do on Windows, and return it to what it
was afterwards, but it seems that this is undesirable on OS X. Any
suggestions? Teachers don't want students wasting class time
fiddling with audio settings.
I suppose for each page there would be a maximum of 20 MB of audio in
RAM. I know that's not much today (it would have been 10 years ago),
but it just seems a waste of resources.
Thanks,
Paul
_______________________________________________
Coreaudio-api mailing list (email@hidden)