Re: Working with Audio
- Subject: Re: Working with Audio
- From: Todd Heberlein <email@hidden>
- Date: Tue, 26 Mar 2002 10:35:36 -0800
On 3/25/02 3:43 PM, "Matthew Johnson" <email@hidden> wrote:
> In particular I cannot find any samples
> of how I should be recording audio data
> and playing back audio.
I apologize to everyone for such a long response.
I too have had problems finding good *Cocoa* examples of using Apple's
CoreAudio architecture, and the documentation I have found is fairly
limited. The best I could find is a PDF on Apple's web site, coreaudio.pdf,
titled "Audio and MIDI on Mac OS X". It is listed as "Preliminary
Documentation" and is roughly a year old.
For source code, the best example I could find is a simple C program that
comes with Apple's developer tools. The program uses some of the CoreAudio
API to query the hardware on your machine (microphone, speakers, etc.) and
prints out details about them such as the data types and sampling rates. It
will give you a good idea about how to use some of the APIs. Look on your
hard disk in
/Developer/Examples/CoreAudio/HAL/DisplayHALDeviceInfo
I also found a program called CASoundLab2 on Apple's web site. It was
useful too.
I still haven't quite figured audio out, but I have been able to record
audio from my USB microphone into a buffer and play it back to the speakers.
My approach may be completely dead wrong, so I hope an Apple engineer can
join the conversation and give a better explanation.
In short: Apple's CoreAudio uses what are effectively "callback" functions
to record from an input device (e.g., a microphone) or play back to an
output device (e.g., speakers). You register the callback functions with
AudioDeviceAddIOProc(), and you start and stop the thread that executes a
callback with the AudioDeviceStart() and AudioDeviceStop() functions.
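In skeleton form, the whole lifecycle is just a few calls (here "device"
stands for whichever AudioDeviceID you looked up; I have not needed
AudioDeviceRemoveIOProc() in the program below, but it is in AudioHardware.h
and appears to be the matching cleanup call):

    err = AudioDeviceAddIOProc(device, myIOProc, NULL); // register the callback
    err = AudioDeviceStart(device, myIOProc);           // callbacks begin firing
    /* ... audio flows while your app does other work ... */
    err = AudioDeviceStop(device, myIOProc);            // callbacks stop firing
    err = AudioDeviceRemoveIOProc(device, myIOProc);    // unregister the callback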
The functions seem to run in their own thread outside Cocoa's main event
loop. I don't know the proper synchronization mechanism to use (e.g., an
NSLock or a pthread_mutex or what) between the Cocoa main loop and these
audio callback functions. Perhaps someone at Apple can chime in here.
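If you do go the pthread_mutex route, I would guess the shape is something
like the following (just a sketch of the idea, untested; I also do not know
whether it is even safe to block inside an IOProc, since the audio thread
presumably has real-time deadlines):

    #include <pthread.h>

    static pthread_mutex_t Pcm_lock = PTHREAD_MUTEX_INITIALIZER;

    // Inside the IOProc, around any touches of the shared state:
    pthread_mutex_lock(&Pcm_lock);
    Pcms[Units_recorded] = new_data[i];
    Units_recorded += 1;
    pthread_mutex_unlock(&Pcm_lock);

    // ...and take the same lock around any reads of Units_recorded
    // or Pcms from the Cocoa main thread.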
The CoreAudio architecture is much richer than what I have just described.
For example, you can connect nodes (i.e., processing components) together in
graphs (graphs in the computer-science sense, with nodes and edges, not
charts) to route sound through multiple components. However, I am not close
to figuring that out.
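From poking around AUGraph.h in the AudioToolbox framework, I would guess
that setting up a minimal graph with just a default-output node starts out
something like this. I have pieced this together from the headers, not from
working code, so the exact names and constants may be off on your system
(error checking omitted):

    #include <AudioToolbox/AudioToolbox.h>

    AUGraph graph;
    AUNode output_node;
    ComponentDescription desc;

    desc.componentType = kAudioUnitType_Output;                // these constant
    desc.componentSubType = kAudioUnitSubType_DefaultOutput;   // names are from my
    desc.componentManufacturer = kAudioUnitManufacturer_Apple; // headers; check
    desc.componentFlags = 0;                                   // AUComponent.h
    desc.componentFlagsMask = 0;

    NewAUGraph(&graph);                          // create an empty graph
    AUGraphAddNode(graph, &desc, &output_node);  // add a node for the output unit
    AUGraphOpen(graph);                          // open the underlying components
    AUGraphInitialize(graph);                    // initialize the units
    AUGraphStart(graph);                         // start pulling audio through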
Below is some code I wrote that is rather specific to my system. For
example, I already know the data format, channel count, and sampling rates
for my input (USB mic) and output (Apple speakers), so I did not query for
those. However, the code may still be of some help to you.
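(If your hardware differs, you should be able to query the format instead of
assuming it. I believe the call is AudioDeviceGetProperty() with the
kAudioDevicePropertyStreamFormat property, roughly as below; this is based on
my reading of AudioHardware.h, so double check the details.)

    AudioStreamBasicDescription format;
    UInt32 size = sizeof(format);
    OSStatus err;

    // The third argument selects the input (true) or output (false) side.
    err = AudioDeviceGetProperty(Input_device, 0, true,
                                 kAudioDevicePropertyStreamFormat,
                                 &size, &format);
    if (!err) {
        NSLog(@"%f Hz, %lu channel(s)",
              format.mSampleRate,
              (unsigned long)format.mChannelsPerFrame);
    }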
The program shows a simple window with two buttons: Record and Play. The
MyController object has outlets to these buttons so I can change their
states (e.g., disable the Play button while I am recording). The buttons
also toggle between their primary title and alternate title (Record/Stop
and Play/Stop). The alternate titles were entered in IB.
MyController.h:
@interface MyController : NSObject
{
    IBOutlet id play_button;
    IBOutlet id record_button;
}
- init;
- (void)awakeFromNib;
- (IBAction)pressPlay:(id)sender;
- (IBAction)pressRecord:(id)sender;
@end
MyController.m:
#import "MyController.h"
#import <CoreAudio/CoreAudio.h>
static AudioDeviceID Input_device;
static AudioDeviceID Output_device;
static float* Pcms = NULL;
static int Max_units = 5000000; // 5 million PCM samples
static int Units_recorded = 0;
static int Units_played = 0;
/************************************************************************
* FUNCTION: recordCallback
* INPUT:
* device - the AudioDevice object (in this case, the microphone).
* now - timestamp used for synchronization.
* input_data - data from the microphone.
* input_time - time when data was recorded.
* output_data - output data, unused in the function.
* output_time - time when data will be played; unused.
* client_data - user supplied data, unused in this function.
* OUTPUT:
* always kAudioHardwareNoError in this function.
* PURPOSE:
* recordCallback is called at regular intervals by an OS
* provided thread. It passes data from the microphone to us.
* At this time we just save that data in the Pcms array until
* the array fills up.
*
* This code is really specific to my computer system, because
* I have already tested the data sampling rate for my
* microphone (Manufacturer: AKM, Name: AK5370) which is
* 44100.000 (same as CD quality sound). The format is
* linear pulse code modulation (linear pcm, or lpcm), with
* each sample a 32 bit floating point number (much higher than
* CD quality). It only has one channel, so I don't have to
* worry about interleaved data.
*
* The relevant structures in the AudioBufferList object are:
* o input_data->mNumberBuffers - number of buffers (1)
* o input_data->mBuffers[0].mDataByteSize - size of first buffer
* o input_data->mBuffers[0].mData - pointer to buffer
************************************************************************/
OSStatus recordCallback(
    AudioDeviceID device,
    const AudioTimeStamp* now,
    const AudioBufferList* input_data,
    const AudioTimeStamp* input_time,
    AudioBufferList* output_data,
    const AudioTimeStamp* output_time,
    void* client_data
)
{
    int num_of_units = 0;
    int i = 0;
    float* new_data = NULL;

    // NSLog(@"record");
    new_data = (float*)(input_data->mBuffers[0].mData);
    num_of_units = input_data->mBuffers[0].mDataByteSize / sizeof(float);
    for (i = 0; (i < num_of_units) && (Units_recorded < Max_units); i++) {
        Pcms[Units_recorded] = new_data[i];
        Units_recorded += 1;
    }
    return kAudioHardwareNoError;
}
/************************************************************************
* FUNCTION: playCallback
* INPUT:
* device - the AudioDevice object (in this case, the Apple speakers).
* now - timestamp used for synchronization.
* input_data - input data, unused in this function.
* input_time - time when data was recorded, unused in this function.
* output_data - buffer to which we will write the sound data.
* output_time - time when data will be played.
* client_data - user supplied data, unused in this function.
* OUTPUT:
* always kAudioHardwareNoError in this function.
* PURPOSE:
* playCallback is called at regular intervals by an OS provided
* thread. It passes in a buffer that we fill up with data
* that will be sent to the speakers. In this case, we send
* it data from the Pcms buffer.
*
* This function is based on my hardware configuration, which
* is a dual channel output (stereo) with a sample rate of
* 44100.000. The expected format of the data is linear pulse
* code modulation (linear PCM, or lpcm), which happens to be
* the same format the microphone used and is recorded into
* our Pcms array.
*
* The only potentially complicating factor is that the recorded
* data is one-channel, while the playback data is dual-channel.
* Because the Apple system assumes the channels are interleaved
* into the same stream, each sample point from the recorded
* array has to be written to the output buffer twice: once for
* the left speaker and once for the right speaker.
*
* The relevant structures in the AudioBufferList object are:
* o output_data->mNumberBuffers - number of buffers (1)
* o output_data->mBuffers[0].mDataByteSize - size of first buffer
* o output_data->mBuffers[0].mData - pointer to buffer
************************************************************************/
OSStatus playCallback(
    AudioDeviceID device,
    const AudioTimeStamp* now,
    const AudioBufferList* input_data,
    const AudioTimeStamp* input_time,
    AudioBufferList* output_data,
    const AudioTimeStamp* output_time,
    void* client_data
)
{
    float* dst = NULL;        // destination for the data
    int num_of_units = 0;     // number of sample points buffer can take
    int num_to_play = 0;      // number of sample units to play
    int num_of_channels = 2;  // number of channels output supports
    int i = 0;                // index
    int j = 0;                // index into destination buffer

    // NSLog(@"Play");
    dst = (float*)(output_data->mBuffers[0].mData);
    num_of_units = output_data->mBuffers[0].mDataByteSize / sizeof(float);
    num_to_play = num_of_units / num_of_channels;
    for (i = 0; i < num_to_play; i++) {
        if (Units_played < Units_recorded) {
            dst[j++] = Pcms[Units_played];  // recorded stuff
            dst[j++] = Pcms[Units_played];
            Units_played += 1;
        }
        else {
            dst[j++] = 0.0;                 // silence
            dst[j++] = 0.0;
        }
    }
    return kAudioHardwareNoError;
}
@implementation MyController
/************************************************************************
* METHOD: init
* INPUT:
* OUTPUT:
* self
* PURPOSE:
* init(), called when object is first created, is used to
* initialize our CoreAudio objects. In particular, we identify
* the default input (microphone) and output (stereo speakers)
* for our system, and we register two callback functions for
* these input and output objects. The callback functions won't
* start executing until later when we call AudioDeviceStart().
* They are stopped with AudioDeviceStop().
************************************************************************/
- init
{
    OSStatus err;
    UInt32 property_size;

    if (self = [super init]) {
        // do local initialization
        property_size = sizeof(AudioDeviceID);
        err = AudioHardwareGetProperty(
                  kAudioHardwarePropertyDefaultInputDevice,
                  &property_size, &Input_device);
        if (err || (Input_device == kAudioDeviceUnknown)) {
            return self;
        }
        err = AudioDeviceAddIOProc(Input_device, recordCallback, NULL);
        if (err) {
            return self;
        }

        property_size = sizeof(AudioDeviceID);
        err = AudioHardwareGetProperty(
                  kAudioHardwarePropertyDefaultOutputDevice,
                  &property_size, &Output_device);
        if (err || (Output_device == kAudioDeviceUnknown)) {
            return self;
        }
        err = AudioDeviceAddIOProc(Output_device, playCallback, NULL);
        if (err) {
            return self;
        }

        Pcms = (float*)malloc(Max_units * sizeof(float));
    }
    NSLog(@"Set up audio");
    return self;
}
/************************************************************************
* METHOD: awakeFromNib
* INPUT:
* OUTPUT:
* PURPOSE:
* awakeFromNib() is called after all the objects from the NIB
* have been initialized and connected. Now we can do some
* additional initialization, which in this case is to simply
* change the behavior of the two buttons. The NSToggleButton
* type causes the buttons to change from the original state
* (state 0) with the original text displayed to its alternate
* state (state 1) with the alternate text displayed. The button
* keeps changing between states each time we press the button.
*
* We make this change in this example to use the same button
* to start and stop either the recording or the playback of
* the data. For example, when the "Record" button is pressed,
* its text switches to "Stop". Pressing this button a second
* time stops the recording and returns the text to "Record".
************************************************************************/
- (void)awakeFromNib
{
    [play_button setButtonType: NSToggleButton];
    [record_button setButtonType: NSToggleButton];
}
/************************************************************************
* METHOD: pressPlay
* INPUT:
* sender - object that sent the message (play button)
* OUTPUT:
* PURPOSE:
* pressPlay() is called every time the Play button is pressed.
* This can be used to start playing the recorded audio (the
* first time the button is pressed), or it can stop the playback
* of the sound (the next time the button is pressed).
*
* We also disable the "Record" button when we start playing the
* sound, and we re-enable it when we hit this button again to
* stop playing the sound. We do this to prevent the recording
* and playing of data simultaneously.
************************************************************************/
- (IBAction)pressPlay:(id)sender
{
    OSStatus err;

    if ([sender state] == 1) {
        [record_button setEnabled: NO];  // disable the record button
        Units_played = 0;                // reset index to beginning

        // start play callback thread
        err = AudioDeviceStart(Output_device, playCallback);
        if (err) {
            NSLog(@"Could not start output device");
            return;
        }
    }
    else {
        [record_button setEnabled: YES]; // re-enable the record button

        // stop the play callback thread
        err = AudioDeviceStop(Output_device, playCallback);
        if (err) {
            NSLog(@"Could not stop output device");
            return;
        }
    }
}
/************************************************************************
* METHOD: pressRecord
* INPUT:
* sender - object that sent the message (record button)
* OUTPUT:
* PURPOSE:
* pressRecord() is called every time the Record button is pressed.
* This can be used to start recording audio (the first
* time the button is pressed), or it can stop the recording
* of the sound (the next time the button is pressed).
*
* We also disable the "Play" button when we start recording the
* sound, and we re-enable it when we hit this button again to
* stop recording audio. We do this to prevent the recording
* and playing of data simultaneously.
************************************************************************/
- (IBAction)pressRecord:(id)sender
{
    OSStatus err;

    if ([sender state] == 1) {
        [play_button setEnabled: NO];  // disable Play while recording
        Units_recorded = 0;            // reset index to beginning

        // start recording callback thread
        err = AudioDeviceStart(Input_device, recordCallback);
        if (err) {
            NSLog(@"Could not start recording device");
            return;
        }
    }
    else {
        [play_button setEnabled: YES]; // re-enable Play button

        // stop the recording callback thread
        err = AudioDeviceStop(Input_device, recordCallback);
        if (err) {
            NSLog(@"Could not stop recording device");
            return;
        }
    }
}
@end
Todd