VoIP, handling ALaw encoded audio
- Subject: VoIP, handling ALaw encoded audio
- From: Marcus Ficner <email@hidden>
- Date: Thu, 16 Feb 2012 10:17:03 +0100
Hello,
I'm implementing the audio part of a VoIP app on the iPhone.
We need to handle an ALaw-encoded audio stream which we receive packed in RTP packets from a UDP socket.
I have some issues finding the right AU Design Pattern and some questions about the decoding of the data.
Initially I thought my graph should include a Voice Processing I/O unit, a mixer unit with a render callback as input for the network stream, and a Generic Output unit.
So my overall connections look like this:
Microphone -> Input (VP I/O) -> Generic Output -> Network
Network -> Render Callback -> Mixer -> Output (VP I/O) -> Speaker
This does not work; I guess it is not possible to have two AUs of type kAudioUnitType_Output in one graph, right?
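For reference, this is roughly how I tried to build it (error checks stripped; the second output-type node is presumably the offender):

AUGraph graph;
NewAUGraph(&graph);

AudioComponentDescription vpioDesc = {
    .componentType         = kAudioUnitType_Output,
    .componentSubType      = kAudioUnitSubType_VoiceProcessingIO,
    .componentManufacturer = kAudioUnitManufacturer_Apple
};
AudioComponentDescription mixerDesc = {
    .componentType         = kAudioUnitType_Mixer,
    .componentSubType      = kAudioUnitSubType_MultiChannelMixer,
    .componentManufacturer = kAudioUnitManufacturer_Apple
};
AudioComponentDescription genericDesc = {
    .componentType         = kAudioUnitType_Output,          // second output-type AU
    .componentSubType      = kAudioUnitSubType_GenericOutput,
    .componentManufacturer = kAudioUnitManufacturer_Apple
};

AUNode vpioNode, mixerNode, genericNode;
AUGraphAddNode(graph, &vpioDesc,    &vpioNode);
AUGraphAddNode(graph, &mixerDesc,   &mixerNode);
AUGraphAddNode(graph, &genericDesc, &genericNode);  // adding this is what seems to break things

AUGraphOpen(graph);
// Mixer output feeds the VP I/O output element (bus 0) -> speaker.
AUGraphConnectNodeInput(graph, mixerNode, 0, vpioNode, 0);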
Would it be sufficient to grab the recorded audio buffers from the recording callback as Michael Tyson does in this example: http://atastypixel.com/blog/using-remoteio-audio-unit/ ?
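If so, my adaptation would look roughly like the sketch below, where ioUnit is my VP I/O instance and micScratch a preallocated buffer of mine (untested):

static OSStatus recordingCallback(void                       *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp       *inTimeStamp,
                                  UInt32                      inBusNumber,    // 1 = input bus
                                  UInt32                      inNumberFrames,
                                  AudioBufferList            *ioData)        // NULL for input callbacks
{
    AudioHandler *THIS = (__bridge AudioHandler *)inRefCon;

    // Render the microphone samples into our own buffer.
    AudioBufferList bufferList;
    bufferList.mNumberBuffers              = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize   = inNumberFrames * sizeof(AudioUnitSampleType);
    bufferList.mBuffers[0].mData           = THIS->micScratch;

    OSStatus err = AudioUnitRender(THIS->ioUnit, ioActionFlags, inTimeStamp,
                                   inBusNumber, inNumberFrames, &bufferList);
    if (err != noErr) return err;

    // ...ALaw-encode bufferList here and hand it to the network sender...
    return noErr;
}

The callback would be registered with kAudioOutputUnitProperty_SetInputCallback on the VP I/O unit, as in that blog post.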
Focusing on just the incoming audio data from the network, I'm stuck at the point where I need to decode the audio.
The payload in the RTP packets is always 320 bytes, hence 40 ms of audio, since the ALaw-encoded audio is 8000 Hz / 8 bit: 320 bytes = 320 frames at 1 byte per frame, and 320 / 8000 Hz = 40 ms.
In the render callback which feeds the mixer input (this pattern is taken from Apple's MixerHost example code) I call
AudioConverterFillComplexBuffer, which in turn calls another callback function that supplies the data to decode.
Technical Note TN2097 (https://developer.apple.com/library/mac/#technotes/tn2097/_index.html) uses this pattern. Would it work in my case?
The ASBD for the source stream is set with:
// Zero the struct first so the fields we don't set explicitly start out as 0.
memset(&source_AudioStreamBasicDescription, 0, sizeof(source_AudioStreamBasicDescription));
source_AudioStreamBasicDescription.mFormatID         = kAudioFormatALaw;
source_AudioStreamBasicDescription.mFormatFlags      = kAudioFormatFlagsAreAllClear;
source_AudioStreamBasicDescription.mChannelsPerFrame = 1;
source_AudioStreamBasicDescription.mSampleRate       = _sampleRate;

// Let Core Audio fill in the remaining fields. kAudioFormatProperty_FormatInfo
// takes no specifier, so the specifier size/pointer are 0/NULL.
UInt32 propertySize = sizeof(source_AudioStreamBasicDescription);
AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
                       0,
                       NULL,
                       &propertySize,
                       &source_AudioStreamBasicDescription);
When I log the ASBD I get the following. Is this correct for ALaw-encoded audio?
Sample Rate: 8000
Format ID: alaw
Format Flags: 0
Bytes per Packet: 1
Frames per Packet: 1
Bytes per Frame: 1
Channels per Frame: 1
Bits per Channel: 8
The ASBD for the AU graph side (bytesPerSample is sizeof(AudioUnitSampleType), i.e. 4 bytes of the canonical 8.24 fixed-point format on iOS):
destination_AudioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
destination_AudioStreamBasicDescription.mFormatFlags = kAudioFormatFlagsAudioUnitCanonical;
destination_AudioStreamBasicDescription.mBytesPerPacket = bytesPerSample;
destination_AudioStreamBasicDescription.mFramesPerPacket = 1;
destination_AudioStreamBasicDescription.mBytesPerFrame = bytesPerSample;
destination_AudioStreamBasicDescription.mChannelsPerFrame = 1;
destination_AudioStreamBasicDescription.mBitsPerChannel = 8 * bytesPerSample;
destination_AudioStreamBasicDescription.mSampleRate = _sampleRate;
Log output:
Sample Rate: 8000
Format ID: lpcm
Format Flags: C2C
Bytes per Packet: 4
Frames per Packet: 1
Bytes per Frame: 4
Channels per Frame: 1
Bits per Channel: 32
The converter is initialized with:
err = AudioConverterNew(&source_AudioStreamBasicDescription, &destination_AudioStreamBasicDescription, &converter);
Here's the code of the two callbacks:
static OSStatus inputRenderCallback(void                       *inRefCon,
                                    AudioUnitRenderActionFlags *ioActionFlags,
                                    const AudioTimeStamp       *inTimeStamp,
                                    UInt32                      inBusNumber,
                                    UInt32                      inNumberFrames,
                                    AudioBufferList            *ioData)
{
    AudioHandler *THIS = (__bridge AudioHandler *)inRefCon;

    // Pull inNumberFrames of decoded PCM out of the converter; the converter
    // in turn calls converterInputDataProc to get the raw ALaw bytes.
    UInt32 ioNumberFrames = inNumberFrames;
    OSStatus err = AudioConverterFillComplexBuffer(THIS->converter,
                                                   converterInputDataProc,
                                                   inRefCon,
                                                   &ioNumberFrames,
                                                   ioData,
                                                   NULL);
    return err;
}
At the moment inNumberFrames is 256. Should inNumberFrames match the above-mentioned 320 bytes (= 320 frames) of incoming audio? 256 frames at 8000 Hz are only 32 ms, so one 40 ms packet never lines up with one render cycle, which is why I suspect I need to buffer the incoming data.
What is the best way to buffer the network stream?
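My current idea, sketched below, is a plain byte ring buffer that the network thread writes into and the render thread reads from (all names here are mine, and this is untested; a production version should use atomics or a proven lock-free buffer rather than volatile, and check for overflow):

#include <stdint.h>

// RING_SIZE must be a power of two so the unsigned wrap-around arithmetic works.
#define RING_SIZE 8192

typedef struct {
    uint8_t           data[RING_SIZE];
    volatile uint32_t head;   // write position, advanced by the network thread
    volatile uint32_t tail;   // read position, advanced by the render thread
} ByteRing;

static uint32_t ringBytesAvailable(const ByteRing *r) {
    return (r->head - r->tail) % RING_SIZE;
}

// Called from the network thread with the 320-byte ALaw payload of each
// RTP packet (no overflow check in this sketch).
static void ringWrite(ByteRing *r, const uint8_t *src, uint32_t len) {
    for (uint32_t i = 0; i < len; i++)
        r->data[(r->head + i) % RING_SIZE] = src[i];
    r->head = (r->head + len) % RING_SIZE;
}

// Called from the render thread; returns how many bytes were actually copied.
static uint32_t ringRead(ByteRing *r, uint8_t *dst, uint32_t len) {
    uint32_t avail = ringBytesAvailable(r);
    if (len > avail) len = avail;
    for (uint32_t i = 0; i < len; i++)
        dst[i] = r->data[(r->tail + i) % RING_SIZE];
    r->tail = (r->tail + len) % RING_SIZE;
    return len;
}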
static OSStatus converterInputDataProc(AudioConverterRef inAudioConverter,
UInt32* ioNumberDataPackets,
AudioBufferList* ioData,
AudioStreamPacketDescription** outDataPacketDescription,
void* inUserData)
{
// TODO: fill ioData
}
I'm not quite sure what to do in the converterInputDataProc.
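My rough understanding is that the proc just has to point ioData at the next chunk of still-encoded ALaw bytes and report how many packets it delivers (for ALaw: 1 packet = 1 frame = 1 byte), pulling from a buffer like the ring sketched above. Something like this, where ring and alawScratch are instance variables I'd add to AudioHandler (untested):

static OSStatus converterInputDataProc(AudioConverterRef              inAudioConverter,
                                       UInt32                        *ioNumberDataPackets,
                                       AudioBufferList               *ioData,
                                       AudioStreamPacketDescription **outDataPacketDescription,
                                       void                          *inUserData)
{
    AudioHandler *THIS = (__bridge AudioHandler *)inUserData;

    // ALaw is constant bitrate (1 packet = 1 frame = 1 byte),
    // so no packet descriptions should be needed.
    UInt32 bytesRequested = *ioNumberDataPackets;
    UInt32 bytesRead      = ringRead(&THIS->ring, THIS->alawScratch, bytesRequested);

    ioData->mBuffers[0].mData           = THIS->alawScratch;
    ioData->mBuffers[0].mDataByteSize   = bytesRead;
    ioData->mBuffers[0].mNumberChannels = 1;

    *ioNumberDataPackets = bytesRead;

    // If the ring runs dry this returns fewer (possibly zero) packets and
    // FillComplexBuffer comes back with fewer output frames. Some code
    // returns a private error code here to distinguish "no data right now"
    // from end-of-stream.
    return noErr;
}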
I hope you can follow my explanations, and that you can give me some general hints about handling the audio buffers and tell me whether I'm on a successful course with this setup at all.
Thanks in advance!
Marcus Ficner