Re: ExtAudioFileSeek and ExtAudioFileTell?
- Subject: Re: ExtAudioFileSeek and ExtAudioFileTell?
- From: William Stewart <email@hidden>
- Date: Wed, 1 Apr 2009 19:36:15 -0700
Ext Audio File will read this information from the file, so if it is
stored correctly in the file we should be able to deal with it.
If you run afinfo on the file, it will report these numbers (priming
frames is the one you are interested in).
If you think the file is correct and we're doing the wrong thing,
please file a bug and include the file so we can take a look at it.
Bill
On Apr 1, 2009, at 9:00 AM, James Chandler Jr wrote:
Thanks very much, Stephen!
James Chandler Jr.
On Apr 1, 2009, at 4:20 AM, Stephen Davis wrote:
[warning: off-the-cuff answer written while dosed with cold
medicine so likely to contain errors but hopefully will give you
enough info to stop beating your head against the wall of confusion
and make some progress, or at least give you enough info to ask
more questions]
Compressed files like AAC and MP3 have one or more packets at the
beginning of the file that represent the internal delay through the
psychoacoustic encoder and, as such, are supposed to be removed
from the resulting decoded sample frames before getting sent to the
destination. The decoder needs to be fed these packets to prime
the pump, as it were, but the sample frames involved don't
contribute to the final output. To clarify, the number of frames
dropped from the beginning of the file is not necessarily an integral
multiple of the frames-per-packet value, and it varies by encoder.
The packet
size for AAC is most often 1024 frames per packet and I believe the
"drop this many leading frames" value for the Nero encoder in
question is 1600 while the Apple encoder is usually 2112. BTW,
other versions of the Nero encoder have used 1024 as the value.
Historically speaking, the Nero encoder has used slightly "illegal"
mechanisms (in the strict MPEG-4 sense b/c there is no official
mechanism that works reliably, not to start a flame war on *that*
subject) to convey this information so 3rd party software
attempting to interpret the information might be getting it wrong.
In this case, the AudioFile implementation is a 3rd party
interpretation. Note that the iTunes gapless information storage
method is also "illegal".
I don't have the headers in front of me at the moment but there is
probably a property you can query to get the audio file packet
table info which specifies the number of valid sample frames in the
file, the encoder delay value (e.g. 1600) and a corresponding
encoder "drain" value for the end of the file which is the number
of sample frames at the end which should be cut off early.
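
The relationship between those three numbers is simple arithmetic. Below is a minimal sketch, not the AudioFile implementation: PacketTableInfo is a local stand-in whose fields mirror Core Audio's AudioFilePacketTableInfo (the struct returned by the kAudioFilePropertyPacketTableInfo property), so the example compiles without the AudioToolbox headers. The two helper functions are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Local stand-in (assumption): fields mirror AudioFilePacketTableInfo. */
typedef struct {
    int64_t mNumberValidFrames; /* frames of real audio in the file */
    int32_t mPrimingFrames;     /* encoder delay at the start, e.g. 1600 */
    int32_t mRemainderFrames;   /* encoder "drain" padding at the end */
} PacketTableInfo;

/* Total frames a decoder produces, including the frames to discard. */
int64_t total_decoded_frames(const PacketTableInfo *info)
{
    return (int64_t)info->mPrimingFrames
         + info->mNumberValidFrames
         + (int64_t)info->mRemainderFrames;
}

/* Map a 0-based frame of the real audio to its index in the raw
   decoded stream. Returns -1 if the frame is outside the valid range. */
int64_t raw_index_for_valid_frame(const PacketTableInfo *info, int64_t frame)
{
    if (frame < 0 || frame >= info->mNumberValidFrames)
        return -1;
    return frame + info->mPrimingFrames;
}
```

In a real program the struct would presumably be filled in by querying the file rather than by hand; the arithmetic stays the same either way.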
Hope that made some sense. I'm not sure if what you're seeing is a
bug in the implementation or if you just need to account for the
additional gapless info to get what you want. You might want to
try with an .m4a file generated from the WAV/AIFF source by
afconvert or iTunes and see if those files behave properly with
your code as it is now.
hth,
stephen
On Mar 31, 2009, at 11:15 PM, James Chandler Jr wrote:
Hi, yet another dumb question. Mac OS 10.5.6 Build 9G55. Same
behavior noted on multiple Macs.
We have a music app which splices small pieces of source audio
files into memory-resident composite multi-tracks. Works great for
WAV and AIF, and ALMOST works great for M4a. The source audio
tracks are huge and we usually deliver the source tracks in
compressed format. WMA on PC, and planned M4a audio files for Mac.
ExtAudioFileSeek documentation says:
"Sets the file’s read position to the specified sample frame
number. A subsequent call to the ExtAudioFileRead function returns
samples from precisely this location, even if it is located in the
middle of a packet."
The seek does sample-accurate positioning, but with M4a there are
fixed offsets from the true requested position.
I have tested several encoders' M4a files but the current test
M4a's were all compressed from WAV by the NERO MP4 command-line
codec.
Reported symptoms may vary if seeking into files made by other
codecs. I want to discover a way to accurately seek into ANY M4a
file. It would be a bummer to only be able to seek into 'special
blessed M4a files'.
The behavior is difficult to clearly explain. Apologies if I don't
explain it well.
Our code, after opening the ExtAudioFile, does an
ExtAudioFileSeek, and then it does a read loop using
ExtAudioFileRead until the desired number of frames have been
retrieved. Each ExtAudioFileRead returns 16384 frames until the
final partial buffer read to satisfy the request. The code appears
to be correct, and functions perfectly on WAV and AIF.
Seeking within < 1024 frames of file beginning gives different
symptoms than ExtAudioFileSeek with inFrameOffset >= 1024.
The behavior is identical on all the Nero-encoded M4a test files--
If the ExtAudioFileSeek inFrameOffset >= 1024, then the seek and
subsequent reads will actually load into memory the sample frame
(inFrameOffset + 1600).
If I seek to sample 44100, it actually starts reading at (44100 +
1600). If I ask it to seek to sample 441,000, then it actually
starts reading at (441,000 + 1600).
With the Nero M4a files, as long as I seek somewhere past 2624
frames, I can get the sample I REALLY want, by REQUESTING
[DesiredSampleFrame - 1600]!!!
Debugging with ExtAudioFileTell functions, I get results like this:
// in the case of DesiredSampleFrame >= 1024
ExtAudioFileSeek(FileRef, DesiredSampleFrame);
    // input DesiredSampleFrame
ExtAudioFileTell(FileRef, &ActualSampleFrame);
    // returned ActualSampleFrame = (DesiredSampleFrame - 2112)
ExtAudioFileRead(FileRef, &theNumFrames, FillBufList);
    // returned theNumFrames = 16384
ExtAudioFileTell(FileRef, &ActualSampleFrame);
    // returned ActualSampleFrame = (original DesiredSampleFrame + 16448)
The numbers almost kinda sorta make sense, but not quite.
Is there a way to reconcile Tell against Seek to somehow
guesstimate the real position, and always load audio from the
desired offset? Our audio output has surprising degradation of
'musical feel' even with 1 ms of random slop on individual file
seeks. I wouldn't expect such small slop to make an audible
difference, but it is easily audible.
For this set of Nero-encoded files, I can fudge the requested
numbers to always get the desired results, but the magic numbers
may vary with differently-compressed M4a files.
With the Nero files, I can reliably get the desired sample with
this fudging method--
If DesiredSampleFrame >= 2624, then I request (DesiredSampleFrame
- 1600)
Otherwise, I request sample 512, and memory move down the returned
audio, by DesiredSampleFrame.
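
The fudging rule above can be written down as a small helper. This is only a sketch of the workaround as described in this message: the constants 2624, 1600 and 512 are the values observed for these particular Nero-encoded files, and the names SeekFudge and nero_seek_fudge are hypothetical. It does not call any Core Audio function; it only computes which frame to pass to ExtAudioFileSeek and how many leading frames of the returned audio to skip.

```c
#include <stdint.h>

/* What to request, and how many returned frames to discard. */
typedef struct {
    int64_t request_frame; /* value to pass as inFrameOffset */
    int64_t skip_frames;   /* leading frames to drop from the read buffer */
} SeekFudge;

/* Workaround for the Nero-encoded test files only (observed values). */
SeekFudge nero_seek_fudge(int64_t desired_frame)
{
    SeekFudge f;
    if (desired_frame >= 2624) {
        /* Past 2624: the seek lands 1600 frames late, so ask early. */
        f.request_frame = desired_frame - 1600;
        f.skip_frames = 0;
    } else {
        /* Near the start: requesting 512 loads file frame 0 at buffer
           index 0, so the desired frame sits at index desired_frame. */
        f.request_frame = 512;
        f.skip_frames = desired_frame;
    }
    return f;
}
```

For example, a desired frame of 44100 yields a request of 42500 with nothing skipped, while a desired frame of 100 yields a request of 512 with 100 frames skipped.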
That is the other ugly thing-- If I request sample 0, the
beginning of the file will load to location 512 in memory.
If I request sample 512, the beginning of the file will load to
location 0 in memory.
If I request sample 1023, then sample 511 will load to location 0
in memory.
If I request sample 1024, then sample 2624 will load to location 0
in memory. Eeek! Ack!
There is one consistent offset for all seeks < 1024, and a
different consistent offset for all seeks >= 1024. Maddening.
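
To make the two offsets explicit, here is a minimal model of the observed behavior, assuming the numbers reported above hold for every seek position (they were only measured on the Nero-encoded test files). The function name observed_first_frame is hypothetical; it returns the file frame that ends up at location 0 in memory for a given requested frame, with a negative result meaning the beginning of the file lands that many frames into the buffer.

```c
#include <stdint.h>

/* File frame found at buffer index 0 after seeking to requested_frame,
   according to the symptoms reported for the Nero-encoded files. */
int64_t observed_first_frame(int64_t requested_frame)
{
    if (requested_frame < 1024) {
        /* Requests below 1024 come back 512 frames early. */
        return requested_frame - 512;
    }
    /* Requests of 1024 or more come back 1600 frames late. */
    return requested_frame + 1600;
}
```

The jump between the two branches is 2112 frames (512 + 1600), which matches the 2112 that ExtAudioFileTell reports after the seek.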
Thanks for any ideas.
James Chandler Jr.
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Coreaudio-api mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden