Re: Extract data from AIFF file to compute FFT
- Subject: Re: Extract data from AIFF file to compute FFT
- From: Brian Willoughby <email@hidden>
- Date: Mon, 4 Aug 2008 01:55:39 -0700
Your hunch about AIFF versus FFT is well-founded. AIFF typically
holds 16-bit or 24-bit fixed-point samples, packed (i.e. 2 or 3 bytes
per sample, big-endian) into the sound data chunk, and that chunk does
not always appear at a predictable location in the file. The FFT
requires floating point numbers, usually 32-bit (4 bytes). You would
use a CoreAudio API such as AudioFile or ExtAudioFile to open an AIFF
and convert it to 32-bit float format. Those APIs would give you
buffers of audio that are well-suited to pass on to the vecLib API.
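To make the packed layout concrete, here is a minimal portable sketch of the conversion that ExtAudioFile would otherwise do for you. It assumes uncompressed, signed, big-endian PCM at 2 or 3 bytes per sample; the function name pcm_be_to_float is made up for this illustration and is not a CoreAudio call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert packed big-endian signed PCM (2 or 3 bytes per sample), as found
 * in an uncompressed AIFF sound data chunk, to floats in [-1.0, 1.0).
 * Hypothetical helper for illustration only. */
void pcm_be_to_float(const unsigned char *src, size_t sample_count,
                     int bytes_per_sample, float *dst)
{
    assert(bytes_per_sample == 2 || bytes_per_sample == 3);
    for (size_t i = 0; i < sample_count; i++) {
        const unsigned char *p = src + i * (size_t)bytes_per_sample;
        if (bytes_per_sample == 2) {
            int32_t v = ((int32_t)p[0] << 8) | (int32_t)p[1];
            if (v >= 0x8000)
                v -= 0x10000;              /* sign-extend from 16 bits */
            dst[i] = (float)v / 32768.0f;
        } else {
            int32_t v = ((int32_t)p[0] << 16) | ((int32_t)p[1] << 8)
                      | (int32_t)p[2];
            if (v >= 0x800000)
                v -= 0x1000000;            /* sign-extend from 24 bits */
            dst[i] = (float)v / 8388608.0f;
        }
    }
}
```

For interleaved stereo, sample_count would be frames times channels, and the output stays interleaved.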
I'm not sure what you mean by
(with main_frequency = FFT(some_datas[]); )...
An FFT does not produce a single result, such as a "main frequency."
A 1024-point FFT, for example, produces 1024 complex results, meaning
2048 values.
Each result corresponds to a different frequency bin. Even if you do
an FFT, you still have to interpret the thousands of results and use
some kind of non-trivial algorithm to map the FFT results to tell the
user they are 'In tune' or 'sharp' or 'flat' - either for one string
at a time, or possibly for all strings at once if you're clever.
Using an FFT does not necessarily make this any easier.
Basically, a 1024-point FFT is equivalent to multiplying the input
signal by 1024 different sine waves, each at a different frequency
division. But the frequencies are related to the sample rate, and
are not necessarily related to the frequency that you want to tune
to, such as 440 Hz. For example, if you are sampling an audio input
at a common rate like 44,100 Hz and calculate a 1024-point FFT, the
bins are spaced 44,100 / 1024, or about 43.07 Hz, apart. You will end
up with result bins at about 387.60 Hz, 430.66 Hz and 473.73 Hz, among
hundreds of other frequencies nowhere near your desired value. The
closest you can get is thus about 430.66 Hz, which is an integer
multiple of the bin spacing and has nothing to do with musical pitch.
The jumps between neighboring bins are rather large.
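A quick way to check which frequencies a given FFT size actually lands on is to compute the bin centers directly from the standard relation f_k = k * Fs / N. The sketch below does only that arithmetic; both function names are invented for this example.

```c
#include <assert.h>

/* Center frequency in Hz of bin k for an n-point FFT at sample_rate. */
double fft_bin_frequency(int k, int n, double sample_rate)
{
    assert(n > 0 && k >= 0 && k < n);
    return (double)k * sample_rate / (double)n;
}

/* Index of the bin whose center is closest to target_hz.
 * Assumes target_hz is non-negative and below the sample rate. */
int fft_nearest_bin(double target_hz, int n, double sample_rate)
{
    assert(n > 0 && sample_rate > 0.0 && target_hz >= 0.0);
    return (int)(target_hz * (double)n / sample_rate + 0.5);
}
```

With n = 1024 and a 44,100 Hz sample rate, the bin nearest 440 Hz is bin 10 at 430.6640625 Hz, and bin 11 is already at 473.73046875 Hz, so a semitone either side of A4 falls between two bins.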
You would be better off calculating a precise 440 Hz sine wave into a
working buffer, and then using math similar to an FFT, but much
faster, to compare your input to that frequency. In other words,
instead of calculating 1024 results, you would calculate only one
complex result. An FFT is only fast if you are using all 1024
results, but when you only need 1 result, or 8 at the most, then the
FFT is not faster. There is also the consideration that none of
those 1024 results are the exact ones you need, so it's really a bad
idea to do that much calculation. FFT is very efficient, though, so
if you actually needed at least a few hundred of the results that it
produces, then it would actually be quicker to do the FFT and ignore
the excess results. In your case, you only need 6 or 8 results at
the most, and it would be preferable that they were precise frequencies.
Simpler tuning algorithms just look for zero crossings and measure
the period of the waveform. Such an algorithm is susceptible to
overtones which cause false triggering at the octave frequency,
especially with strings where the second harmonic (the first
overtone) can be stronger than the fundamental. There are other
algorithms for tuning which are probably too complex to explain here.
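For reference, the zero-crossing approach can be sketched in a few lines. This version assumes a clean, mono float buffer, counts only positive-going crossings, and uses linear interpolation between samples; the function name is invented here, and nothing in it guards against the overtone problem just mentioned.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate frequency in Hz from the average spacing of positive-going zero
 * crossings. Returns 0.0 if fewer than two crossings are found. */
double zero_crossing_frequency(const float *x, size_t n, double sample_rate)
{
    assert(sample_rate > 0.0);
    double first = 0.0, last = 0.0;
    size_t crossings = 0;
    for (size_t i = 1; i < n; i++) {
        if (x[i - 1] < 0.0f && x[i] >= 0.0f) {
            /* Linear interpolation gives a sub-sample crossing position. */
            double a = (double)x[i - 1], b = (double)x[i];
            double pos = (double)(i - 1) + a / (a - b);
            if (crossings == 0)
                first = pos;
            last = pos;
            crossings++;
        }
    }
    if (crossings < 2)
        return 0.0;
    /* Average period in samples across all measured cycles. */
    double period = (last - first) / (double)(crossings - 1);
    return sample_rate / period;
}
```

A strong octave overtone adds extra crossings inside each cycle, which is why a real tuner would low-pass filter the input first or use a more robust method.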
Good luck. I suggest that you look for articles focused on tuning
and pitch detection.
Brian Willoughby
Sound Consulting
On Aug 3, 2008, at 22:36, Francois Baronnet wrote:
Thanks for your quick answer.
There are some points that need some explanations :)
2008/8/4 Brian Willoughby <email@hidden>
FFT is not the correct choice for tuning, especially if the sample
rate is not a precise multiple of the desired tuning frequency.
I thought that if I wanted to tune A4, I would just have to record the
sound, compare its main frequency to 440 Hz (with main_frequency =
FFT(some_datas[]); )...
There are much faster ways to do this without calculating hundreds
of frequency bins for only 4, 6, or 8 strings.
Without FFT?
If you want to do an FFT on real data, not complex, then
investigate the vecLib framework (within Accelerate) for the
functions which convert real to complex, and also investigate the
FFT functions which take real data instead of complex.
Ok, I will have a look at this vecLib framework.
But there is still something puzzling me... Here is the structure of
a sound data chunk (AIFF):
http://www.cnpbagwell.com/aiff-c.txt
The Sound Data Chunk contains the actual sample frames.
#define SoundDataID 'SSND' /* ckID for Sound Data Chunk */
typedef struct {
    ID            ckID;        /* 'SSND' */
    long          ckDataSize;
    unsigned long offset;
    unsigned long blockSize;
    char          soundData[];
} SoundDataChunk;
The sound data is in soundData[], but what does it look like? I
mean, is it like in the "Accelerate" sample?
Signal[i] += sin((i*F.Frequency[0] / SamplingFrequency + Phase) * TwoPi);
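On the layout question: soundData[] holds raw packed integer samples, not floating point values like the Accelerate sample's Signal[] array. Because the SSND chunk can sit anywhere after the FORM header, a reader has to walk the chunk list to find it. The sketch below does that over a file already loaded into memory. It assumes the plain 'AIFF' form type (not compressed AIFF-C), the function names are invented for this example, and in practice AudioFile or ExtAudioFile handles all of this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian unsigned integer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walk the chunks of an in-memory AIFF image and return a pointer to the
 * first sample byte of the 'SSND' chunk, or NULL if the image is malformed
 * or has no such chunk. On success *sound_bytes receives the number of
 * sample bytes. Hypothetical helper for illustration only. */
const unsigned char *aiff_find_sound_data(const unsigned char *file,
                                          size_t len, size_t *sound_bytes)
{
    assert(file != NULL && sound_bytes != NULL);
    if (len < 12 || memcmp(file, "FORM", 4) != 0
                 || memcmp(file + 8, "AIFF", 4) != 0)
        return NULL;
    size_t pos = 12;                       /* first chunk after the header */
    while (len - pos >= 8) {
        uint32_t ck_size = read_be32(file + pos + 4);
        const unsigned char *data = file + pos + 8;
        size_t remaining = len - pos - 8;
        if (ck_size > remaining)
            return NULL;                   /* chunk runs past end of file */
        if (memcmp(file + pos, "SSND", 4) == 0) {
            if (ck_size < 8)
                return NULL;               /* no room for offset, blockSize */
            uint32_t offset = read_be32(data);
            if (offset > ck_size - 8)
                return NULL;
            *sound_bytes = (size_t)ck_size - 8 - offset;
            return data + 8 + offset;      /* skip offset, blockSize, pad */
        }
        /* Chunks are padded to an even number of bytes. */
        size_t advance = (size_t)ck_size + (ck_size & 1u);
        if (advance > remaining)
            return NULL;
        pos += 8 + advance;
    }
    return NULL;
}
```

The sample size and channel count needed to interpret those bytes come from the separate 'COMM' chunk, which this sketch skips over.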
Coreaudio-api mailing list (email@hidden)