Re: Iterating through audio data
- Subject: Re: Iterating through audio data
- From: James Chandler Jr <email@hidden>
- Date: Thu, 26 May 2011 09:02:17 -0400
- Importance: Normal
Hi GW
Three applications come to mind for displaying audio waveforms (a likely
incomplete list):
* A real-time oscilloscope view which displays zoomed-in data "just in time".
Most oscilloscopes have multiple "trigger" features to synchronize wave displays
so it doesn't look like a bunch of garbage on the display. A common one is to
trigger each display sweep on positive zero-crossings, so that repeating waves
will show a stable picture of the wave. Other common options will trigger on a
negative zero-crossing. Or trigger when the audio passes some arbitrary user-set
threshold in a positive or negative direction.
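As a rough sketch of the trigger search described above (the names find_trigger and TriggerSlope are made up for illustration, and the samples are assumed to be floats already in memory):

```c
#include <stddef.h>

/* Trigger slope, like a scope's rising/falling switch (hypothetical names). */
typedef enum { TRIGGER_RISING, TRIGGER_FALLING } TriggerSlope;

/*
 * Return the index of the first sample at or after `start` where the signal
 * crosses `level` in the requested direction, or -1 if there is no crossing.
 * A level of 0.0f gives the plain zero-crossing trigger.
 */
long find_trigger(const float *samples, size_t count, size_t start,
                  float level, TriggerSlope slope)
{
    /* Each test needs the previous sample, so never start below index 1. */
    for (size_t i = (start > 0 ? start : 1); i < count; i++) {
        float prev = samples[i - 1];
        float cur = samples[i];
        if (slope == TRIGGER_RISING && prev < level && cur >= level)
            return (long)i;
        if (slope == TRIGGER_FALLING && prev > level && cur <= level)
            return (long)i;
    }
    return -1;
}
```

Each display frame would call this with `start` set to wherever the newest unread audio begins, and skip the redraw (or free-run) when it returns -1.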
In addition to a trigger function for each display frame, there is the
oscilloscope variable-sweep feature.
For instance, if the sweep is set fast to display 1 ms of data per frame, then at a
44.1 kHz sample rate you would only be concerned with interpolating and plotting
44.1 samples in each
display frame. Each display frame would start at an offset in the bigger audio
data stream wherever you can find that the data meets your simulated scope's
trigger conditions. With a fast sweep, and the program only updating the screen
maybe 10 to 60 times per second, the majority of the audio data would not be
displayed. You would only be displaying occasional little pieces of the audio,
selected to meet the trigger conditions and thereby present the user with a
stable waveform view.
Similarly, if the oscilloscope sweep is set slower, to perhaps 10 ms, 100 ms, or 1
second, then the task would be to interpolate whatever small or large hunks of
audio you want to display, squeezed into the pixel width of your oscilloscope
window on-screen. You would still begin each display frame offset into the
audio according to the trigger conditions.
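One way the sweep could be mapped onto the window, as a sketch only: the function name render_sweep and its parameters are assumptions, and it uses plain linear interpolation. For slow sweeps with many samples per pixel, point sampling like this can miss peaks, so a min/max per pixel is probably the better choice there.

```c
#include <stddef.h>

/*
 * Fill out[0..width-1] with one sweep of the scope display, starting at
 * trigger_index. Returns the number of pixels written, which is less than
 * `width` if the audio runs out before the sweep is complete.
 */
size_t render_sweep(const float *samples, size_t count, size_t trigger_index,
                    double sample_rate, double sweep_seconds,
                    float *out, size_t width)
{
    /* e.g. 44100 * 0.001 = 44.1 samples per frame for a 1 ms sweep */
    double samples_per_frame = sample_rate * sweep_seconds;
    double step = samples_per_frame / (double)width; /* samples per pixel */
    size_t x;
    for (x = 0; x < width; x++) {
        double pos = (double)trigger_index + (double)x * step;
        size_t i = (size_t)pos;
        if (i + 1 >= count)
            break; /* need samples[i] and samples[i + 1] */
        double frac = pos - (double)i;
        out[x] = (float)((1.0 - frac) * samples[i] + frac * samples[i + 1]);
    }
    return x;
}
```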
* An overview of a single wave file, such as an audio jukebox player or a stereo
editor window.
This can be pretty easy on modern computers (but less simple on tiny computers
with less cpu power and memory). Typical consumer computers and laptops have
sufficient memory to load a single song to memory so you don't have to play
direct from disk. It is so much simpler if the entire audio file can be in
memory.
You may want to pre-calculate cached overview files as was described in earlier
emails. But typical computers, with a single audio file in memory, are quick
enough to do the graphic display in real-time off the audio data in memory.
The graphic display window is just a "window" into a part of the audio file. For
a fully zoomed-out display of the entire song, divide the number of samples in the
song by the number of horizontal pixels in the audio display screen.
For instance, a 3 minute song at a 44.1 kHz sample rate would be 7,938,000 samples
per channel (180 seconds x 44,100). If the display window is 1000 pixels wide, then
each horizontal pixel needs to represent 7938 samples. There may be a better way,
but I would be inclined to loop thru the file in blocks of 7938 samples. For each
block I would find the Max and Min values in the block and draw a vertical line on
the screen between the Max and Min. So after you are done you have a 1000 pixel
display showing the Max and Min values of the audio in about as good a resolution
as you can get zoomed out that far.
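The block-by-block Max/Min pass might look something like the following. This is a minimal sketch assuming mono float samples already in memory; PixelColumn and build_overview are invented names, not part of any API.

```c
#include <stddef.h>

/* Min and max of the samples behind one horizontal pixel (hypothetical type). */
typedef struct { float min; float max; } PixelColumn;

/*
 * Reduce `count` samples to `width` min/max pairs, one per pixel column.
 * Block boundaries use integer math so every sample falls in exactly one
 * column even when count is not a multiple of width.
 */
void build_overview(const float *samples, size_t count,
                    PixelColumn *columns, size_t width)
{
    for (size_t x = 0; x < width; x++) {
        PixelColumn col = { 0.0f, 0.0f };
        if (count > 0) {
            size_t begin = (size_t)((unsigned long long)x * count / width);
            size_t end = (size_t)((unsigned long long)(x + 1) * count / width);
            if (end <= begin)
                end = begin + 1; /* fewer samples than pixels */
            col.min = col.max = samples[begin];
            for (size_t i = begin + 1; i < end; i++) {
                if (samples[i] < col.min) col.min = samples[i];
                if (samples[i] > col.max) col.max = samples[i];
            }
        }
        columns[x] = col;
    }
}
```

The drawing code would then draw one vertical line per column from `min` to `max`. With 7,938,000 samples and a width of 1000, each column covers 7938 samples.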
Perhaps a Max and Min of the average or RMS amplitude would make more sense in
some situations. But people are commonly concerned about clipping, and so the
absolute max and min can be important to know.
When the user zooms in closer, you just calculate how many samples will fit in
the entire window, and calculate how many samples to batch into each pixel's
vertical display line on screen. Drawing the audio overview is quicker zoomed-in
because you don't have to iterate thru as much audio data to fill the screen.
When the user finally zooms in so close that there is less than 1 sample per
pixel, then you have to start drawing "connect the dots" waveform lines,
interpolating a smooth line connecting the sample dots.
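The zoom bookkeeping above boils down to one ratio and one threshold. A small sketch, with invented names and the 1 sample per pixel switch point taken from the text:

```c
#include <stddef.h>

/* Which drawing method to use at the current zoom (hypothetical names). */
typedef enum { DRAW_MINMAX_COLUMNS, DRAW_CONNECTED_LINE } DrawMode;

/* How many samples one horizontal pixel represents for the visible range. */
double samples_per_pixel(size_t visible_samples, size_t width_pixels)
{
    if (width_pixels == 0)
        return 0.0;
    return (double)visible_samples / (double)width_pixels;
}

/* Below 1 sample per pixel, switch from min/max columns to connect-the-dots. */
DrawMode choose_draw_mode(double spp)
{
    return spp < 1.0 ? DRAW_CONNECTED_LINE : DRAW_MINMAX_COLUMNS;
}
```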
There are various ways to smooth the line connecting the dots. Some methods are
faster and some methods are more accurate. The most ideal would be a method
which draws on-screen as similar as possible to what the audio ought to look
like after it has passed thru an audio output D/A converter. The best view would
look like the band limited continuous signal being output from the soundcard D/A
converter.
* Multiple overviews of many tracks, in a typical multi-track program's Track
display.
In this case, there is probably too much calculation involved to display all the
audio overview data in real-time from the raw data. So it is typical to make
cached overview files of some kind, as described in earlier emails.
One time I used a cache file format which used two bytes to represent the Max
and Min of each batch of 256 samples. I would calculate the cache overview file
when the user adds an audio file to the project or after the user has recorded a
new track. The cache files are typically small enough to keep in memory.
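A cache along those lines could be built roughly as follows. The exact byte layout is an assumption here (one signed 8-bit Max and one signed 8-bit Min per block), and CacheEntry, build_cache, and the other names are invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK_SIZE 256 /* samples summarized by one cache entry */

/* Two bytes per block: assumed layout, signed 8-bit max then min. */
typedef struct { int8_t max; int8_t min; } CacheEntry;

/* Map a float in -1.0..1.0 to -127..127, clamping and truncating toward zero. */
static int8_t quantize(float v)
{
    if (v > 1.0f) v = 1.0f;
    if (v < -1.0f) v = -1.0f;
    return (int8_t)(v * 127.0f);
}

/* Number of entries needed; the last block may be partial. */
size_t cache_entry_count(size_t sample_count)
{
    return (sample_count + CACHE_BLOCK_SIZE - 1) / CACHE_BLOCK_SIZE;
}

/* Fill `entries` (sized by cache_entry_count) and return the entry count. */
size_t build_cache(const float *samples, size_t sample_count, CacheEntry *entries)
{
    size_t n = cache_entry_count(sample_count);
    for (size_t b = 0; b < n; b++) {
        size_t begin = b * CACHE_BLOCK_SIZE;
        size_t end = begin + CACHE_BLOCK_SIZE;
        if (end > sample_count)
            end = sample_count;
        float lo = samples[begin], hi = samples[begin];
        for (size_t i = begin + 1; i < end; i++) {
            if (samples[i] < lo) lo = samples[i];
            if (samples[i] > hi) hi = samples[i];
        }
        entries[b].max = quantize(hi);
        entries[b].min = quantize(lo);
    }
    return n;
}
```

With this layout, a 3 minute mono track at 44.1 kHz (7,938,000 samples) needs 31,008 entries, or about 62 KB.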
The 256 sample batching is pretty good for zooming in up to about 5.8 ms per
horizontal pixel in the display (256 / 44100 at a 44.1 kHz sample rate). When zooming out, I
would use multiple cache blocks to display 1 horizontal pixel on the screen
display.
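Combining several cache blocks into one pixel is just a Max of the Maxes and a Min of the Mins. A sketch under the same assumed two-byte entry layout (CacheEntry and columns_from_cache are invented names):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache layout: signed 8-bit max and min per 256-sample block. */
typedef struct { int8_t max; int8_t min; } CacheEntry;

/*
 * Build one merged entry per pixel column from the cached blocks.
 * entries_per_pixel = (seconds per pixel * sample rate) / 256, so it is
 * 1.0 at about 5.8 ms per pixel for 44.1 kHz audio.
 * Returns the number of columns filled (fewer than `width` if the cache ends).
 */
size_t columns_from_cache(const CacheEntry *entries, size_t entry_count,
                          double entries_per_pixel,
                          CacheEntry *columns, size_t width)
{
    size_t x;
    for (x = 0; x < width; x++) {
        size_t begin = (size_t)((double)x * entries_per_pixel);
        size_t end = (size_t)((double)(x + 1) * entries_per_pixel);
        if (begin >= entry_count)
            break;
        if (end <= begin)
            end = begin + 1; /* zoomed in past one block per pixel: looks blocky */
        if (end > entry_count)
            end = entry_count;
        CacheEntry merged = entries[begin];
        for (size_t i = begin + 1; i < end; i++) {
            if (entries[i].max > merged.max) merged.max = entries[i].max;
            if (entries[i].min < merged.min) merged.min = entries[i].min;
        }
        columns[x] = merged;
    }
    return x;
}
```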
When zooming in closer than 5.8 ms per pixel, the display either begins to look
blocky, or you switch to a different method of display. When zoomed in real
close, you don't have to traverse a lot of audio data to draw the waveform, so
it may be feasible to calculate close-zoomed views directly from the raw audio,
as in the stereo-editor case described above.
=====
There are likely smarter ways to skin the cat. Those are just some methods that
can work.
James Chandler Jr.
-----Original Message-----
From: GW Rodriguez
Sent: Wednesday, May 25, 2011 10:58 PM
To: Brian Willoughby
Cc: email@hidden
Subject: Re: Iterating through audio data
Yes I assumed that I wasn't going to plot each sample. My plan was to
start with every 100 samples and see how it looks.
So here's what I'm looking at doing:
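// open audio file (audioFileRef) and find its length in sample frames (length)
// Assumes the client data format (kExtAudioFileProperty_ClientDataFormat) was
// already set to mono 32-bit float, so each frame read is one float in -1.0..1.0.
-(void) getData {
    float sample = 0.0f;
    AudioBufferList tempList;
    tempList.mNumberBuffers = 1;
    tempList.mBuffers[0].mNumberChannels = 1;
    tempList.mBuffers[0].mData = &sample;
    for (SInt64 i = 0; i < length; i += 50) {
        OSStatus err = ExtAudioFileSeek(audioFileRef, i);
        if (err != noErr) break;
        UInt32 frames = 1; // in: frames requested; out: frames actually read
        tempList.mBuffers[0].mDataByteSize = sizeof(sample);
        err = ExtAudioFileRead(audioFileRef, &frames, &tempList);
        if (err != noErr || frames == 0) break;
        NSLog(@"frame %lld: %f", (long long)i, sample); // print amplitude
    }
}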
So I'm not sure about the AudioBuffer and how to get the float values (-1.0 to 1.0).
Thanks a bunch Brian
GW Rodriguez
Coreaudio-api mailing list (email@hidden)