Re: Iterating through audio data

Subject: Re: Iterating through audio data
From: James Chandler Jr <email@hidden>
Date: Thu, 26 May 2011 09:02:17 -0400
Importance: Normal

Hi GW

Three applications come to mind for displaying audio waveforms (a likely incomplete list):

* A real-time oscilloscope view which displays zoomed-in data "just in time".

Most oscilloscopes have multiple "trigger" features to synchronize wave displays so it doesn't look like a bunch of garbage on the display. A common one is to trigger each display sweep on positive zero-crossings, so that repeating waves will show a stable picture of the wave. Other common options trigger on a negative zero-crossing, or when the audio passes some arbitrary user-set threshold in a positive or negative direction.
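
As a rough sketch of the trigger search described above, assuming the audio is already in memory as floats: the function name find_rising_trigger and its parameters are made up for illustration, and a negative-going trigger would simply flip the two comparisons.

```c
#include <stddef.h>

/* Return the index of the first rising crossing of `threshold` at or after
 * `start`, or -1 if there is none. A rising crossing is a sample at or above
 * the threshold whose predecessor was below it. Passing threshold = 0.0f
 * gives the positive zero-crossing trigger. */
long find_rising_trigger(const float *samples, size_t count,
                         size_t start, float threshold)
{
    /* Index 0 has no predecessor, so the search begins at 1 at the earliest. */
    size_t first = (start == 0) ? 1 : start;
    for (size_t i = first; i < count; i++) {
        if (samples[i - 1] < threshold && samples[i] >= threshold)
            return (long)i;
    }
    return -1;
}
```

Each display frame would then start drawing at the returned index, and the next search would begin somewhere after the end of that frame.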

In addition to a trigger function for each display frame, there is the oscilloscope variable-sweep feature.

For instance, if the sweep is set fast to display 1 ms of data per frame, then you would only be concerned with interpolating and plotting about 44 samples in each display frame (44.1 at a 44.1 kHz sample rate). Each display frame would start at an offset in the bigger audio data stream wherever you can find that the data meets your simulated scope's trigger conditions. With a fast sweep, and the program only updating the screen maybe 10 to 60 times per second, the majority of the audio data would not be displayed. You would only be displaying occasional little pieces of the audio, selected to meet the trigger conditions and thereby present the user with a stable waveform view.

Similarly, if the oscilloscope sweep is set slower to perhaps 10 ms, 100 ms, or 1 second, then the task would be to interpolate whatever small or larger hunks of audio you want to display, squeezed into the pixel width of your oscilloscope window on-screen. You would still begin each display frame offset into the audio according to the trigger conditions.
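
A minimal sketch of one sweep frame, assuming plain linear interpolation between neighbouring samples (the cheapest choice, not the most accurate one). The name render_sweep and all of its parameters are hypothetical.

```c
#include <stddef.h>

/* Fill `columns` (one value per horizontal pixel) with the signal covering
 * `sweep_seconds` of audio beginning at `trigger_index`, using linear
 * interpolation between neighbouring samples. Returns the number of columns
 * written, which is less than `width` if the audio runs out first. */
size_t render_sweep(const float *samples, size_t count, size_t trigger_index,
                    double sample_rate, double sweep_seconds,
                    float *columns, size_t width)
{
    /* Samples spanned by one frame, e.g. 44.1 for 1 ms at 44.1 kHz. */
    double span = sample_rate * sweep_seconds;
    size_t written = 0;
    for (size_t x = 0; x < width; x++) {
        double pos = (double)trigger_index + span * (double)x / (double)width;
        size_t i = (size_t)pos;
        if (i + 1 >= count)
            break;                      /* need samples[i] and samples[i + 1] */
        double frac = pos - (double)i;
        columns[x] = (float)(samples[i] + frac * (samples[i + 1] - samples[i]));
        written++;
    }
    return written;
}
```

This only suits fast sweeps where a frame spans fewer samples than pixels. For slow sweeps that pack many samples into each pixel, a max/min reduction per pixel is the better fit.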

* An overview of a single wave file, such as an audio jukebox player or a stereo editor window.

This can be pretty easy on modern computers (but less simple on tiny computers with less CPU power and memory). Typical consumer computers and laptops have sufficient memory to load a single song into memory, so you don't have to play directly from disk. It is so much simpler if the entire audio file can be in memory.

You may want to pre-calculate cached overview files as was described in earlier emails. But typical computers, with a single audio file in memory, are quick enough to do the graphic display in real-time off the audio data in memory.

The graphic display window is just a "window" into a part of the audio file. For a fully zoomed-out display of the entire song, divide the number of samples in the song by the number of horizontal pixels in the audio display screen.

For instance, a 3 minute song at a 44.1 kHz sample rate would be 7,938,000 samples per channel (180 s x 44,100). If the display window is 1000 pixels wide, then each horizontal pixel needs to represent 7938 samples. There may be a better way, but I would be inclined to loop thru the file in blocks of 7938 samples. For each block I would find the Max and Min values in the block and draw a vertical line on the screen between the Max and Min. So after you are done you have a 1000 pixel display showing the Max and Min values of the audio in about as good a resolution as you can get zoomed out that far.
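
The block loop above might look something like this, assuming mono float samples already in memory. The Peak struct and compute_peaks are illustrative names, and the block boundaries are computed per pixel so that a sample count which does not divide evenly by the window width is still fully covered.

```c
#include <stddef.h>

typedef struct { float min; float max; } Peak;

/* Reduce samples[start .. start + length) to `width` max/min pairs, one per
 * horizontal pixel. Returns 1 on success, or 0 if there is less than one
 * sample per pixel (that case calls for connect-the-dots drawing instead). */
int compute_peaks(const float *samples, size_t start, size_t length,
                  Peak *peaks, size_t width)
{
    if (width == 0 || length < width)
        return 0;
    for (size_t x = 0; x < width; x++) {
        /* Integer boundaries spread any remainder evenly across the pixels. */
        size_t begin = start + length * x / width;
        size_t end   = start + length * (x + 1) / width;
        float lo = samples[begin], hi = samples[begin];
        for (size_t i = begin + 1; i < end; i++) {
            if (samples[i] < lo) lo = samples[i];
            if (samples[i] > hi) hi = samples[i];
        }
        peaks[x].min = lo;   /* draw a vertical line from min ... */
        peaks[x].max = hi;   /* ... to max at pixel column x      */
    }
    return 1;
}
```

For the 3 minute example, start = 0, length = 7938000 and width = 1000 give exactly 7938 samples per column.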

Perhaps a Max and Min of the average or RMS amplitude would make more sense in some situations. But people are commonly concerned about clipping, and so the absolute max and min can be important to know.

When the user zooms in closer, you just calculate how many samples will fit in the entire window, and calculate how many samples to batch into each pixel's vertical display line on screen. Drawing the audio overview is quicker zoomed-in because you don't have to iterate thru as much audio data to fill the screen.

When the user finally zooms in so close that there is less than one sample per pixel, you have to start drawing "connect the dots" waveform lines, interpolating a smooth line connecting the sample dots.
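
The zoom arithmetic and the switch between the two drawing styles could be sketched as follows. The names samples_per_pixel, choose_draw_mode and the DrawMode enum are invented for this example.

```c
#include <stddef.h>

typedef enum { DRAW_PEAKS, DRAW_LINES } DrawMode;

/* How many samples each horizontal pixel has to represent at this zoom. */
double samples_per_pixel(size_t visible_samples, size_t width_pixels)
{
    if (width_pixels == 0)
        return 0.0;
    return (double)visible_samples / (double)width_pixels;
}

/* Max/min lines while at least one sample lands on each pixel,
 * connect-the-dots once the view is zoomed in past that point. */
DrawMode choose_draw_mode(size_t visible_samples, size_t width_pixels)
{
    return samples_per_pixel(visible_samples, width_pixels) < 1.0
               ? DRAW_LINES
               : DRAW_PEAKS;
}
```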

There are various ways to smooth the line connecting the dots. Some methods are faster and some methods are more accurate. The most ideal would be a method which draws on-screen as similar as possible to what the audio ought to look like after it has passed thru an audio output D/A converter. The best view would look like the band limited continuous signal being output from the soundcard D/A converter.
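
As one example from the "faster" end of that range, here is a sketch of Catmull-Rom cubic interpolation. It passes through every sample and gives a smooth curve, but it is not band-limited, so it only approximates what the D/A converter would output; a windowed-sinc interpolator would be closer to the ideal at a higher cost. The function names are hypothetical.

```c
#include <stddef.h>

/* Fetch a sample, clamping the index at both ends of the buffer. */
static float sample_at(const float *s, size_t count, long i)
{
    if (i < 0) i = 0;
    if (i >= (long)count) i = (long)count - 1;
    return s[i];
}

/* Catmull-Rom interpolation at fractional sample position `pos`.
 * Assumes count > 0 and 0 <= pos <= count - 1. */
float interpolate_cubic(const float *s, size_t count, double pos)
{
    long  i  = (long)pos;
    float t  = (float)(pos - (double)i);
    float p0 = sample_at(s, count, i - 1);
    float p1 = sample_at(s, count, i);
    float p2 = sample_at(s, count, i + 1);
    float p3 = sample_at(s, count, i + 2);
    return 0.5f * (2.0f * p1
                   + (p2 - p0) * t
                   + (2.0f * p0 - 5.0f * p1 + 4.0f * p2 - p3) * t * t
                   + (3.0f * p1 - p0 - 3.0f * p2 + p3) * t * t * t);
}
```

To draw, evaluate this at the fractional sample position of each pixel column and connect the results with line segments.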

* Multiple overviews of many tracks, in a typical multi-track program's Track display.

In this case, there is probably too much calculation involved to display all the audio overview data in real-time from the raw data. So it is typical to make overview cached files of some kind, as described in earlier emails.

One time I used a cache file format which used two bytes to represent the Max and Min of each batch of 256 samples. I would calculate the cache overview file when the user adds an audio file to the project or after the user has recorded a new track. The cache files are typically small enough to keep in memory.
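
A sketch of building that kind of cache, under the assumption that "two bytes" means one signed 8-bit value each for Min and Max and that the source audio is float in the -1.0 to 1.0 range. The exact byte layout of the original cache format is not given above, so CachePeak, quantize and build_cache are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_BLOCK 256   /* samples summarized by each cache entry */

typedef struct { int8_t min; int8_t max; } CachePeak;   /* two bytes */

/* Map a float in [-1, 1] to a signed byte in [-127, 127], rounding to nearest. */
static int8_t quantize(float v)
{
    if (v > 1.0f)  v = 1.0f;
    if (v < -1.0f) v = -1.0f;
    return (int8_t)(v * 127.0f + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Summarize `count` samples into one CachePeak per 256-sample block (the
 * last block may be short). `cache` must have room for
 * (count + CACHE_BLOCK - 1) / CACHE_BLOCK entries. Returns entries written. */
size_t build_cache(const float *samples, size_t count, CachePeak *cache)
{
    size_t blocks = 0;
    for (size_t begin = 0; begin < count; begin += CACHE_BLOCK) {
        size_t end = begin + CACHE_BLOCK;
        if (end > count) end = count;
        float lo = samples[begin], hi = samples[begin];
        for (size_t i = begin + 1; i < end; i++) {
            if (samples[i] < lo) lo = samples[i];
            if (samples[i] > hi) hi = samples[i];
        }
        cache[blocks].min = quantize(lo);
        cache[blocks].max = quantize(hi);
        blocks++;
    }
    return blocks;
}
```

At this size a 3 minute mono track (7,938,000 samples) needs about 31,000 entries, roughly 62 KB, which is why keeping the caches in memory is practical.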

The 256 sample batching is pretty good for zooming in up to about 5.8 ms per horizontal pixel in the display (256 / 44,100 s at a 44.1 kHz sample rate). When zooming out, I would use multiple cache blocks to display 1 horizontal pixel on the screen display.
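
Combining several cache blocks into one pixel is the same max/min reduction applied to the cache entries instead of the raw samples. A sketch, reusing an assumed two-byte CachePeak entry (hypothetical names throughout):

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int8_t min; int8_t max; } CachePeak;

/* Reduce cache[first .. first + blocks) to `width` pixel columns, taking the
 * lowest min and the highest max of the cache blocks that fall on each
 * column. Returns 1 on success, or 0 if there is less than one cache block
 * per pixel, in which case the view is zoomed in too far for the cache. */
int cache_to_columns(const CachePeak *cache, size_t first, size_t blocks,
                     CachePeak *columns, size_t width)
{
    if (width == 0 || blocks < width)
        return 0;
    for (size_t x = 0; x < width; x++) {
        size_t begin = first + blocks * x / width;
        size_t end   = first + blocks * (x + 1) / width;
        CachePeak col = cache[begin];
        for (size_t i = begin + 1; i < end; i++) {
            if (cache[i].min < col.min) col.min = cache[i].min;
            if (cache[i].max > col.max) col.max = cache[i].max;
        }
        columns[x] = col;
    }
    return 1;
}
```

The failure return corresponds to the point, around 5.8 ms per pixel, where the display would otherwise start to look blocky.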

When zooming in closer than 5.8 ms per pixel, the display either begins to look blocky, or you switch to a different method of display. When zoomed in real close, you don't have to traverse a lot of audio data to draw the waveform, so it may be feasible to calculate close-zoomed views directly from the raw audio, as in the stereo-editor case described above.

=====

There are likely smarter ways to skin the cat. Those are just some methods that can work.

James Chandler Jr.

-----Original Message-----
From: GW Rodriguez
Sent: Wednesday, May 25, 2011 10:58 PM
To: Brian Willoughby
Cc: email@hidden
Subject: Re: Iterating through audio data

Yes I assumed that I wasn't going to plot each sample. My plan was to
start with every 100 samples and see how it looks.

So here's what I'm looking at doing:

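// open audio file
// find length in samples of the file
// Assumes audioFileRef is open, length is the file length in frames, and the
// client format (kExtAudioFileProperty_ClientDataFormat) is mono 32-bit float.
    -(void) getData {
         Float32 sample = 0;
         AudioBufferList tempList;
         tempList.mNumberBuffers = 1;
         for (SInt64 i = 0; i < length; i = i+50) {
              OSStatus err = ExtAudioFileSeek(audioFileRef, i);
              if (err != noErr) break;
              // Point the buffer at one sample; the read may alter mDataByteSize.
              tempList.mBuffers[0].mNumberChannels = 1;
              tempList.mBuffers[0].mDataByteSize = sizeof(sample);
              tempList.mBuffers[0].mData = &sample;
              UInt32 frames = 1;   // in: frames wanted, out: frames actually read
              err = ExtAudioFileRead(audioFileRef, &frames, &tempList);
              if (err != noErr || frames == 0) break;
              NSLog(@"frame %lld: %f", i, sample);   // amplitude, -1.0 to 1.0
          }
      }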

So I'm not sure about the AudioBuffer and how to get the float values (-1.0 to 1.0).
Thanks a bunch Brian


GW Rodriguez





