On Feb 11, 2009, at 12:21 PM, Marco Papa wrote:
I have indeed seen several mentions in the iPhone Dev. Forum of the fact that the audio file must be "an exact multiple of the packet size", otherwise the player will fill the missing portion of the last packet with "silence" when looping, but this is the first time I see mention of "enough context before the beginning and after the end of the loop." What is that?
The encoder doesn't process samples in isolation. It's always looking at the past history of the waveform, up to some (fairly short) horizon. So if you cut a snippet from an audio track and encode it by itself, the output data (and resulting played-back waveform) won't be identical to that section of the original track, because there's no prior context.
So ideally, when encoding a loop, you want to start the encoder a little before the end of the loop (farther back than the encoder's horizon), let the audio wrap around and keep running until it reaches the end of the loop a second time, and then discard the encoded data for the audio before the start of the loop. That way the initial samples of the loop get encoded using the context acquired from the previous [i.e. last] samples, not from a clean slate.
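Here's a rough sketch of that priming trick in Python. To be clear about what's made up: `delta_encode` is only a toy stand-in for "an encoder with memory" (each output depends on the previous input sample), and `encode_loop_with_context` and `lead_in` are names invented for this illustration. A real MP3 or AAC encoder works on whole frames and adds its own delay, so this only shows the idea, not how you'd drive an actual codec.

```python
def delta_encode(samples):
    """Toy encoder with a one-sample memory: output = sample - previous sample."""
    prev = 0.0  # clean slate: no prior context
    out = []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out


def encode_loop_with_context(loop, encode, lead_in):
    """Prime the encoder with the tail of the loop, then drop the priming output.

    lead_in is how many samples before the loop start we begin encoding;
    it should be longer than the encoder's horizon.
    """
    if not 0 < lead_in <= len(loop):
        raise ValueError("lead_in must be between 1 and len(loop)")
    primed = loop[-lead_in:] + loop   # tail of the loop, then the whole loop
    return encode(primed)[lead_in:]   # discard what was encoded for the lead-in


loop = [1.0, 2.0, 3.0, 4.0]
print(delta_encode(loop))                               # encoded from a clean slate
print(encode_loop_with_context(loop, delta_encode, 2))  # encoded with loop context
```

The two results differ only in the first sample: the clean-slate version encodes it against silence, while the primed version encodes it against the last sample of the loop, which is what actually precedes it when the loop repeats.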
If that isn't clear, think of a simpler example of trying to apply a 200ms echo effect to a loop. If you just take the bare loop and run it through the echo, it won't sound right when played back, because the first 200ms of the loop won't have any echo in it! Instead, you have to apply the echo to the running loop, so that the last 200ms gets echoed onto the first 200ms. Now, for "echo" substitute "secret MP3 encoding sauce" and you've got it.
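The echo version is easy to try out in a few lines of Python. This is just a sketch under simplifying assumptions: a single-tap echo with no feedback, the delay given in samples rather than milliseconds, and the function names `echo_bare` and `echo_running` invented for the illustration.

```python
def echo_bare(loop, delay, gain=0.5):
    """Echo applied to the loop in isolation: the first `delay` samples get no echo."""
    return [x + (gain * loop[i - delay] if i >= delay else 0.0)
            for i, x in enumerate(loop)]


def echo_running(loop, delay, gain=0.5):
    """Echo applied to the running loop: the tail is echoed onto the head."""
    n = len(loop)
    if not 0 < delay <= n:
        raise ValueError("delay must be between 1 and len(loop)")
    # (i - delay) % n wraps around to the end of the loop for the first samples
    return [x + gain * loop[(i - delay) % n] for i, x in enumerate(loop)]


loop = [1.0, 2.0, 3.0, 4.0]
print(echo_bare(loop, delay=1))     # first sample is dry
print(echo_running(loop, delay=1))  # first sample carries the echo of the last one
```

Everything after the first `delay` samples comes out the same either way; only the start of the loop changes, and that's exactly the part that sounds wrong at the loop point.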
—Jens