Re: embedding sound problem
- Subject: Re: embedding sound problem
- From: James Chandler Jr <email@hidden>
- Date: Wed, 7 Jul 2010 16:42:24 -0400
Hi Waverly
The phrase "close enough for rock'n'roll" is a slang phrase which means a solution that may be far from perfect, but it is good enough to suffice. Another common variant is "close enough for jazz." But jazz has generally high performance standards, and so the jazz variant of the phrase doesn't imply enough slop in the tolerated imperfect solution <g>. Also common-- "close enough for government work."
In dynamic range compressors, there is the common 'compressor' which does not adjust audio below a threshold, but it reduces dynamic range above a threshold. For instance, perhaps a compressor would not affect phrases in a signal whose current level is below -12 dB, or whatever is selected as the threshold.
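As a rough illustration of that threshold behaviour, here is a minimal sketch of a compressor's static gain curve, working in dB. The -12 dB threshold is the example figure from the paragraph above; the 4:1 ratio and the function name are assumptions made up for the illustration.

```python
def compressor_gain_db(level_db, threshold_db=-12.0, ratio=4.0):
    """Gain (in dB) a simple downward compressor applies at a given input level.

    threshold_db follows the -12 dB example in the text. ratio=4.0 is an
    assumed setting, not something taken from the original message.
    """
    if level_db <= threshold_db:
        return 0.0  # at or below the threshold the signal is left alone
    # Above the threshold, every `ratio` dB of input rise gives 1 dB of output rise.
    over_db = level_db - threshold_db
    return over_db / ratio - over_db


for level_db in (-30.0, -12.0, -6.0, 0.0):
    print(level_db, compressor_gain_db(level_db))
```

Levels at or below the threshold get 0 dB of gain change, and anything above it is pulled back toward the threshold.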
Then there is Automatic Gain Control, which has a much wider range of signals processed.
http://en.wikipedia.org/wiki/Automatic_gain_control
There isn't any threshold in an AGC or leveling amp. The process just attempts to keep the output at a desired level regardless whether the input is loud or soft.
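A minimal sketch of that gain law, again in dB. The target level, the gain ceiling, and the function name are assumptions for illustration. The ceiling is an addition of this sketch (so that near-silence is not boosted without limit), not something described above.

```python
def agc_gain_db(level_db, target_db=-20.0, max_gain_db=30.0):
    """Gain (in dB) that moves the measured level to the target level.

    There is no threshold: loud input is turned down and soft input is
    turned up. target_db and max_gain_db are assumed example values.
    """
    gain_db = target_db - level_db
    return min(gain_db, max_gain_db)  # cap the boost for very quiet input


for level_db in (-90.0, -40.0, -20.0, -6.0):
    print(level_db, agc_gain_db(level_db))
```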
AGC as often implemented, for example in the cheap portable tape recorders of the past, uses fairly long attack and release time constants. Short-term dynamics lasting less than one or two seconds are preserved, but in the long term very quiet phrases in the audio come out at about the same level as very loud phrases.
Your envelope Attack and Release time constants can affect what the compressor or AGC is responding to. With a short attack time constant, it would tend to work as a peak-responding AGC, and level all peaks in the program material. With longer attack and release time constants, it would allow short-term amplitude variations, but tend to work as an average-responding AGC.
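To make the peak-versus-average point concrete, here is a minimal sketch of a one-pole envelope follower with separate attack and release time constants. The function names, the 8 kHz sample rate, the 440 Hz test tone, and the particular time constants are all assumptions chosen for the demonstration.

```python
import math


def smoothing_coeff(time_s, sample_rate):
    """One-pole smoothing coefficient for a time constant given in seconds."""
    return math.exp(-1.0 / (time_s * sample_rate))


def follow_envelope(samples, sample_rate, attack_s, release_s):
    """Track the rectified signal, rising at the attack rate and falling at the release rate."""
    attack = smoothing_coeff(attack_s, sample_rate)
    release = smoothing_coeff(release_s, sample_rate)
    env = 0.0
    out = []
    for x in samples:
        rectified = abs(x)
        coeff = attack if rectified > env else release
        env = coeff * env + (1.0 - coeff) * rectified
        out.append(env)
    return out


sample_rate = 8000  # assumed, kept low so the pure-Python loop runs quickly
tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]

peakish = follow_envelope(tone, sample_rate, attack_s=0.001, release_s=0.5)
averagish = follow_envelope(tone, sample_rate, attack_s=0.1, release_s=0.1)
print(round(peakish[-1], 2), round(averagish[-1], 2))
```

With the 1 ms attack the envelope rides close to the peak of the full-scale sine, while with equal 100 ms attack and release it settles near the rectified average (about 0.64 for a sine).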
So once you have an AGC algorithm working (one AGC to operate on the source, and another AGC to operate on the mask), you could easily experiment with the time constants to see which works better: leveling against peaks (short time constants), leveling against short-term averages (medium time constants), or leveling against long-term averages (long time constants).
Perhaps for your use, an attack time in the ballpark of 20 to 100 ms, and a release time in the ballpark of 1 or 2 seconds? It is just a guess, based on common AGC settings for old electronic gear. Maybe something different would be better.
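Putting those guessed settings into a minimal AGC sketch: the 50 ms attack and 1.5 s release come from the ballpark figures above, while the target level, the gain ceiling, the sample rate, and the test signal are assumptions. The same function could be run once on the source and once on the mask. This is only one simple way to do it, not a reference implementation.

```python
import math


def agc(samples, sample_rate, target=0.25, attack_s=0.05, release_s=1.5,
        max_gain=32.0):
    """Average-sensing AGC: divide a target level by a smoothed envelope.

    target and max_gain (roughly 30 dB) are assumed values for illustration.
    """
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    env = 0.0
    out = []
    for x in samples:
        rectified = abs(x)
        coeff = attack if rectified > env else release
        env = coeff * env + (1.0 - coeff) * rectified
        gain = min(target / env, max_gain) if env > 0.0 else max_gain
        out.append(x * gain)
    return out


def mean_abs(samples):
    return sum(abs(x) for x in samples) / len(samples)


sample_rate = 8000  # assumed, kept low so the pure-Python loop runs quickly
quiet = [0.02 * math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
         for n in range(3 * sample_rate)]
loud = [0.5 * math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
        for n in range(3 * sample_rate)]
leveled = agc(quiet + loud, sample_rate)

# Compare the last second of each section, after the envelope has settled.
quiet_out = mean_abs(leveled[2 * sample_rate:3 * sample_rate])
loud_out = mean_abs(leveled[5 * sample_rate:6 * sample_rate])
print(round(quiet_out, 3), round(loud_out, 3))
```

The quiet and loud sections differ by about 28 dB going in, and should come out at roughly the same level once the envelope has settled. Note that the output level tracks the target only approximately, because the asymmetric attack and release make the detector sit somewhere between average and peak.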
You could also do RMS sensing for the control envelope, but I'd guess that average sensing might be 'close enough for rock'n'roll'.
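For a feel of how far apart the two detectors are, here is a small sketch comparing average (mean of the rectified signal) and RMS sensing on a sine wave. The function names, sample rate, and test tone are assumptions for the illustration.

```python
import math


def average_level(samples):
    """Mean of the rectified signal (average sensing)."""
    return sum(abs(x) for x in samples) / len(samples)


def rms_level(samples):
    """Root-mean-square level (RMS sensing)."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


sample_rate = 8000  # assumed
# One second of a full-scale 440 Hz sine, i.e. a whole number of cycles.
tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(sample_rate)]

avg = average_level(tone)
rms = rms_level(tone)
print(round(avg, 4), round(rms, 4), round(20.0 * math.log10(rms / avg), 2))
```

For a sine the two readings differ by a bit under 1 dB. For a steady waveform shape that is a fixed offset which the target level can absorb; the two detectors only diverge when the crest factor of the material changes.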
One other trick which MIGHT be useful-- Use frequency-shaping to feed the envelope detection. It may make the auto-level more 'constant to the ear' to use an inverse Fletcher-Munson curve or some other kind of frequency weighting. For instance, on USA sound level meters, one often has a choice of flat, A weighting, or C weighting. Also, a choice of fast or slow response.
http://en.wikipedia.org/wiki/Sound_level_meter
http://en.wikipedia.org/wiki/A-weighting
Perhaps it could make the digital algorithm more 'scientifically sound' if one could steal weightings and time constants from one of the sound level metering standards?
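As a starting point for borrowing from the metering standards, here is a sketch of the commonly published A-weighting magnitude formula, along with the conventional Fast (125 ms) and Slow (1 s) sound level meter time constants. The function and constant names are made up for the illustration. This only evaluates the curve at a given frequency; actually weighting the signal ahead of the envelope detector would need a filter designed to approximate the curve, which is not shown.

```python
import math

# Conventional sound level meter time weightings, in seconds.
FAST_TIME_CONSTANT_S = 0.125
SLOW_TIME_CONSTANT_S = 1.0


def a_weighting_db(freq_hz):
    """A-weighting gain in dB at freq_hz (must be > 0), about 0 dB at 1 kHz."""
    f2 = freq_hz * freq_hz
    numerator = (12194.0 ** 2) * f2 * f2
    denominator = ((f2 + 20.6 ** 2)
                   * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
                   * (f2 + 12194.0 ** 2))
    # The +2.0 dB offset normalizes the curve to roughly 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.0


for freq_hz in (50.0, 100.0, 1000.0, 4000.0, 10000.0):
    print(freq_hz, round(a_weighting_db(freq_hz), 1))
```

The curve strongly de-emphasizes low frequencies (roughly -19 dB at 100 Hz) and is close to flat between 1 and 4 kHz, so an envelope detector fed through it would react less to bass content than a flat detector would.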
Apologies if these ideas are too far in left field. Actually applying these ideas is not uber-complicated, once one decides what should be done.
jcjr
On Jul 7, 2010, at 2:31 PM, Edwards, Waverly wrote:
>
>>> I wonder what would be 'close enough for rock'n'roll'? Would it be too...
>
> Rock-n-roll would not be used because of the high dynamic range. It would need to be a source that did not vary by a great amount.
>
>>>
> If you had some well-defined method of doing it, then it ought to be valid for instance comparing a -25 dB mix against a -20 dB mix or whatever, even if the method isn't exactly 'perfect'? Whatever a perfect method might be.
> <<
>
> Unfortunately, I do not have a well-defined method in the digital realm. Using physical objects such as loudspeakers, there is a well-defined way but this research needs to be better controlled and measurable. I have done quite a bit of research on the subject but how it is done digitally is not clear to me at this time.
>
> I will work towards using average magnitude but I suspect there is a better way. I just do not know it yet.