
International Tonmeister Symposium
Oct. 31, 2005 Schloss Hohenkammer

Improving 5.1 and Stereophonic Mastering/Monitoring
by Using Ambiophonic Techniques

By Ralph Glasgal, Ambiophonics Institute, 4 Piermont Road Rockleigh, New Jersey 07647 USA


It is desirable that clients judging a recording at a session, or a mastering engineer evaluating mic balances, panning algorithms, center-channel level, virtual sound positioning, or ambience levels, have a control-room monitoring system that is uncompromised by the inherent defects of the stereo triangle or the 5.1 speaker array. Keeping the ITDs, ILDs, and pinna cues captured by the microphones intact, both when a recording artist auditions the raw session and later during mastering, increases the odds of early artist approval and provides a more consistent basis for evaluating any subjective postprocessing. Rear ambience channels may also sound more musical if generated by convolution with the latest libraries of 3D hall/theater impulse responses rather than by attempting to record them live. These convolved surrounds should be compared with the rear mic signals, if such have been obtained during an acoustic recording session in a concert hall, opera house, or church.

1. Stereophonic versus Binaural Monitoring

All human sound localization, with the eyes closed, is based on the cues provided by interaural time differences (ITD) between the ear canals, interaural level differences (ILD) between the ear canals, and the one- and two-eared pinna functions. A single pinna can act as a direction finder for sounds with energy above 800 Hz or so; this is why an individual with hearing in only one ear can function almost normally. There are also dual-pinna direction-finding functions that allow localization to within half a degree, even when there is no ITD or ILD, provided complex higher frequencies or transients are present. The ITD and ILD functions work really well only for signals with energy below 1000 Hz. Thus, where complex sound fields such as music are involved, localization is degraded if any of these parameters are missing or distorted by the recording or reproduction method. Ideally, all three localization cues, ILD, ITD, and pinna, should be present and in agreement to provide physiological verisimilitude and thus a less strained monitoring experience.
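As a rough illustration of how the ITD cue scales with source angle, the following sketch uses Woodworth's classical spherical-head approximation; the head radius and speed of sound are assumed round numbers for illustration, not measurements from the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at room temperature
HEAD_RADIUS = 0.0875    # m, average adult head (assumed illustrative value)

def woodworth_itd_us(azimuth_deg: float) -> float:
    """Interaural time difference in microseconds for a distant source,
    per Woodworth's spherical-head model: ITD = (r / c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta)) * 1e6

# ITD grows monotonically with azimuth, topping out around 650-700 us
# at 90 degrees for an average head.
for az in (0, 30, 60, 90):
    print(f"{az:2d} deg -> {woodworth_itd_us(az):5.0f} us")
```

Only sources off to one side generate large ITDs; as the text notes, this cue is dominant only where the signal has energy below about 1000 Hz.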

Unlike everyday binaural hearing, the ability to detect the sonic illusion of phantom images between the speakers of the stereo triangle, or the two frontal triangles of 5.1, differs greatly from individual to individual. Head size, pinna shape, and other genetic aspects of an individual’s hearing mechanism vary to the same extent that individuals differ in their ability to see optical illusions. Thus expecting musicians or clients to hear an adjustment a record producer makes in the same way the producer heard it is often unrealistic. But if the track being monitored is converted to a binaural-like or everyday-hearing format that does not rely on stereophonic sonic-illusion imaging, then all monitoring parties will likely hear the same thing and will be better able to agree on what needs to be modified. Such modifications will then be more likely to suit a larger number of home buyers, even if they listen via a stereo triangle or 5.1 arrangement that is nothing like the monitoring system.

Unfortunately, neither the 60 degree stereo triangle nor the two 30 degree side-by-side triangles of 5.1 is capable of preserving all the localization cues that have been captured by the recording microphone. That is, most stereo or surround microphone arrays almost always gather more ILD and ITD than is ever heard in the monitoring room. Thus when adjustments are made in channel balance, spot mic balances, panning controls, equalization, etc., or even when a take is played back for a client, decisions are not made with all the mic-captured cues present and audible. Unwise adjustments may therefore be made to compensate for monitoring anomalies that are unique to the control-room system or to the ears of the monitoring engineer or his client. This is true both for recordings made with microphones and for electronic music made with virtual sound software. In the following discussion we will consider a stereophonic system, but the same reasoning applies to the LCR part of the 5.1 methodology.

2. Stereophonic Monitoring Pitfalls

We consider now several combinations of common microphone arrangements, comparing what is captured with what is generated during monitoring. In figure 1 a pair of slightly-more-than-head-spaced omnis records an ITD of approximately 900 microseconds for an instrument way off to the side. However, when played back over speakers spaced +/- 30 degrees, the sensed ITD is reduced to 220 microseconds, and thus, due to the precedence effect, the cello moves from 75 degrees to 30 degrees. This may superimpose the cello over the woodwinds, and your conductor will not like it. Additionally, two audible early reflections are added in reproduction that are not part of the recording. Omnis are used here for clarity, but subsequent figures show that no customary mic arrangement is immune to such anomalies.
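The figure-1 mismatch can be sketched numerically. Using a simple far-field sine model and illustrative spacings (a 31 cm omni pair and a 15 cm effective interaural distance, both assumed values), the recorded and monitored ITDs come out close to the 900 and 220 microsecond figures above:

```python
import math

C = 343.0  # speed of sound, m/s

def path_delay_us(spacing_m: float, azimuth_deg: float) -> float:
    """Time-of-arrival difference across two receivers spaced spacing_m apart,
    for a distant source at azimuth_deg (simple sine model)."""
    return spacing_m / C * math.sin(math.radians(azimuth_deg)) * 1e6

MIC_SPACING = 0.31   # m, "slightly more than head spaced" omnis (assumed value)
EAR_SPACING = 0.15   # m, effective interaural distance (assumed value)

recorded_itd = path_delay_us(MIC_SPACING, 75)  # cello far to one side of the stage
heard_itd = path_delay_us(EAR_SPACING, 30)     # precedence pins the image to the speaker
print(f"recorded ITD ~ {recorded_itd:.0f} us, heard ITD ~ {heard_itd:.0f} us")
```

The speaker at 30 degrees can never present more ITD than a real source at 30 degrees would, so the surplus captured by the widely spaced mics is simply lost.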

In figure 2, the cello is at 25 degrees and its recorded ITD of 200 microseconds is preserved in monitoring. The recorded ILD is 0 dB, but the stereo triangle generates an ILD of about 6 dB which has not been recorded, at least for the higher notes of the cello where the head shadow is significant, and similarly for violins and violas in these mid-side positions. There is also a strong early reflection created at the far ear that is delayed by over 200 microseconds and so is not well merged with the direct sound. Such a reflection is probably too frontal to enhance envelopment but may cause image widening.

Figure 1 - stereo or 5.1 crosstalk distorts large interaural time differences (ITD) when monitoring.
Figure 2 - crosstalk introduces a false early reflection and a spurious ILD of 6 dB.

In figure 3, coincident cardioids or Blumlein mics are used to record an oboe at the far edge of the stage. In this case the recorded level difference may be as much as 10 dB; there is, of course, no recorded time difference. However, when one listens to the oboe, flute, piccolo, or trumpet in the 800 Hz range via the usual stereo monitoring system, this large recorded ILD is reduced to about 2 dB and a spurious ITD of 220 microseconds appears. Thus the instrument is heard at 30 degrees rather than 75 degrees, and many instruments may appear to be lumped together.
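A back-of-the-envelope model of this ILD collapse: each ear receives its own speaker directly plus the opposite speaker attenuated only by head shadow. The 2 dB head-shadow figure below is an assumed mid-frequency value, and phase/ITD effects are ignored, so the result is only indicative:

```python
import math

def db_to_amp(db: float) -> float:
    return 10 ** (db / 20)

def amp_to_db(amp: float) -> float:
    return 20 * math.log10(amp)

RECORDED_ILD_DB = 10.0   # oboe panned hard left by coincident mics (from the text)
HEAD_SHADOW_DB = -2.0    # far-ear attenuation around 800 Hz (assumed value)

left_spk, right_spk = 1.0, db_to_amp(-RECORDED_ILD_DB)
shadow = db_to_amp(HEAD_SHADOW_DB)

# Each ear hears its own speaker plus the opposite one via crosstalk.
left_ear = left_spk + right_spk * shadow
right_ear = right_spk + left_spk * shadow

print(f"ILD at the ears ~ {amp_to_db(left_ear / right_ear):.1f} dB")
```

Even with this crude model, a 10 dB recorded ILD shrinks to only a dB or two at the ears, consistent with the image collapsing toward the speaker angle.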

Figure 3 - stereo or 5.1 crosstalk distorts mid-frequency interaural level differences (ILD) when monitoring.
Figure 4 - for central sources at mid-frequencies, monitoring in stereo creates two spurious ITDs that cause combing.

In figure 4 the main mic records no level or time differences for a wideband central instrument. But upon reproduction at the console, there are two ITDs, or two early reflections, depending on how you view them. More damaging is the comb filtering, the peaks and dips that occur if you move your head from side to side. While not usually audible as changes in pitch or overtones, this combing causes level changes that generate ILDs at some frequencies but not others, so that an instrument can appear to be off center for some notes. This combing of central sources also mimics pinna direction-finding patterns, further confusing localization. This combing characteristic is probably the primary reason listeners can detect that something is canned rather than live, even when only a single instrument or voice is recorded outdoors. The rule is that a small single sound source, such as a voice or harmonica, is best reproduced via a single speaker. This is one reason why a mono center speaker for movie dialog is better than stereo.

In figure 5 we assume that a velocity pair recording a piccolo at the far side of the stage outputs an audible signal on only one channel. This would normally produce a large ILD upon reproduction. However, the pinna- and head-shadow-engendered ILD and ITD localize this monophonic signal to the loudspeaker, as in everyday azimuth perception, and the stage is again limited to the angle between the speakers, which may unconsciously disturb the client or conductor.

Figure 5 - stereo triangle limits stage width perception at higher frequencies when monitoring.
Figure 6 - high frequency sources are difficult to localize when monitoring in stereo or 5.1.

In figure 6, a central high-frequency source is recorded and naturally has equal left and right recorded signals. Upon monitoring with speakers at 30 degrees, the pinna direction finders sense the higher overtones off to the side, but the ILD is zero, so the brain localizes the sound to the center. This mechanism, like that for optical illusions, does not satisfy completely, and small head motions can also inspire doubts as to the high fidelity of the system.

It is clear that different types of recording microphones interact with various loudspeakers, which differ in crossover networks, number of drivers, time alignment, and directionality, in largely unpredictable, undetected, or unanticipated ways. So, in general, for a wide range of microphone arrangements, instruments, and stage locations, monitoring in stereo will inevitably introduce faults or biases that may lead to editing decisions of doubtful validity, which other listeners with quite different speakers and ears may later find objectionable.