
Audio Engineering Society

Presented at the 99th Convention
October 6-9, 1995 New York, NY



The Synthesis of Concert-Hall Sound Fields in the Home

by Ralph Glasgal


As discussed above, the ear has two ways to perceive direction, distance, and therefore space. First, each pinna provides an intra-aural mechanism that allows even a one-eared person to sense direction. Therefore, any sound generated in the home listening room must accommodate the pinna and originate from the proper direction and remain uncontaminated by direction-shifting listening-room reflections. Likewise, the early proscenium reflections should come from the front quadrant and most of the reverberant field energy should come from loudspeakers to the side, rear, and even upper rear. (My own version of Ambiophonics does not attempt to generate discrete early ceiling reflections or reverberation because such echoes reach both ears simultaneously and add little to the quality of either real or simulated concert halls.)

Electronic circuits that attempt to manipulate pinna directional effects are generally of little use in this context because individual pinnae vary so much that, no matter what average pinna/head response curve is adopted, most critical listeners will not hear a realistic enough result, and in any case the comb-filter effects remain uncorrected. So in Ambiophonics we don't attempt to fool Mother Nature. If the simulator generates an early reflection for the right front, then it comes from a right front ambience speaker.

The other main characteristic of a concert-goer's ear that concerns us is the fact that there are two of them. This means that we must be sure that not only the intensity but also the diversity and the time of arrival of any sound reaching the ears at home are as close as possible to what arrived at the recording microphones. In particular, the best-sounding concert halls are those that maintain as low an interaural cross-correlation (IACC) as possible, not only for direct sound but also for the ambient field. The listening room must therefore have sufficient sound-absorption treatment to avoid increasing the IACC by inadvertent and uncontrolled diffusion.
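To make the IACC figure of merit concrete, here is a minimal sketch in Python. It assumes the common definition of IACC as the largest absolute value of the normalized cross-correlation between the two ear signals, searched over lags of about ±1 ms. The function name `iacc`, the `max_lag_ms` parameter, and the noise test signals are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Peak of the normalized cross-correlation of two ear signals.

    Assumption: lags are searched within +/- max_lag_ms, roughly the
    largest interaural delay a human head can produce.
    """
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    # Normalize by the geometric mean of the two signal energies.
    energy = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if energy == 0.0:
        return 0.0
    max_lag = int(round(max_lag_ms * 1e-3 * sample_rate))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / energy)
    return best

# Hypothetical test signals: two independent noise bursts.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
other = [rng.gauss(0.0, 1.0) for _ in range(2000)]

print(iacc(noise, noise, 48000))  # identical ear signals: highest possible IACC
print(iacc(noise, other, 48000))  # independent ear signals: low IACC
```

In this toy setup, identical signals at both ears give the highest possible value and independent signals give a value near zero, which is the direction the text describes as desirable for both the direct and the ambient field.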


Since we have only two ears, it seems reasonable that only two signals should need to be recorded. Indeed, it was Blumlein's idea that he could externalize the binaural effect using spaced loudspeakers and some novel microphone arrangements. However, once you give up earphones, the interaural-crosstalk problem destroys the almost perfect but internalized binaural stage image.

Figure 1 shows what a central listener in a concert hall would hear from an off-center instrument on the stage, considering only the direct sound rays. Such a listener receives only one direct sound ray at each ear from any discrete sound source on the stage. If a head-spaced microphone pair is used, the reproducing loudspeakers are in front of the listener, and the crosstalk barrier is in place, then the rays reaching each ear are not much different from the ones experienced in the hall. For a more centrally located instrument, the rays are identical in amplitude, time, and direction. As the sound source moves to the side, the recorded rays are similar to the concert-hall rays except that the angles of incidence do not match well. However, the use of a dummy head, even one with imperfect pinnae, can correct this minor discrepancy. I can report that in practice it is a rare two-channel recording, no matter how miked, that does not produce a vivid externalized binaural stage effect. Since microphones are placed closer to the performers than a listener in the audience would be, the stage perspective can be exaggerated even if rock-solid and realistic. As indicated below, pinna effects partly compensate for the closeness of the microphone by compressing the apparent stage width.

But of course the concert listener hears other reflected sound rays from virtually all other directions, and a standard two-channel recording system cannot discretely store this additional waveform and directional information. However, for a given concert hall, every sound produced on the stage generates a hall reflection response that can be measured and represented mathematically or stored digitally. So in this sense, the recording does contain all the information required to satisfy the ear, if we are clever enough to deliver the direct sound to the ear intact, and then process the direct sound to produce the ambient field and deliver this indirect sound to the ear intact.
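The idea of processing the direct sound with a stored hall response can be sketched as a convolution. The Python below is only an illustration under stated assumptions: the reflection delays and gains are made-up values rather than measurements of any hall, the helper names `sparse_impulse_response` and `convolve` are hypothetical, and a practical system would presumably use a separate response for each ambience loudspeaker so that every reflection arrives from its proper direction.

```python
def sparse_impulse_response(reflections, sample_rate):
    """Build a sample-domain impulse response from (delay_seconds, gain) pairs."""
    length = max(int(round(d * sample_rate)) for d, _ in reflections) + 1
    ir = [0.0] * length
    for delay, gain in reflections:
        ir[int(round(delay * sample_rate))] += gain
    return ir

def convolve(signal, impulse_response):
    """Direct-form convolution: each input sample launches a scaled copy of the response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# Hypothetical feed for one ambience speaker: two early reflections.
SAMPLE_RATE = 1000  # deliberately low so the sample indices are easy to read
reflections = [(0.020, 0.5), (0.035, 0.3)]  # (delay in seconds, gain), illustrative only
hall_ir = sparse_impulse_response(reflections, SAMPLE_RATE)

direct = [1.0] + [0.0] * 9  # a single click standing in for the recorded direct sound
ambient = convolve(direct, hall_ir)

# List the non-zero samples of the synthesized ambience feed.
print([(i, v) for i, v in enumerate(ambient) if v != 0.0])
```

The direct sound is left untouched here; only the derived ambience feed carries the delayed, attenuated copies, mirroring the split between direct and indirect sound described above.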

Figure 1 - The recorded rays reaching the ear are identical to those at the recording microphones in intensity and time difference. The frontal angle of arrival at the home ears, however, can cause pinna distortion for sound sources located at the sides. This can be compensated for somewhat when recording by using dummy microphone pinnae. The speakers may be moved slightly further apart, but care must be taken that the central images are not impaired and that interaural crosstalk does not reappear.


Blumlein, Atal-Schroeder, Damaske, Cohen, Mori, Matsushita Group, Polk, Carver, and -- definitively -- D. B. Keele, Jr., as well as many others, have recognized that when a two-channel recording is played back through two loudspeakers that form an equilateral or similar triangle with the listener, each such speaker communicates with both ears, producing interaural crosstalk. The deleterious effects of this crosstalk have been greatly underappreciated. For openers, crosstalk is what prevents any sound source from appearing to come from beyond the position of the loudspeakers. This result is intuitively obvious, since if we postulate an extreme-right sound source, we can safely ignore the contribution of the left loudspeaker. We now hear the right speaker as usual with both ears, and no matter how we turn our heads the sound will always come from the right speaker, as would be true for any normal, discrete sound source encountered in life. However, if we could keep the right-speaker sound from getting to the left ear, then the ear-brain would think that the sound must be very far to the right, well beyond the loudspeaker, since so much less of this sound is reaching the left ear. Unfortunately, there is a limit to how far the image can shift, because if the right speaker is, say, at 30° off the center line, the 30° pinna response does not agree with the 90° interaural intensity difference between the near and far ears, and in practice the pinna wins. Even with the interaural crosstalk eliminated, a full 180° image cannot be easily achieved (Fig. 4). However, Ambiophonics routinely projects a 120° stage, and in practice very few seats in concert halls or opera houses command that wide a view. Also, as indicated above, this narrowing effect compensates for the fact that most recordings are made with microphones very close to the stage.

Fig. 2 - Comparison of live concert hall listening geometry with home stereophonic listening practice showing the additional crosstalk sound rays that cause poor stereophonic imaging effects and impinge on the pinna from too large an angle.

A second, perhaps even more deleterious effect is caused by this stereo crosstalk. As D. B. Keele, Jr. has exhaustively documented, for centrally located sound sources two equal acoustic signals reach each ear, but in the normal stereo listening setup one of these signals travels about half a head-width farther, and so arrives about 300 microseconds later, than the sound from the nearer speaker. This produces peaks and nulls in the frequency response at each ear above 2000 Hz, known as comb filtering. Since the nulls are narrow, and are muddied by even later crosstalk coming around the back or over the top of the head, and since the other ear is also getting a similar but not precisely identical set of peaks and nulls, the ear seldom perceives this comb filtering as a change in timbre. But it can and does perceive these gratuitous dips and peaks as a confusion in angular position. Remember, in real halls, the ear can hear a 1° shift in angular position, but not if strong comb-filter effects occur in the same 2-to-10 kHz region where the ear is most sensitive to its own pinna comb-filter effects and interaural intensity differences, as discussed above. As long as this wrongful stereo interaural crosstalk is allowed to persist, the sound stage can never be as accurate or as tactile as it should be. As the recorded image moves to the side, the severity of the crosstalk declines, and many observers have noticed that side sound images are more realistic than center phantom images. Perhaps the most basic tenet of Ambiophonics is that stereophonically induced interaural crosstalk must go.
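The comb filtering described above can be illustrated with a deliberately idealized model: the signal at one ear is the direct sound plus a single crosstalk copy delayed by 300 microseconds. The Python sketch below rests on assumptions that go beyond the text: the crosstalk gain is a free parameter (1.0 means no head shadowing at all), later arrivals around the head are ignored, and the function names are hypothetical.

```python
import math

def comb_magnitude_db(freq_hz, delay_s, crosstalk_gain=1.0):
    """Magnitude in dB of 1 + g*exp(-j*2*pi*f*tau): direct sound plus one delayed copy."""
    phase = 2.0 * math.pi * freq_hz * delay_s
    re = 1.0 + crosstalk_gain * math.cos(phase)
    im = -crosstalk_gain * math.sin(phase)
    magnitude = math.hypot(re, im)
    return 20.0 * math.log10(max(magnitude, 1e-12))  # floor avoids log10(0) at a perfect null

def null_frequencies(delay_s, f_max_hz):
    """Nulls fall at odd multiples of 1/(2*tau), up to f_max_hz."""
    nulls = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > f_max_hz:
            return nulls
        nulls.append(f)
        k += 1

DELAY_S = 300e-6  # the extra crosstalk delay quoted in the text

for f in null_frequencies(DELAY_S, 10_000.0):
    # Assumed gain of 0.5 stands in for some head shadowing of the crosstalk path.
    print(round(f), "Hz:", round(comb_magnitude_db(f, DELAY_S, crosstalk_gain=0.5), 1), "dB")
```

In this idealized model the nulls are spaced 1/τ apart, about 3.3 kHz for a 300-microsecond delay, so several of them fall inside the 2-to-10 kHz region the text identifies as critical for localization. Lowering the assumed crosstalk gain makes the nulls shallower, which fits the observation that crosstalk becomes less severe as the image moves to the side.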
