The Science of Domestic Concert Hall Design
by Ralph Glasgal
Ambiophonics, 2nd Edition
Replacing Stereophonics to Achieve Concert-Hall Realism
By Ralph Glasgal, Founder Ambiophonics Institute, Rockleigh, New Jersey www.ambiophonics.org
Ambiopoles and Ambiophones
Ambiophonics combines several technologies to produce realistic sound fields, and it does so optimally from two-channel recording media where most classical, jazz, and pop music is concerned. The technologies are convolution for hall ambience, speaker correction, elimination of front loudspeaker crosstalk and pinna angle error, and an optional but superior recording microphone design and placement. The basic goal of Ambiophonics is to recreate at the home listening position an exact replica of the original concert hall sound field. Ambiophonics does this by transporting the sound sources, the stage, and the hall ambience to the listening room. In other words, Ambiophonics delivers an externalized binaural effect using, as in the binaural case, just two recorded channels, but with two front-stage-reproducing loudspeakers and eight or so ambience loudspeakers in place of earphones. Ambiophonics generates stage image widths of up to 180 degrees with an accuracy and realism that far exceed those of any other two-channel or multi-channel recording/reproducing scheme. We will now discuss how to reproduce the front stage of a two-channel recording without exposing our ears to comb filtering, phantom imaging, or major errors in the angle of sound incidence on the pinna, and how best to make recordings that take advantage of Ambiophonic binaural reproduction technology.
Making Good on the Promise of Binaural Technology
Since we have only two ears, it seems reasonable that only two signals should need to be recorded. Indeed, it was Blumlein's original idea that he could externalize the earphone binaural effect using spaced loudspeakers and some novel microphone arrangements. But once you give up earphones for stereo loudspeakers, the interaural crosstalk and the arbitrary speaker angle destroy the almost perfect, but internalized (within the skull), binaural frontal stage image. And with all the stereo hall ambience now coming entirely from the front, that ambience sounds unnatural. Binaural theory says that if you sit in the concert hall with small microphones in your ear canals, record the concert, and then later play it back with in-the-ear-canal earphones, you will experience an almost perfect "you are there" recreation. The only flaw in this method would be that if you moved your head, while listening or recording, the reproduced stage would rotate unrealistically.
But let us consider, briefly, why this recording method can otherwise produce an awesome reality. First of all, the sound from the stage and the hall during such a personal binaural recording reaches your ear canals (and the embedded microphones) after being filtered by your pinnae and your head shape. Since the playback earphones we are using are an in-the-ear-canal type, the sound passes through the pinna or around the head only once. Also, the pinnae used to make the recording are your own, not those on some dummy head carved in wood or plastic. The two channels are kept separate throughout, and the left ear playback earphone signal never leaks into the right ear or vice versa. Thus we can state one of the basic rules of realistic binaural recording technology: in any binaural recording or reproduction chain there should be one and only one pinna function, and it must be your own. There must also be one and only one head shadowing entity, but in this case whose head it is does not matter much. That the head shadowing function is not as individual as the pinna function can be understood when one realizes that sound passes around the head over the top, under the chin, and around the back, and that all of this varies as the head is tilted or rotated. Thus the brain is not overly sensitive to the exact shape of a particular head or the exact frequency response of the head shadowing function, within reason.
So let us see how we can make use of this knowledge. Let us assume that we have a two-channel recording made using a dummy head that has no pinnae. This dual microphone is sitting fifth row center. Its signals are recorded and played back over two loudspeakers directly in front of the home listener. Let us assume for the moment that these loudspeakers are like laser beams, so that the sound of each is aimed precisely at the proper ear. In this case the listener hears what the corresponding microphone hears, and the sound impacts his own pinna with very little incident angle error for central stage sources. For stage sources that are more to the side, the listener hears the head-related transfer function of the microphone head, and for most humans this is quite realistic. But now the home listener can rotate his head and the image is stable, just as if he were in the concert hall. So this technique is not only equal to but superior to the earphone method considered above. There is a pinna angle error for stage sources toward the extreme left and right. Fortunately, these are the angles where direct sound has a more or less clear shot at the ear canal without extreme pinna filtering, and where nature has compensated for the decrease in pinna sensitivity by making the interaural head shadowing most pronounced, which provides strong and natural horizontal plane localization. In practice, both IMAX and Ambiophonics easily demonstrate that this binaural technology is exceptionally realistic and does produce wide front stages that even allow the cocktail party effect to be in evidence.
Now the question is how to make a pair of center front speakers behave like sound lasers. There are two possibilities. One is to put a physical wall or panel in front of the listener. This wall extends to within a foot or so of the listener's head and keeps the left speaker from radiating to the right ear and vice versa. This technique works perfectly, and if you are an audiophile and want absolute fidelity without cables or extra processing, it is a very inexpensive way to go. You can try it first with a mattress on end if you want to experiment and have some fun. While I appreciate that the use of a barrier will never find universal acceptance, an understanding of how it works is necessary to an appreciation of what a software version of such a crosstalk avoidance system should accomplish. You can make a barrier out of sound absorbing panels, with a cutout at the listening end so that it is possible to sit there comfortably. The thickness of the barrier is not critical, but it should be about six to eight inches so that when a listener is seated, the right eye cannot see the left speaker and vice versa. The wall extending back toward the space between the speakers is, preferably, made with sound absorbing material. This panel can be thought of as a collimator for most sound except the low bass. It eliminates all stray rays from the right that might be heading left and those from the left that might be heading right. A panel such as this is also very effective in damping higher frequency room reflections, since it absorbs rays coming from both sides of the room.
The use of an outdoor reflective barrier to eliminate stereophonic crosstalk was described in 1986 by Timothy Bock and Don Keele Jr. at the 81st Audio Engineering Society Convention. While Ambiophonics uses an absorbent barrier, their results are still largely pertinent.
They determined that a listener could be further back from the end of the barrier if the barrier was wider, the speakers closer together, and the listener further from the speakers. Stated as an equation:

$$L = \frac{X(H + T)}{D}$$
Where, in inches, L is the maximum distance a listener's head can be from the barrier, X is the distance from the listening end of the barrier to the position of the speakers, D is the distance between the centers of the speakers, H is the distance between the ears, and T is the thickness of the barrier. For a worst case scenario of a six-inch head, a six-inch thick barrier, an eight-foot distance to the speakers, and a speaker separation of three feet (too much), a listener could be as much as 32 inches, almost three feet, from the end of the barrier. Thus the use of a barrier does not in any way make listening uncomfortable or claustrophobic.
Our own Ambiophonic barrier geometry allows one to be four feet from the end of the barrier, but at the far end of this range one's head must be more precisely centered. With a four-foot space, two in-line listeners can enjoy the enhanced angular image separation at the same time, and indeed the front listener acts as a continuation of the barrier for the second listener. If in doubt about the spacing, the eyeball method is very conservative. As long as no part of the opposite loudspeaker is visible from one eye, excellent separation is guaranteed. Sitting too close to the barrier is not only unpleasant but results in a loss of high-frequency response if the barrier is as wide as the head and absorptive.
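The barrier relation quoted above can be checked numerically. The sketch below reads it as L = X(H + T)/D, a reading inferred from the fact that it reproduces the 32-inch worked figure in the text. The function and parameter names are hypothetical, chosen only for this illustration.

```python
def max_listener_distance(x_in, head_in, barrier_in, speaker_sep_in):
    """Maximum head-to-barrier distance L = X(H + T) / D, all lengths in inches."""
    if speaker_sep_in <= 0:
        raise ValueError("speaker separation must be positive")
    return x_in * (head_in + barrier_in) / speaker_sep_in


# Worked example from the text: six-inch head, six-inch barrier,
# eight feet to the speakers, speakers three feet apart.
worst_case = max_listener_distance(x_in=8 * 12, head_in=6, barrier_in=6, speaker_sep_in=3 * 12)
print(f"speakers 36 in apart: {worst_case:.0f} in")

# Same geometry with the speakers moved closer together (an assumed 24 inches).
closer_pair = max_listener_distance(x_in=8 * 12, head_in=6, barrier_in=6, speaker_sep_in=24)
print(f"speakers 24 in apart: {closer_pair:.0f} in")
```

Moving the speakers closer together lengthens the permissible distance, which is the trend Bock and Keele reported.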
However, the mainstream Ambiophonic way is to use software and a computer or digital signal processing component to eliminate the crosstalk. I call a pair of speakers driven by the public domain software that we have developed to do this an Ambiodipole. First, although most speakers can be used to form an Ambiodipole, it is best if the speakers chosen are very directional and well matched. A slightly concave electrostatic panel (called an Ambiostat) can actually focus sound well enough that it almost behaves like the laser we have hypothesized. Obviously, if the speakers are focused and time aligned, the software can do its job much better. What the recursive crosstalk cancelling software does is generate slightly delayed, reversed polarity signals for the speakers to cancel the crosstalk acoustically before it reaches the ear canal. The cancellation is an infinite series process, since the cancellation signal itself produces crosstalk, which must then be cancelled, and so on. If the Ambiopoles were widely spaced, the crosstalk would have to go around the head and the correction signals would be very difficult to calculate, since they would be affected by head position and pinna shape. Thus the front speaker pair should be closer together, with about 20 degrees between them, so that both of the main front speakers emit directly to their onside ears.
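The infinite series mentioned above can be written out under a deliberately idealized model. This is a sketch, not the published RACE equations: it assumes the crosstalk at each ear is simply the opposite speaker's feed attenuated by a factor g (with 0 < g < 1) and delayed by m samples, written z^{-m}. The symbols x_L, x_R (recorded channels), y_L, y_R (speaker feeds), g, and m are all introduced here for illustration.

```latex
% Each feed subtracts a delayed, attenuated copy of the opposite feed:
\begin{align}
  y_L &= x_L - g\,z^{-m}\,y_R, &
  y_R &= x_R - g\,z^{-m}\,y_L .
\end{align}
% Substituting the second equation into the first and solving for y_L:
\begin{align}
  y_L\,\bigl(1 - g^{2} z^{-2m}\bigr) &= x_L - g\,z^{-m}\,x_R, \\
  y_L &= \sum_{k=0}^{\infty} g^{2k}\,z^{-2km}\,\bigl(x_L - g\,z^{-m}\,x_R\bigr).
\end{align}
```

Each term of the sum is the correction for the crosstalk created by the previous correction. Because g is less than one, the terms shrink geometrically and the series converges.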
Just as it is obvious that a barrier will work better with close together speakers, since speaker proximity makes it easier for the barrier to shadow the appropriate ear, so crosstalk software works better if the speakers are closer together. Ambiodipoles do have a sweet spot limitation, although in my experience the sweet spot is larger than that of most well focused stereo or 5.1 systems. In theory, if the Ambiopoles used to form an Ambiodipole are constructed as panels with a special curved shape, then it is possible to enlarge the sweet spot enough to accommodate two or even three listeners. But such a speaker has yet to be constructed.
In stereo, if you move back along the median line between the speakers, the stage narrows and becomes mono. If you move forward, you get a hole in the middle and just hear two speakers, one on either side. If you move offside, you normally localize to one speaker and so hear mostly just one channel. Similar effects plague 5.1, which is why a center speaker is used to keep the dialog clearly audible. In Ambiophonics, if one moves very close to the speakers, one hears normal stereo instead of Ambio. If one moves back until one hits the rear wall, nothing much happens. One can recline, stand, nod, rotate the head, etc. without ill effect. If one moves sideways, one still hears both channels clearly, so a center speaker is never required. Basically, Ambio has a larger listening area than stereo, but when one is not centered one feels deprived in a way not apparent in stereo. In PanAmbio versus 5.1 there is a similar advantage for PanAmbio, in that off center viewers cannot localize to a surround speaker. Also, the front and rear stages or sound effects are clearly audible in PanAmbio no matter where you sit.
Over the years many versions of crosstalk cancelling software and hardware have been promulgated. Among the best known are hardware devices from Lexicon and Carver (Sonic Holography) and software programs from The University of Southampton (the Stereodipole) and The University of Parma. In general, all these early attempts had serious flaws that made them unrealistic, phasey, or unstable. Among the flaws was trying to do crosstalk cancellation using speakers still arranged in the stereo triangle. This is doomed to failure because the amount of crosstalk then depends on what happens as the sound from each speaker crosses the head, generating the crosstalk that one must cancel. With a wide speaker angle, the attitude and shape of the head will change the crosstalk, making it difficult to know what the crosstalk actually is. With the speakers close together, there is virtually no significant change in the crosstalk as long as the head is between the speakers. Some such systems used an average HRTF (head-related transfer function) to compensate for the head shadow, but this almost never works since nobody is average. Other pioneers put the speakers close together but still used HRTFs, mostly to get the proximity effect of a bee buzzing close to the ear. But again, in general, the use of HRTFs is counterproductive and not necessary for normal music or movie sound reproduction.
But the most basic flaw in the early crosstalk cancellation methods was that they were not recursive. That is, when you cancel crosstalk you must also cancel the crosstalk due to the signal that cancelled the original crosstalk, and this process must be continued to inaudibility. As far as is known to this author, the Ambiophonic program known as RACE (Recursive Ambiophonic Crosstalk Eliminator) was the first fully recursive XTC program to run both in PCs and hi-fi components.
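The difference between a single correction pass and a recursive one can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the RACE code itself: the acoustic crosstalk is modeled as a pure attenuation (LEAK) and a whole-sample delay (DELAY), the canceller is assumed to know both exactly, and all names and values are hypothetical.

```python
def cancel(left_in, right_in, gain, delay, recursive):
    """Subtract a delayed, attenuated copy of the opposite channel.

    recursive=False taps the opposite *input* (one correction pass only);
    recursive=True taps the opposite *output*, so corrections are themselves corrected.
    """
    n = len(left_in)
    left_out, right_out = [0.0] * n, [0.0] * n
    src_left, src_right = (left_out, right_out) if recursive else (left_in, right_in)
    for i in range(n):
        j = i - delay
        from_right = src_right[j] if j >= 0 else 0.0
        from_left = src_left[j] if j >= 0 else 0.0
        left_out[i] = left_in[i] - gain * from_right
        right_out[i] = right_in[i] - gain * from_left
    return left_out, right_out


def at_ears(left_out, right_out, leak, delay):
    """Idealized acoustics: each ear hears its own speaker plus a delayed leak of the other."""
    n = len(left_out)
    ear_left = [left_out[i] + (leak * right_out[i - delay] if i >= delay else 0.0)
                for i in range(n)]
    ear_right = [right_out[i] + (leak * left_out[i - delay] if i >= delay else 0.0)
                 for i in range(n)]
    return ear_left, ear_right


LEAK, DELAY, N = 0.5, 3, 16  # illustrative values only; delay is in samples
impulse_left = [1.0] + [0.0] * (N - 1)  # a click recorded in the left channel only
silence = [0.0] * N

one_pass_ears = at_ears(*cancel(impulse_left, silence, LEAK, DELAY, recursive=False), LEAK, DELAY)
recursive_ears = at_ears(*cancel(impulse_left, silence, LEAK, DELAY, recursive=True), LEAK, DELAY)

print("one pass, left ear: ", one_pass_ears[0])
print("recursive, left ear:", recursive_ears[0])
```

In this toy model the single pass silences the right ear but leaves a second-order echo at the left ear, which is the crosstalk of the correction signal. The recursive version delivers the click to the left ear alone. Real rooms, heads, and speakers are far less tidy, so this shows only the principle.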
The Ambiophonic Institute's published RACE equations were used by Robin Miller, of Filmaker, and Angelo Farina, of The University of Parma, to develop a version of RACE that could be used in programs like AudioMulch or in VST plugins to drive speaker pairs called Ambiodipoles. This new software is designed so that almost all two-channel recordings can benefit from being reproduced Ambiophonically. RACE includes adjustments so that one can select the setting that makes a particular recording sound most realistic. It is hoped that those reading this book will be able to purchase either Ambiophonic system components or PCs that can run this software as well as the software for hall convolution and speaker correction.
Since Ambiophonics is a binaural based system, it does not provide the Blumlein loudspeaker crosstalk signal that amplifies low frequency ILD cues for those recordings made with a coincident microphone arrangement such as the Soundfield or crossed figure-eight mics in the M/S (mid-side Blumlein configuration). (See Appendix A for a detailed analysis of the Blumlein patent and technology.) However, it should be understood that at very low bass frequencies RACE loses its effectiveness, allowing increasing crosstalk as the frequency declines and therefore amplifying LF phase cues for coincident microphone recordings. This is basically a non-issue. Remember that the ear's ability to localize bass frequencies at 80 Hz and below is virtually non-existent. The pinna certainly has no capability in this frequency range, and the head is too small to attenuate signals with wavelengths measured in tens of feet. Thus the only localization method available to the brain at very low frequencies is the few degrees of phase shift between the ears. There is no evidence that the brain can detect such small phase shifts, and thus worrying about crosstalk elimination at very low frequencies to improve front stage imaging is not productive. Also, at very low frequencies the power required to produce crosstalk cancellation becomes excessive, and since it is not necessary, RACE automatically avoids low bass crosstalk cancellation.
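The scale of these low frequency quantities can be estimated with a few lines of arithmetic. This is a rough sketch under assumptions of my own choosing: a speed of sound of 343 m/s, ears six inches apart (the head size used in the barrier example), and a straight-line path difference for a source directly to one side, which ignores diffraction around the head.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed value for air at room temperature
EAR_SPACING_M = 6 * 0.0254   # assumed six-inch ear spacing, converted to meters
FEET_PER_METER = 1 / 0.3048


def wavelength_m(freq_hz):
    """Wavelength in meters of a tone at freq_hz."""
    return SPEED_OF_SOUND_M_S / freq_hz


def interaural_phase_deg(freq_hz, ear_spacing_m=EAR_SPACING_M):
    """Phase difference between the ears for a source directly to one side,
    using the straight-line ear spacing as the path difference."""
    return 360.0 * ear_spacing_m / wavelength_m(freq_hz)


for freq in (80.0, 40.0, 20.0):
    feet = wavelength_m(freq) * FEET_PER_METER
    print(f"{freq:5.0f} Hz: wavelength {feet:5.1f} ft, "
          f"interaural phase about {interaural_phase_deg(freq):4.1f} degrees")
```

Under these assumptions the wavelength at 80 Hz is many times the width of the head, and the phase difference shrinks in proportion as the frequency falls.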
Once we know that playback will be Ambiophonic, the question arises as to whether there is an ideal recording method that can take advantage of the fact that surround ambience will be derived via convolution and that the Ambiodipole will eliminate crosstalk and avoid phantom imaging. But I still want to emphasize that although Ambiophone microphone arrangements can make the Ambiophonic approach to realism even more effective, Ambiophonics works quite well with most of the microphone setups used in classical music or audiophile caliber jazz recordings. One can heighten the accuracy, if not gild the lily of realism, of an Ambiophonic reproduction system by taking advantage, in the microphone arrangement, of the knowledge that in playback the rear/side half of the hall ambience is convolved and that a stage can go out to 180 degrees.
Earlier we considered the binaural model where microphones are inserted in the ear canals of an ideally situated listener. But now the situation is different. We are going to reproduce the hall ambience by convolution, so we do not want our binaural listener to pick up any hall ambience from the rear, the extreme sides, or the ceiling. So let us put sound absorbing material just behind his head and above him as well, so that he has a sonic view of only the stage in front of him. Now we know that upon reproduction the Ambiodipole speaker sound will pass by the home listener's pinna on the way to the eardrum. Thus we do not want any pinnae at the recording site. The human listener is therefore excused from the recording site, and we are left with a pair of baffled, head-spaced omni or cardioid microphones sitting at the best seat in the house. But the rule stated earlier said there must be one and only one head shadow in the recording/reproduction chain, and so, since the home listener is directly in front of the Ambiodipole, it is up to the Ambiophone to provide a head shadow. So let us put a head shaped oval between the two microphones at this best seat in the house. Our Ambiophone thus boils down to an oval shaped two capsule assembly, baffled to the rear and above, comfortably ensconced at the best seat in the house or studio.
Nothing New Under the Sun
After completing the above derivation of the ideal Ambiophone, I began to search for recordings that played back realistically Ambiophonically to see if they had anything consistent or unusual about them. Not being a recording engineer or a microphone aficionado, it took me a while to notice that many of the best CDs in my collection were made with something called a Schoeps KFM-6. A picture of this microphone in a PGM Recordings promotional flyer showed a head sized but spherical ball with two omnidirectional microphones, one recessed on each side of the ball where ear canals would be if we had an exactly round head. The PGM flyer also included a reference to a paper by Guenther Theile describing the microphone, entitled On the Naturalness of Two-Channel Stereo Sound, J. Audio Eng. Soc., Vol. 39, No. 10, October 1991. Although Theile would probably object to my characterization of his microphone, his design is essentially a simplified dummy head without external ears. He states, "It is found that simulation of depth and space are lacking when coincident microphone and panpot techniques are applied. To obtain optimum simulation of spatial perspective it is important for two loudspeaker signals to have interaural correlation that is as natural as possible ... Music recordings confirm that the sphere microphone combines favorable imaging characteristics with regard to spatial perspective accuracy of localization and sound color ..." Later he states, "The coincident microphone signal, which does not provide any head-specific interaural signal differences, fails not only in generating a head-referred presentation of the authentic spatial impression and depth, but also in generating a loudspeaker-referred simulation of the spatial impression and depth ... it is important that, as far as possible, the two loudspeaker signals contain natural interaural attributes rather than the resultant listener's ear signals in the playback room."
What Theile did not appreciate is that, for signals coming from the side, the sphere acts as a sort of filter for the shorter wavelengths, just as the head does. When this side sound comes from side stereo speakers, the listener's head again acts as a filter, resulting in HRTF squared. The solution, of course, is to use the software Ambiodipole and listen to the Theile sphere without the second head response function. Theile also "generates artificial reflections and reverberation from spot-microphone signals." He uses the word artificial in the sense that the spot microphone signals will be coming from the front stereo loudspeakers instead of from the rear, the sides, or overhead. While Theile's results rest as much on empirical subjective opinion as they do on psychoacoustic precepts, they certainly are consistent with the premises of Ambiophonics in both recording and reproduction. Making new recordings using the Schoeps KFM-6 version of the Theile sphere and evaluating existing recordings made with this microphone show that the theory is correct, since such recordings yield exceptionally realistic front stages with normal concert-hall perspectives and proscenium ambience.
Realistic Reproduction of Depth and Perspective
It is axiomatic that a realistic music reproduction system should render depth as accurately as possible. Fortunately, front stage distance cues are easier to record and/or recreate realistically than most other parameters of the concert-hall sound field. Assuming that the recording microphones are placed at a reasonable distance from the front of the stage, the high frequency roll-off due to distance and the general attenuation of sound with distance remain viable distance cues in the recording. Depth of discrete stage sound sources is, however, more strongly evidenced in concert halls by the amplitude and delay of the early reflections, and the ear finds it easier to sense this depth if there is a diversity of such reflections. The Ambiophonic crosstalk cancellation feature also enhances depth perception, since depth perception of nearby sources is enhanced when the range of ILD and ITD is greater.
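The "general attenuation of sound with distance" can be given a rough number. The sketch below assumes the textbook inverse-square law for the direct sound of a small source in a free field, which is an idealization I am supplying here; in a real hall, reflections and air absorption modify the figure. The function name and the example distances are hypothetical.

```python
import math


def direct_level_change_db(near_distance, far_distance):
    """Change in direct-sound level, in dB, when moving from near_distance
    to far_distance from a small source (inverse-square law, free field).
    Both distances must use the same unit."""
    if near_distance <= 0 or far_distance <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(far_distance / near_distance)


# A player at the front of the stage versus one twice as far from the microphones.
front_to_back = direct_level_change_db(near_distance=20.0, far_distance=40.0)
print(f"level change for doubled distance: {front_to_back:.1f} dB")
```

Under this assumption each doubling of distance lowers the direct sound by about 6 dB, which is one of the cues that survives in the recording.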
In Ambiophonics, convolved early reflections from the surround speakers make the stage as a whole seem more interesting, but it is only the recorded early reflections coming from the front speakers that allow depth differentiation between individual instruments. This is why anechoic recordings sound so flat when played back stereophonically or even Ambiophonically, despite the presence of an added ambient field. In ordinary stereo, depth perception will suffer if early side and rear hall reflections wrap around to the front speakers or, in the anechoic case, are completely missing. Since it is easy to make Ambiophonic recordings that include just proscenium ambience, why not do so, save on convolver processing power, and preserve the depth perception cues undistorted?
There remains the issue of perspective, however. When making a live performance recording of an opera or a symphony orchestra, the recording microphones are likely to be far enough away from the sound sources to produce an image at home that is not so close as to be claustrophobic. There are many recordings, however, that produce a sense of being at or just behind the conductor's podium. This effect does not necessarily impact realism, but you must like to sit in the front row to be comfortable with this perspective. Turning down the volume and adding ambience can compensate for this, but with a loss in realism. This problem becomes more serious in the case of solo piano recordings or small jazz combos. For example, if a microphone pair is placed three feet from an eight-foot piano, then that piano is going to be an overwhelming close-up presence in the listening room, and a "They Are Here" instead of a "You Are There" effect is unavoidable. This will be very realistic, especially with the Ambiodipole, but adding real hall ambience doesn't help much since the direct sound is so overwhelming. The major problem with this type of recording is that you have to like having these people so close in a small home listening room. You may notice that demonstrators of high resolution playback systems, in show rooms or at shows, overwhelmingly use close mic'ed recordings of small ensembles, solo guitar, single vocalists, etc. to demonstrate the lifelike qualities of their products, and that these demonstrations are mostly of the "They Are Here" variety.
These depth and perspective problems are easily solved by simply placing an Ambiophone at a seat that has a reasonable view of the performers.