
The Ambiophone
Derivation of a Recording Methodology Optimized For Ambiophonic Reproduction -- 2

By Ralph Glasgal

Presented at the AES 19th International Conference on Surround Sound Techniques, Technology, and Perception
June 21-24, 2001, Schloss Elmau, Germany


Music Versus Movie Recording

Movies change the scene and the acoustic venue every ten seconds; planes fly overhead and the phone rings in the back room. During a concert, by contrast, musicians normally do not move about the stage, and the acoustics of the auditorium remain stable for the duration of the performance. Furthermore, there are no direct sound sources off stage, and the stage width seldom subtends more than, say, 150 degrees as seen from a good seat in the orchestra.

Thus, it makes little sense from an information theory standpoint to make provision for capturing off-stage direct sound or to record the hall ambient response over and over for every note that is played. If you measure the response of the hall at the best seat in the house for several source locations on the stage, then it is not necessary to measure and record any hall sound during the performance. One measures the impulse response of the hall before or after the recording session, with or without an audience, or uses a library of the impulse responses of the great halls of the world. Then one uses a mathematical process called convolution to operate on the direct sound and generate the surround ambience that the hall would have produced for the music being played.
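As a minimal sketch of the convolution step, the following Python uses a plain direct-form sum. The names `convolve`, `dry`, and `hall_ir` and the toy sample values are illustrative assumptions, not part of any Ambiophonics software; a practical convolver would use long measured impulse responses and a fast (FFT-based) method.

```python
def convolve(dry, impulse_response):
    """Direct-form discrete convolution: y[n] = sum over k of dry[k] * h[n - k]."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        if x == 0.0:
            continue  # silent input samples contribute nothing
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


# Hypothetical toy data: two direct-sound samples and a hall response
# whose first reflection arrives two samples after the direct sound.
dry = [1.0, 0.5, 0.0, 0.0]
hall_ir = [0.0, 0.0, 0.6, 0.3]

# Each input sample launches a scaled copy of the hall response.
ambience = convolve(dry, hall_ir)
```

Because the hall response is stored once and applied to whatever is played, nothing about the hall has to be recorded during the performance itself.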

The Ambiovolver parcels out the early reflections and the reverberant tails to the appropriate speakers in the domestic concert hall (Fig. 2). The process is scalable, and the number, locations, and frequency responses of the surround speakers are not critical. Convolution, in contrast to microphones, neatly ensures that no direct sound can get into the surround speakers. Recording engineers do not have to worry about the ratio of direct to reverberant sound in the hall or at the main microphones, and the main microphones, as we shall see, can be placed without regard to the hall critical radius or the directionality of the bare microphone. Convolution of a stored impulse response makes it unnecessary to use microphones to record hall sound and also eliminates the need for surround tracks on the media (SACD, DVD).
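One way to picture how a stored impulse response can be kept free of direct sound and divided into early reflections and a reverberant tail is sketched below. This is an assumption-laden illustration, not the Ambiovolver's actual method: the function name, the 2 ms guard interval, the 80 ms early/late boundary, and the rule that the largest peak is the direct arrival are all hypothetical choices.

```python
def split_impulse_response(ir, sample_rate, guard_ms=2.0, early_ms=80.0):
    """Return (early, tail) parts of an impulse response with the direct sound removed.

    Assumes the direct arrival is the largest-magnitude sample. Everything up to
    and including a short guard interval after it is discarded, so neither part
    can carry direct sound.
    """
    peak = max(range(len(ir)), key=lambda n: abs(ir[n]))
    start = peak + int(sample_rate * guard_ms / 1000.0) + 1
    split = peak + int(sample_rate * early_ms / 1000.0)
    early = [ir[n] if start <= n < split else 0.0 for n in range(len(ir))]
    tail = [ir[n] if n >= split else 0.0 for n in range(len(ir))]
    return early, tail


# Hypothetical toy response sampled at 1 kHz: direct sound at 10 ms,
# one early reflection at 30 ms, one late return at 120 ms.
ir = [0.0] * 200
ir[10], ir[30], ir[120] = 1.0, 0.5, 0.2

early, tail = split_impulse_response(ir, sample_rate=1000)
```

The two parts could then be convolved with the direct sound separately and sent to whichever surround speakers suit them.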

In the case of Ambiophonics, as opposed to 5.1, it is normally up to listeners to decide how great a hall they need to recreate to be satisfied. They may select the hall and the number of ambience speakers. Note that even a poor hall can seem as real as a good hall. However, there is nothing to prevent recording engineers from providing the impulse response of the hall they recommend or stating the address of the hall in the eventual Internet library they wish the listener to use. It is also possible, perhaps inevitable, that the media player will control the convolver in this regard. The process would then be transparent to the unskilled user.

Another major advantage of hall ambience convolution over trying to record and deliver multichannel surround ambience is that both the locations and the number of the surround speakers in the home system are flexible, scalable, and not critical. It is ludicrous to think that two surround channels, as in 5.1, can emulate a concert hall or provide even marginally acceptable envelopment. Damaske and Ando [6] say five is the bare minimum, but bare minimums are not the audiophile way. Research [7] has shown that concert hall listeners are pleased when ambience comes from lateral, rear, overhead, and frontal directions, in that order of importance, and that such ambience should be as uncorrelated or diverse as possible. As a start in this direction, DVD-A can support four ambience channels if 5.1 is not used, and, indeed, Chesky Records and MDG have already released such discs.

I hope shortly to be able to report on experiments with dual membrane electrostatic panels. Such a dual panel can emit both ambience stimulated from the left of the stage and ambience stimulated from the right side of the stage convolved for the same angle saving speaker space and cost. Such membranes are transparent to sound and so do not cause serious, erroneous early reflections. Alternatively, the signals for the same surround angles can be mixed electronically and applied to a single speaker to save on space, speakers and amplifiers.

But there is no known real-time microphone placement theory for four surround speakers that can deliver the kind of realism that audiophiles rightly demand, although Ambisonics comes close. However, Ambisonics requires more than two media channels, does nothing for the existing library of two-channel LPs and CDs, and is more sensitive to speaker location and response.

There is the minor question of reproducing surround applause when concerts in front of audiences are recorded. The Ambiophonic method will cause any rearward direct sound picked up by a microphone to come from the front and, after convolution, to also come correctly from the surround speakers. If this is considered to be a serious defect, then it is possible for record producers to code discs in the future to mute the front channels during applause intervals or whenever only a convolved rearward direct sound effect is desired. Eventually, disc codes could also direct the Ambiovolver to steer rear sound effects to specific surround speakers, without hall reverberation and without using any additional media channels.

The Psychoacoustic Basics

It is not necessary to understand precisely how the ear/brain system works or how concert halls work if we simply deliver to the home listener a reasonable replica of what that listener's eardrums would have been exposed to at the live concert recording session (Fig. 2). This, of course, is the basic premise of binaural technology [4] and defines Ambiophonics as a "you-are-there" methodology.

There are no shortcuts. If realism (or biological naturalness is perhaps a better term in this context) is a priority, then you had better ensure that in any recording/reproduction chain there will be at least one and only one set of pinnae (your own) [1] and one and only one head shadowing function (which need not be your own).


It is no secret that the traditional stereo triangle encompasses the psychoacoustic defects already mentioned above. Again, these include a reliance on the shaky and non-linear phantom image illusion, a limitation in stage width to the angle between the speakers, confusion of the pinna direction-finding mechanism due to comb filtering [14], erroneous head shadowing for central sources, and localization contradictions due to discrepancies between single-pinna localization and interaural localization cues. Details on these topics and references are available in the lengthy but free book, downloadable from [1].

To avoid these psychoacoustic pitfalls, Ambiophonics uses two speakers directly in front of the listener to reproduce all front stage sound, including any frontal early reflections and proscenium reverberant tails. The combination of these two speakers and the software that drives them I call an Ambiopole (Fig. 1). An Ambiopole is designed to externalize the binaural effect using loudspeakers. The Ambiopole reproduces only direct stage sound and microphone-captured frontal ambience. With the loudspeakers directly in front of the listener and, ideally, head spaced, there is only a negligible head shadowing effect, no pinna angle error for the key central part of the stage, no need for phantom imaging, and no HRTF compensation required. Since the signal delivered to the ears is truly normal binaural, there is no serious limitation on stage width up to about 150 degrees, at which point the pinnae begin to say the stage is narrow despite the binaural interaural time and intensity cues. [14]

Even if the source material is not acoustic, or is synthesized using virtual reality methods and panning algorithms, it is still better to reproduce such music without stereo psychoacoustic distortions such as crosstalk and pinna angle error. Such electronic music can also be convolved to set it in a pleasing, lively ambience.

Of course, as is inherent in earphone binaural listening, some means must be used to keep the left and right signals separate at each ear. A straightforward method proposed and tested in [1, 14] was a simple mechanical barrier, and for perfectionists this is still a valid way to go. But now we have crosstalk cancelling software and fast DSP algorithms that can do the job in real time. [10, 11] In contrast to earphone binaural, no head tracking is required. In contrast to most earlier crosstalk elimination methods, no HRTF filters are required. One is free to move one's head without prejudice, just as in a concert hall. One can also get up and walk about within the circle of surround speakers (ideally horizontal line sources or panels) but still feel that one is in a hall with a stage up front.
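To make the idea of HRTF-free crosstalk cancellation concrete, here is a minimal sketch of one simple form: a recursive canceller in which each speaker also emits a delayed, attenuated, inverted copy of the other speaker's output. It is not necessarily the software cited in [10, 11]; the function name and the `delay` and `gain` values are hypothetical and would be tuned to the speaker spacing in a real system.

```python
def cancel_crosstalk(left, right, delay, gain):
    """Recursive delay-and-attenuate crosstalk canceller (illustrative only).

    Each output sample is the input minus `gain` times the opposite channel's
    OUTPUT from `delay` samples earlier, so every cancellation signal is itself
    cancelled at the far ear in turn.
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    n = len(left)
    out_l = [0.0] * n
    out_r = [0.0] * n
    for i in range(n):
        from_r = out_r[i - delay] if i >= delay else 0.0
        from_l = out_l[i - delay] if i >= delay else 0.0
        out_l[i] = left[i] - gain * from_r
        out_r[i] = right[i] - gain * from_l
    return out_l, out_r


# A single click in the left channel only (toy values).
left = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0] * 7
out_l, out_r = cancel_crosstalk(left, right, delay=2, gain=0.5)
```

In this toy run the right speaker answers the left click with an inverted, quieter pulse two samples later, the left speaker then answers that pulse, and so on with decaying amplitude.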

The ideal Ambiopole uses speakers that are line or point sources, that do not spray sound to the floor, rear wall, or ceiling (causing bogus early reflections), and that, if two- or three-way, are time coherent so as to make the crosstalk cancellation more accurate. With the speakers so close together, the sweet area is larger than the critics of crosstalk cancellation usually suggest and is comparable to the size of the sweet spot in a high-end stereo system. In experiments I have done, horizontally omnidirectional Ambiopoles enlarge the sweet area enough to allow two people to sit side by side, and one can move along the center line eight feet or so without losing the stage, which is seldom possible using the stereo triangle or LCR arrangements. I have also successfully used two-meter-tall, one-meter-wide full-range electrostatic panels that are slightly concave. These focus sound at the listening area and behave like collimated sound sources. It is likely that shaped NXT panels may also prove quite practical in this application.

The Ambiophonics Institute also strongly recommends room treatment and room/speaker correction. DSP-based room correctors are now widely available and can correct most speaker responses and eliminate the worst of the bass room modes. At high frequencies, absorbent room treatment is useful to avoid erroneous early reflections of direct sound. However, the presence of four (or preferably more) convolved surround ambience speakers mostly swamps the reverberation time (Rt) of the room. The small room essentially re-reverberates the hall tails, adding a few tenths of a second to the convolved reverberation time, which is difficult to detect. However, a similar case cannot be made for spurious early room reflections, and so room and recording studio treatment remains highly desirable.

Software add-ons to the Ambiopole software can also be used to compensate for the microphone technique used to make the recording, such as ORTF, spaced omnis, etc. [11] Once you have heard what an Ambiopole, combined with room correction and Ambiovolver surround ambience, can do for ordinary LPs, CDs, SACDs, or DVDs, it is hard to be satisfied with stereo, 5.1, or 7.1 sound reproduction. [11]

The Ambiophone

If you know that the hall ambience for the surround speakers is going to be convolved from a real hall impulse response, then it is clear that one only needs to record the direct sound from the stage. We also know that binaural technology obviates the need for a center speaker and, in any case, the use of an Ambiopole makes recording such a signal futile.

Remember the basic precept for "you are there realism". You want to deliver to the ears of the home listener the same sound that he would have heard had he been at the position of the recording microphone during the performance. In the case of a symphony orchestra being recorded in a real concert hall there is no reason not to put the recording microphone at the best seat in the hall, say fifth row center. But what kind of main two-channel microphone are we talking about?

First, we do not want this microphone to pick up any sound from the sides, rear, or ceiling of the hall, because the bulk of the hall ambience will be recovered by convolution and we do not want any off-stage early reflections or reverberation tails coming from the front speakers during reproduction. So let us assume that the microphone sitting in the fifth row is reasonably baffled to the rear, sides, and overhead with high-frequency sound absorbing material. If the microphone pair is baffled this way, then we can use uncolored, high-quality omnidirectional microphones to good effect. We can also ignore such parameters as the recording angle and indirect ambient pickup in deciding on placement.

Since this is a binaural technology, the microphones are naturally head spaced. A sound from the center of the stage is picked up and goes from each microphone directly to the corresponding speaker in front of the equivalent ear of the listener. The perspective that the listener will hear is the same as that of the microphone, and depth cues remain largely intact. So far we have adhered to the basic rule of Ambiophonics that there be only one set of pinnae in the chain, the ultimate listener's. Combine Figures 1 and 3.

We observe that in the case of central stage sound there is no erroneous head shadow involved since the sound sources, the microphones and the speakers are front, center and in line so that very little sound is colored by having to go around the head to reach a pinna.

However, when a sound originates from the side of the stage there will be no head shadow function (or interaural level differences, in the case of omnis), and the rule is that there must be at least one and only one. The answer is to mount the two omnidirectional microphones on the surface of a head-shaped ball so that as sound from the sides of the stage impinges on the microphone position a head shadow will be produced (Fig. 3). On reproduction the sound will still be coming directly from the front, so there will not be a second head shadow introduced. Such a microphone already exists and was conceived by Günther Theile [15]. It is called the Theile Sphere or the Schoeps KFM-6. There is also no reason why an Ambiophone or KFM-6 using cardioids would not also work. Although their use could exaggerate the head shadow and thus distort the far sides of the stage image, this would be an area where recording engineers could continue to express their artistic sensibilities.

While a microphone shaped less like a sphere and more like an average head without outer ears might be preferable, head shadowing is not as critical as pinna function. The reason seems to be that sound passes around the head over the top, around the back, under the chin, past the nose, and in many more different ways as the head is moved. [4] Thus it is largely the delay and attenuation that are significant in head shadowing, not a particular individual's pattern of peaks and nulls as in the case of the pinna. Remember, realism not absolutism.
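The delay part of head shadowing can be estimated with the classic rigid-sphere (Woodworth) approximation for a distant source, ITD = (a/c)(θ + sin θ). The sketch below is only an illustration of that textbook formula: the function name, the 0.0875 m radius, and the 343 m/s speed of sound are assumed values, not measurements of any particular sphere microphone.

```python
import math


def sphere_itd_seconds(azimuth_deg, radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference for a distant source around a rigid sphere.

    Woodworth approximation, valid for azimuths from 0 (straight ahead)
    to 90 degrees (directly to one side). Radius and sound speed are
    assumed typical values.
    """
    theta = math.radians(azimuth_deg)
    return (radius_m / speed_of_sound) * (theta + math.sin(theta))


# Delay for a source at the edge of a 150-degree-wide stage (75 degrees off center).
edge_of_stage_itd = sphere_itd_seconds(75.0)
```

With these assumed values the edge-of-stage delay comes out at a little under 0.6 ms, and it shrinks smoothly to zero for a source at center stage.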

Sounds that come from the extreme sides of the stage will, during reproduction, reach the pinna from the wrong angle. Thus this microphone and speaker technique cannot produce a stage width of a full 180 degrees for live recordings. But up to 150 degrees or so, the naturalness of the stage and the ease of localization are quite apparent. One can observe that at angles around 70 degrees to the side, the ear canals are not much shadowed by the outer ear and the pinna responses are smoother. [8, 9] The pinnae seem to be meant to be somewhat more sensitive to sound from the front and rear middle, so if there is going to be an error in the system, it is better to have it at the front-side extremes, as in Ambiophonics, rather than in the median plane, as in stereo.

The theoretically perfect Ambiophone is thus a baffled two-channel pinnaless dummy head microphone placed at the best seat in the house. With multichannel DVD-A or SACD it would be theoretically, if not legally, possible to record with Ambiophones placed at, say, the fifth, tenth, and twentieth rows so that the home listener can have a choice of seats in the hall (Figure 3). Ideally the impulse responses from these same locations would be available to recreate the precise ambience.

I appreciate that stereo recording engineers are hopelessly addicted to spot microphones. However, in Ambiophonics it is difficult to mix spots into the two main channels and still maintain the same fifth row center perspective and include the required head shadow. In the Ambiophonic recordings produced so far, the use of a spot microphone in one of them was painfully obvious. In the case of studio recordings of relatively small pop ensembles, the use of spots would make less difference. As discussed below, the use of Ambiophonic studio monitoring systems would likely reduce the demand by musicians for spot microphones.

An Ambiophonic spot microphone method has been proposed by Studer and Prof. Angelo Farina. One measures the impulse response of the stage at each point where a directional spot microphone is to be placed. This is done by launching a test signal from the stage to the Ambiophone and measuring the two-eared impulse response. After the recording is made the spot microphone signal can be convolved with each impulse response and added in to the appropriate left and right main Ambiophone channels.
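The spot microphone procedure described above can be sketched as two convolutions and two sums. The function names, the `gain` parameter, and the toy sample values below are illustrative assumptions; the source does not specify levels or any particular implementation.

```python
def convolve(x, h):
    """Direct-form discrete convolution of two sample lists."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def add_padded(a, b):
    """Sample-by-sample sum, treating the shorter list as zero-padded."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(n)]


def mix_spot(main_l, main_r, spot, ir_l, ir_r, gain=1.0):
    """Place a spot microphone signal into the main pair.

    ir_l and ir_r are the two-eared impulse responses measured from the spot
    position on stage to the left and right capsules of the main microphone.
    """
    spot_l = [gain * v for v in convolve(spot, ir_l)]
    spot_r = [gain * v for v in convolve(spot, ir_r)]
    return add_padded(main_l, spot_l), add_padded(main_r, spot_r)


# Hypothetical toy data: the spot source reaches the left capsule one sample
# after emission and the right capsule two samples after, at a lower level.
main_l, main_r = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
spot = [1.0, 0.0]
ir_l, ir_r = [0.0, 0.5], [0.0, 0.0, 0.25]
mixed_l, mixed_r = mix_spot(main_l, main_r, spot, ir_l, ir_r)
```

Because the spot signal passes through the measured left and right responses, it arrives in the mix with the same delay and level differences between channels as a source at that stage position would have had at the main microphone.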

I should observe however, that for concert hall realism this added complexity should never be needed. If the conductor thinks an instrument is not prominent enough during a concert or recording session he/she has presumably done something about it by moving the instrument forward or using risers. One could also elevate the Ambiophone to get a better view of the stage than most of the audience has. In general, trying to improve on concert hall practice is likely to please only some of the listeners some of the time.

As is the case in Ambisonic recordings made with a Soundfield microphone placed at the best seat in the house, an orchestral concert recorded by Robin Miller of Filmaker Inc. from a 10th row seat in the hall (well beyond the critical radius) using an Ambiophone has a normal stage perspective. Both Ambisonic and Ambiophonic recordings demonstrate that close-up main and spot microphones, which always give a conductor's view, are not necessary, although that view may be preferred in some situations.

Monitoring Ambiophonic Setups

Even if you only want to monitor stereo or 5.1 recording and processing setups, an Ambiophonic layout can be advantageous. The most efficacious Ambiopole arrangement is to use a mechanical [14] version of the stereo dipole (Fig. 4). This is simply two very small high-quality monitors, such as satellite speakers, placed against either side of a three-foot-square, one-inch-thick wooden panel. When you listen at the far edge of this panel to the front stage sound, it is very easy to hear what the perspective is, what the stage width seems to be, and any anomalies such as echoes. The clarity of such a monitoring arrangement makes it easier for musicians to judge the quality of a trial playback. In the limited experience we have had so far with this, the performers have been quite pleased and asked for fewer changes than is normally the case in such situations.

Surround speakers should always be on and in use when monitoring the live microphone setup or a trial playback. This is good practice whether the recording is stereo, 5.1, or Ambiophonic. A simple four- or six-channel hall surround convolver, driving small speakers in the control room, makes it much easier to judge the quality of the sound being captured and facilitates getting a stamp of approval for both musical and engineering efforts (Fig. 4).

I believe that once recording engineers become familiar with the use of Ambiovolvers, they will find it much more realistic, trouble free, and cost effective to derive the surround signals for standard 5.1 or 6.0 recordings from the impulse response of the hall than to site microphones and hope to capture hall sound during the live session. Once the classical music public becomes used to programming their own halls at home, two-channel live concert and opera recording engineers will have much less to fret about.

