by Ralph Glasgal

Ambiophonics, 2nd Edition
Replacing Stereophonics to Achieve Concert-Hall Realism

By Ralph Glasgal, Founder Ambiophonics Institute, Rockleigh, New Jersey www.ambiophonics.org

Chapter 7

Ambience Convolution

One precept of Ambiophonics is that for music one should be sure to surround ordinary two channel discs with a fully directional reverberant field.  For existing recordings, one uses the techniques described below to produce hall sounds for surround speakers.  The Ambiophonic alternative is to use the Ambiophone (described elsewhere) to record the rear half of the hall and then feed this to a rear Ambiodipole so as to generate the rear half circle of hall sound.  But this method requires four channel media, new recordings, a special microphone and does nothing for the huge existing library of CDs, LPs, etc.

Again, one of the main precepts of Ambiophonic theory is that where music recording is concerned, it is counterproductive to record concert hall ambience during a recording session using microphones and then waste DVD/SACD/MP3/.wav/5.1 bandwidth delivering this defective ambience to the home listener. To understand why this is so, it is necessary to review what we know about how concert halls, opera houses, recital halls, churches, recording studios and rock pavilions operate. A concert hall, theater, or other auditorium is essentially an analog computer. What this hall computer does is operate on (convolve) each ray of direct sound originating on the stage to transform it in amplitude, frequency response, and direction before delivering it to a given seat in the audience area as hall ambience. (In a good hall, without obstructions, we can assume that the original direct sound reaches most of the seats without passing through the analog computer of the hall.) If we consider every seat in the hall, the number of such equations is almost infinitely large, but for our purposes we can assume that we are only interested in what this computer is delivering to one or two of the best seats in the house from the left, right and central areas of the stage.

If we now put a measuring device at this best seat and launch a series of test signals from say three positions on the stage it should be possible to determine the most significant equations used by this concert hall computer to deliver ambient sound to this area. Indeed, this is not only possible but can now be done with such finesse that it obsoletes every other method of recording or delivering surround sound for music to the home listener.
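As a rough illustration of what "the equations of the hall" amount to once measured, the sketch below treats one measured response (one stage position, one seat, one direction) as a list of gains indexed by delay and applies it to a dry signal by direct convolution. The function name and the toy numbers are assumptions made for the example, not measurements of any real hall.

```python
def convolve(signal, impulse_response):
    """Direct convolution: every input sample launches a scaled,
    delayed copy of the impulse response into the output."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Hypothetical toy response: index = delay in samples,
# value = gain of the reflection arriving at that delay.
hall_ir = [0.0, 0.0, 0.5, 0.0, 0.25]

dry = [1.0, 0.0, -1.0]             # a few samples of direct sound
ambience = convolve(dry, hall_ir)  # signal for one surround speaker
```

A practical convolver would use FFT-based block convolution, since real hall responses run to hundreds of thousands of samples; the direct form is shown only because it makes the idea easy to follow.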

The equations that a hall uses to deliver sound to an audience are usually invariant for the duration of not only that performance but over the lifetime of the hall barring serious renovations. Once the equations of a hall are known, there is little point in measuring them every time that space is used to make a recording. We ignore here the slight variations in hall responses depending on the size of the audience present when the hall is measured. I should add that the latest methods of measuring hall responses make it possible to measure halls with the audience there without making them too uncomfortable or straining their patience. There are some who believe that hall impulse responses will soon be measurable while a concert is in progress.  Unfortunately in the case of movies where the scene changes frequently, this method of surround sound generation is not feasible.  However there is a viable alternative to consider later.

Why Recording Hall Ambience Directly For Surround Speakers Using Microphones Is Not Possible to High-Fidelity Standards

In a concert hall, early reflections and reverberation tails reach a listener from all directions. But in good halls this ambience is not the same in all directions. That is, there is a strong interaural directional component present that interacts with the shadowing function of the head and the pinna structures to allow the hall to be appreciated in all its glory by concertgoers. At home it is necessary to deliver as many of these hall elements as possible without compromise as to these directional ambience components. If the direction from which hall sound comes were not important then reverberation could simply be fed into the front stereo speakers and no surround ambience speakers would be required. But after seventy years of the stereo triangle era, it is clear that doing this can never sound realistic.

Of course, it is laughable to think that the two or even three surround speakers of the 5.1/6.1 Dolby/DTS/Blu-ray arrangement could deliver a reasonable replica of what a concert hall does. But even if we ignore this issue for the moment, how do we get the 5.1 signals required to drive the two surround speakers, or the three if a centered rear speaker is used? The recording engineer needs to set up two or three microphones in the hall for the express purpose of generating signals for these speakers. But where in the hall should he place these extra microphones? Answer comes there none. But worse than this ad hoc decision is the fact that most microphones are not very directional. Thus if a pressure microphone is used it will, say, pick up all the early reflections and reverberation tails coming from the ceiling, the sides and the rear and lump them all together to later come out of a surround speaker whose location at home and radiation pattern is anybody's guess. Cardioids and velocity microphones are more directional, but which way should they point? Invariably, proscenium early reflections will end up coming from the side or, even worse, the rear, and ceiling ambience will be arriving from ear level, etc. Mixing several mics together does not solve the problem. Of course, many surround tracks are made without benefit of any microphones (using the Lex in record producing parlance) because of these and cost problems. We will see below that before too much longer the virtues of deriving the surround channels from hall impulse measurements rather than microphones will be quite apparent to all music, if not video, recording engineers. Another issue is that this ambience, being recorded willy-nilly somewhere in the hall, does not represent the reverb one would be hearing at the best seat in the hall or indeed any seat unless the ambience microphones are all quite close together about that seat.

Some Ancient History for Skeptics

Once one decides that hall ambience is indeed needed to perfect the reproduction of 2 channel (or 5.1 for that matter) recordings so as to produce a "you are in a concert hall" experience, there are only two ways to go. One is to pick a fine concert hall, construct a model of it at home and put two loudspeakers on its stage. That this technique does work was demonstrated conclusively several times in Carnegie Hall and Carnegie Recital Hall in the 1950's by Gilbert Briggs of Wharfedale Loudspeakers, and most notably by Edgar Villchur, the founder of Acoustic Research. I attended live-versus-recorded presentations by both these gentlemen in New York and not only could I not tell when the live musicians ceased playing and the recording took over, but almost no one else in the sold-out house could either, judging from the gasps and buzz in the audience when the string quartet players finally put down their bows and the music played on. The fact that such an illusion could be created with low-powered vacuum-tube amplifiers and excellent but still relatively primitive loudspeakers should have tipped us off to the fact that ambience is essentially everything, and equipment quality relatively insignificant where realism is concerned.

It is possible, if impractical, to construct a smallish room that would closely mimic the ambience of Carnegie Hall, at least in the central listening area. The use of modern diffusers, absorbers, and ceiling and floor treatments could produce the reverberation time, reverberant-field frequency response and even the early reflection pattern of any good concert hall. It would then be possible to play recordings in such a room to excellent effect. The advantages of this approach include the fact that such a room would also be excellent for live music soirees as well.

The disadvantages of this approach, for the reproduction of recorded music, are several and instructive. The costs of designing, constructing and tuning such a room are beyond the reach of those of us not direct descendants of Andrew Carnegie. One would also lose the flexibility of being in other acoustic settings such as churches or recital halls. Both Briggs and Villchur used their own recordings, carefully made to avoid any recording-site hall coloration. Finally, the problem of stereo signal crosstalk would remain for most listening positions. In the Briggs Carnegie Hall demonstration (which, I believe, used mono recordings), most listeners in this very large hall were exposed mainly to the reverberant field and their visual senses substituted for any missing or weak directional sound cues.

Characteristics of an Ambient Field

Basically, the only things you can do to a sound wave launched in an enclosed space are attenuate it, usually as a function of frequency, or change its direction. Absorption is a form of extreme attenuation. But sound loses intensity merely by traveling a distance through air. A characteristic of attenuation is that it is almost always frequency sensitive, with higher frequencies usually rolling off more than lower frequencies, in air, with distance, or in sound absorbing material. Sound changes direction whenever it encounters an obstruction, usually by reflection as light does (specular reflection), or by diffraction, which is a process by which sound waves sort of ooze around obstacles. As in attenuation, reflection and diffraction are frequency sensitive, with higher frequencies usually being easier to steer or control. Thus every space, but especially a concert hall, can be described acoustically in terms of its attenuation characteristics and its three-dimensional reflectivity pattern as a function of frequency, direct sound-source position, time, and listener-seat location. Our problem is then to either measure these functions in the real halls we like and recreate them via surround speakers in our listening room, or design in software a pleasing but entirely new hall that may not exist physically. Both of these approaches are possible using the early JVC or Sony hardware convolvers or the software methods discussed below. It is also always possible to start with a real hall and modify it to taste as you listen to your favorite music.

We need to be able to create any kind of acoustical signature we like within our treated listening room. We have to be smart enough to invent a hall ambience processor that can generate any field we or the recording engineer want. There is no reasonable alternative to using a special-purpose computer to generate the early reflections and reverberation trains. The only major issue still to be resolved is who should control or own the convolver: the record producer, or the home audiophile. But we need more technical background to decide this issue.

Early Reflection Parameters

To produce a realistic group of early reflections, a computer or digital signal processor needs to recreate and vary the following parameters separately for the left and right stage sounds. These items determine how big the hall is, what its shape is (such as rectangular, fan or low ceilinged), how large the proscenium is, etc.

  • The delay between the direct sound and the arrival of its first reflection
  • The delay of the second and subsequent early reflections and their density
  • The frequency response of these discrete early reflections
  • The initial amplitude and rate of amplitude loss for the subsequent reflections of these very early reflections
  • The source of each reflection: front, side, rear, left, right, up, down, etc.

Normally these parameters are measured in real halls, churches and opera houses and then stored in memory. If the stored reflection patterns are not pleasing, then they can always be modified to taste. Tweaking such parameters can be a lifetime occupation, as it is with some famous concert halls that are forever being tinkered with.
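To make the parameter list above concrete, here is a minimal sketch of how a table of discrete early reflections might be stored and turned into a sparse impulse response for one surround speaker. The `Reflection` record, the 48 kHz sample rate, the 45-degree speaker capture width and the example values are all assumptions for illustration. Frequency response per reflection is left out to keep the sketch short.

```python
from dataclasses import dataclass

SAMPLE_RATE = 48000  # Hz, assumed

@dataclass
class Reflection:
    delay_ms: float       # arrival time after the direct sound
    gain: float           # linear amplitude relative to the direct sound
    azimuth_deg: float    # 0 = front, 90 = right, 180 = rear, 270 = left
    elevation_deg: float  # 0 = ear level, 90 = overhead

def reflections_to_ir(reflections, length_ms):
    """Place each reflection as one scaled tap in an otherwise silent response."""
    ir = [0.0] * (int(SAMPLE_RATE * length_ms / 1000) + 1)
    for r in reflections:
        ir[round(SAMPLE_RATE * r.delay_ms / 1000)] += r.gain
    return ir

def for_speaker(reflections, speaker_azimuth_deg, width_deg=45.0):
    """Keep the reflections arriving within width_deg of a speaker's azimuth."""
    def angular_distance(a, b):
        return abs((a - b + 180.0) % 360.0 - 180.0)
    return [r for r in reflections
            if angular_distance(r.azimuth_deg, speaker_azimuth_deg) <= width_deg]

# Hypothetical reflection list for the left stage signal.
measured = [
    Reflection(20.0, 0.5, 90.0, 0.0),    # right side wall
    Reflection(35.0, 0.3, 270.0, 0.0),   # left side wall
    Reflection(50.0, 0.2, 180.0, 30.0),  # rear, above ear level
]
right_speaker_ir = reflections_to_ir(for_speaker(measured, 90.0), length_ms=100.0)
```

Modifying a hall "to taste" then reduces to editing the delays, gains and directions in the table before the response is rebuilt.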

Reverberation Tail Parameters

After the early reflections become so dense and weakened that the ear is no longer sensitive to their individual arrival times, the reverberant characteristics of the space become evident. The reverberant parameters that need to be recreated by a convolver separately for the left and right signals include:

  • Reverberation decay envelope for high frequencies
  • Reverberation decay envelope for low frequencies
  • Frequency responses for the front, side, rear, overhead, etc. tails with time
  • Density of the reverberant field
  • Directional characteristics of the reverberant tails

If early reflections persist for a relatively long time before the reverberant field begins, then the space will be perceived as live and possibly large. If the reverberation time is long then the hall will seem live, or if very long, cathedral-like. High-frequency rolloff in the reverberant field also makes the hall seem larger. The directional distribution of the reflections and the reverberant echoes help listeners determine the shape of the space and their position in it.
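The decay envelopes in the list above can be sketched with the usual reverberation-time convention, in which the level has dropped 60 dB (an amplitude factor of one thousandth) after the reverberation time has elapsed. The code below is a simplified illustration under that assumption: it shapes broadband noise with a single envelope, whereas a real tail would be split into frequency bands with a separate decay for each. The function names and example times are made up for the sketch.

```python
import random

def decay_gain(t_seconds, rt60_seconds):
    """Amplitude envelope that has fallen by 60 dB after rt60_seconds."""
    return 10.0 ** (-3.0 * t_seconds / rt60_seconds)

def noise_tail(rt60_seconds, sample_rate=48000, seed=0):
    """Exponentially decaying noise as a crude stand-in for a dense reverberant tail."""
    rng = random.Random(seed)
    n = int(rt60_seconds * sample_rate)
    return [rng.uniform(-1.0, 1.0) * decay_gain(i / sample_rate, rt60_seconds)
            for i in range(n)]

# Assumed example values: low frequencies ring for 2.5 s, highs for 1.5 s.
low_at_1s = decay_gain(1.0, 2.5)
high_at_1s = decay_gain(1.0, 1.5)   # smaller: the highs have rolled off more
tail = noise_tail(2.0, sample_rate=8000)
```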

Again, rather than attempt to program all this from theoretical scratch, it is more practical and likely desirable to measure several good existing halls and store the results. The Japanese, and JVC, Yamaha, and Sony in particular were the pioneers in doing just this. The JVC XP-A1010 Digital Acoustics Processor, circa 1989, seemingly the first really commercially produced convolver, (abandoned in haste when 5.1 movie surround sound took over) stored within its memory the key parameters of fifteen actual halls including six symphony halls of various shapes and sizes, an opera house, a recital hall, a church, a cathedral, two jazz clubs, a gymnasium, a rock pavilion, and a stadium. The Sony professional convolver was the first of a later generation to appear. Sony produced four CD-ROMs each storing some eight impulse responses of the great halls and other enclosed spaces of Europe, Japan and America.

Impulse responses and convolution are techniques that have been proven indispensable in designing new halls that work the first time a note is played in them. The new concert hall of the Tokyo Opera City was designed using computer simulations and a one tenth scale model that allowed Leo Beranek and Takahiko Yanagisawa to hear what the hall would sound like before it was built. They could hear how the sound changes with the location of a seat in the hall, or with the addition of a diffusion cloud, or changes in the shape of the hall, etc. Such hall characteristics as intimacy, clarity, spaciousness, bass ratio, could then be adjusted to match the characteristics found desirable in existing great halls.

However, audiophiles can have an advantage that architects can only dream about. Our halls are not cast in diffuser wood. For if great halls can be simulated to such perfection using convolvers and auralization then why build the hall physically? The hall we simulate on our home computer should sound every bit as good as the one being constructed or better since we can vary our at-home halls to better suit the music being played or just to suit our mood. Perhaps we can even make a hall within our home that sounds better than any Leo Beranek could convolve and then construct.

Adjusting Ambience Parameters for Ambiophonic Listening

To play a recording Ambiophonically, using a convolver, one first consults the recording booklet or jacket to see what acoustic space it was recorded in. Was it a studio, a church, a concert hall, an opera house, a recital hall, a theater, etc.? Good recordings include frontal proscenium early reflections and reverberation that naturally should come from the front main speakers. Therefore for best results it is desirable to select that hall if it is in your library or use a hall that sounds as much like the recorded hall as possible. You can do this quickly with a little practice by listening to the main front channels with the surrounds switched off, and estimating the reverberation time of the hall, which in most concert halls or opera houses is from one and one-half to three seconds. Then estimate other hall characteristics such as liveness and capacity. You then select the stored hall that best matches your research or assumptions. You can also program your guesses directly, bringing up the surround speaker volumes one at a time to the levels that sound most realistic. Such settings can, of course, be stored and recalled at any time.
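The selection step described above, matching an estimated reverberation time against a library of stored halls, could be automated along these lines. The library below is purely hypothetical: the names and times are placeholders, not measurements of any real venue.

```python
# Hypothetical library: placeholder names and reverberation times in seconds.
STORED_HALLS = {
    "recital_hall": 1.2,
    "concert_hall": 2.0,
    "cathedral": 4.5,
}

def closest_hall(estimated_rt60, halls=STORED_HALLS):
    """Return the stored hall whose reverberation time is nearest the estimate."""
    return min(halls, key=lambda name: abs(halls[name] - estimated_rt60))

choice = closest_hall(1.8)   # estimate made by ear from the front channels
```

Reverberation time is only one of the characteristics mentioned; a fuller match would also weigh liveness and hall size.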

Convolvers can also be told to compensate for the fact that, some of the time, recorded hall reverberation is being re-reverberated and that some rear ambience is coming from the front speakers. When I first started experimenting with the Ambiophonic method I thought this erroneous reverb might be a serious drawback as far as playing existing recordings was concerned. However, it is easy to see why this is not the case. The small amount of extra rear reverb coming from the main front speakers is quite overwhelmed by the ambience from the hopefully many surround speakers. Also, it is not unusual for a physical hall to re-reflect rear ambience from the proscenium. All that this extra frontal reverb means is that the hall is a little bit livelier than the impulse response suggests. Since this is an easily adjusted parameter it can be corrected for if anyone really hears this effect.

If the recording has reverb mixed into the direct sound, as most recordings do, the convolver will convolve this ambience as if it were a direct sound signal, generating additional ambience. What does this really mean however? It simply means the convolved hall now has a longer reverberation time than we meant to set and that the decay at the end of the tail is not as steep. In physical terms it means the hall has had an additional diffusion cloud installed. This is also an easily corrected condition but, even if left uncompensated for, it seldom is audible even by golden eared audiophiles. The convolver adjustment process becomes instinctive after a while and usually takes less than a minute. Compulsive tweakers could, of course, make ambience parameter adjustment their life's work as there are numerous ways to control volume, delay, hall type, decay and frequency response characteristics for each surround speaker individually and each direct sound channel of which there are hopefully only two or four as in Panambio. The saving grace, which prevents tweak insanity is that once the ambience sounds real and reasonably suits the music and the recording, maybe it can still be improved, but real is real. I have found that minor adjustments seem to change only my perceived position in the hall.

Someday Ambiophonic recordings for the audiophile market will be made without significant recorded rear hall sound, the recommended hall parameters will be printed on the label and the CD or DVD will contain coding to automatically operate the convolver. As part of the research for this book, I listened to hundreds of recordings, both LP and CD. To paraphrase Will Rogers, I never met a classical recording (jazz is too easy) I couldn't work wonders with. The most exciting discovery was that monophonic LPs (or CD versions) even from the 1920s could be made to sound exceptionally realistic in an Ambiophonic room. The reason for this seems to be that many early mono recordings, particularly acoustics, have very little recorded room reverberation, making it easier to create a realistic sound field to place them into. Also, the absence of a stereo effect in the presence of well-tailored hall ambience tells the ear/brain system that the source is distant. Thus, for large mono ensemble sound sources the listener appears to be in the balcony of a large hall, but balcony or not, real is real.

Because of the cocktail party effect, needle scratch or frequency-response aberrations become minor distractions, and Caruso, Toscanini, or Melchior never sounded so thrilling or three-dimensional before, and the Caruso recordings are over 100 years old.

Measuring Real Concert Hall Ambient Fields

Only three convolvers worthy of the name have ever been commercially available. All are Japanese: one from JVC, one from Sony and one from Yamaha. Although the JVC unit (like the others) is no longer available, its technology is still of paramount importance. A group of researchers in 1987 at the Victor Company of Japan (JVC) headed by Yoshio Yamazaki and including Hideki Tachibana, Masayuki Morimato, Yoshio Hirasawa, and Junichi Maekawa, developed what they called a symmetrical Six-point Sound Field Analysis Method for measuring the acoustic characteristics of a concert hall. In their measurement method, an array of six microphones is placed at a good seat in the hall and a series of test impulses is launched from one or more points on the stage. All six microphones are omnidirectional and are arranged in three pairs. The microphones in each pair are spaced about six inches apart. One pair of microphones straddles the mounting pole horizontally, left to right, one mounts front to back in the same plane and one pair sits up and down. The center points or origins of each microphone pair are coincident. The impulses, or test patterns launched from the front stage, that each of these microphones hears then go to a computer which produces a list of all the discrete early reflections detected by the array, including their time of arrival, their amplitude and their direction of origin.

That such an array can detect all this information is not too hard to understand. For example, any impulse coming from center rear will hit the vertical pair of microphones and the left-right pair of horizontal microphones simultaneously. The front-to-back pair will experience the maximum possible back-to-front delay of about 0.4 milliseconds. Thus when the computer detects such a situation it records that a center rear reflection has been received. Likewise a direct impulse from overhead will only produce a time delay in the vertical pair of microphones and a reflection from the side will only show delay in the left-to-right pair. No matter what angle a reflection arrives from, its amplitude and direction can be computed and stored.
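The delay figures above follow from simple geometry. The sketch below assumes a plane wave, a speed of sound of 343 m/s and the six-inch spacing mentioned earlier, and computes the arrival-time difference across each microphone pair for a given direction. The function name and the sign convention (positive when the front, right or upper microphone is reached first) are choices made for this example.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature value
SPACING = 6 * 0.0254     # six inches, in metres
MAX_DELAY_MS = 1000.0 * SPACING / SPEED_OF_SOUND   # roughly 0.44 ms

def pair_delays_ms(azimuth_deg, elevation_deg):
    """Arrival-time difference across each pair for a plane wave.
    Azimuth: 0 = front, 90 = right, 180 = rear. Elevation: 90 = overhead."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Direction cosines of the source relative to the three array axes.
    front = math.cos(el) * math.cos(az)
    right = math.cos(el) * math.sin(az)
    up = math.sin(el)
    return {
        "front_back": front * MAX_DELAY_MS,
        "left_right": right * MAX_DELAY_MS,
        "up_down": up * MAX_DELAY_MS,
    }

rear = pair_delays_ms(180.0, 0.0)      # only the front-back pair shows delay
overhead = pair_delays_ms(0.0, 90.0)   # only the vertical pair shows delay
```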

In a real concert hall many reflections may be arriving simultaneously, so how did the gentlemen from Japan sort them out? First, each reflection of, say, a particular impulse generates a signal in all six microphones. All six signals, attributable to a single source, will have essentially the same peak amplitude since the microphones are so close together. Thus any unequal peaks indicate a collision of two or more reflections. Second, the times it takes for a sound to go from one microphone of a pair past the mounting pole to the other microphone of the same pair are symmetrical about the common origin for all the pairs. Thus all three microphone pairs should record peaks that are symmetrical in time about the same origin, but with three different spacings depending on the angle of travel. Thus unequal delay to and from the origin indicates an impulse collision. Finally, the ratios of these three delays define the angle to the reflection source, and it happens that for such an orthogonal array, the squares of the cosines of the angles between the direction of the impulse source and each of the three axes add up to one. These three characteristics of the impulses detected by the microphone array represent three simultaneous equations which, when solved, allow a computer to distinguish between two or even three simultaneous or very closely arriving reflections. Since this measuring technique is relatively portable, the JVC team was able to make accurate measurements of halls like the large and small Concertgebouw of Amsterdam, the Alte Oper in Frankfurt, the Beethovenhalle in Bonn, the Philharmoniehalle in Munich, the Staatsoper in Vienna and the Koln Cathedral.
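The direction-cosine check described above can be sketched as follows. This is an illustration of the geometric idea only, not a reconstruction of the JVC team's software: each pair delay divided by the maximum possible delay gives one direction cosine, a single reflection must have squared cosines summing to one, and a set of delays that fails this test is flagged as a probable collision. The function name, tolerance and angle conventions are assumptions.

```python
import math

def direction_from_delays(front_back, left_right, up_down, max_delay, tolerance=0.05):
    """Return (azimuth_deg, elevation_deg), or None if the three delays
    cannot belong to a single plane wave."""
    cosines = [d / max_delay for d in (front_back, left_right, up_down)]
    if abs(sum(c * c for c in cosines) - 1.0) > tolerance:
        return None   # squared cosines do not sum to one: likely a collision
    front, right, up = cosines
    azimuth = math.degrees(math.atan2(right, front)) % 360.0
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, up))))
    return azimuth, elevation

# A reflection from center rear: full negative delay on the front-back pair only.
rear_direction = direction_from_delays(-0.444, 0.0, 0.0, max_delay=0.444)

# Delays that fit no single direction are rejected.
collision = direction_from_delays(0.2, 0.1, 0.1, max_delay=0.444)
```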

Unfortunately all this brilliant pioneering effort was abruptly subverted when surround sound video systems became the preoccupation of the Japanese establishment. However, JVC did make a few hundred convolvers before the ax fell and these proved that every recording engineer could and should have such an array and PC at any recording session. The engineer could then pick the best listening seat for the array, measure the hall response and later enter the stored results directly onto a CD or DVD for loading into the home ambience convolver, probably a PC of some type with lots of DSP power. See Chapter 9 for a discussion of Ambiophones and Ambiophonic recording suggestions.

Sony Decides a Convolver Is Essential If Surround SACD Is to Flourish

Both the DVD-A and Sony's competing format, SACD, are very high resolution, music only, formats seemingly attractive to only the high-end audiophile market. With the addition of multichannel surround capability, however, a wider, more lucrative, audience could be found for these video-less technologies. It is thus clear to everybody in the industry that the future of both systems depends on being able to provide music in a multichannel surround/ambient format. Apparently, Sony decided that unless they provided a means for the industry to make surround music recordings with the same high quality as the SACD disc itself that their investment would be lost. Their problem remained, however, as indicated earlier, that no one knows how to make music surround recordings using microphones that sound realistic or pleasing enough to attract a mass market or even a niche audiophile market segment. The DVD-A group always assumed that Ambisonics would fill this requirement but Sony decided on a more realistic solution: the Sampling Digital Reverberator, DRE-S777.

A Rose by Any Other Name Is Still a Convolver

The Sony DRE-S777 was not made for home audiophile consumers. It was not a sampling digital reverberator; it was a stored hall convolver. Sony Electronics, Inc., Broadcast and Professional Company made it for professional recording engineers. It was not user friendly. It was Sony's position that hall convolution should be the province of the SACD producer and not the home listener. The idea was that the recording engineer should just make the best two channel stereo recording he could and then fabricate as many surround channels as he felt were desirable using the stored halls in the DRE-S777. Superficially, this seems like a good idea. It spares the recording engineer the onerous and expensive burden of placing ambient microphones in a hall about which he or she knows very little and for which there is no basis in the mathematics of acoustics for doing so. With a DRE-S777, after the session is over, the producer can go back to his studio and try out different hall ambience combinations and generate as many surround channels as the standard will allow.

Since, as of this writing, the only market is for 5.1 speaker arrangements, he is unlikely to configure more than two channels for surround speakers. Of course, if you don't like the hall the producer has picked or you want more than two surround ambience speakers, Sony was not interested in your problems. The advantage of having the convolver under listener, rather than engineer, control is that since the producer doesn't have to waste DVD/SACD bandwidth on ambience, he can provide direct sound for additional rear and side speakers where the composer has sanctioned such a practice. Indeed such rear or side direct sound channels can share in the ambience of the front stage since the convolver can easily accommodate such an option. The DRE-S777 was priced at five figures and so was not affordable by most home listeners. Sony produced four CD-ROMs containing the impulse responses of great halls and churches in Europe, Japan and America. One DRE could output four surround channels in real time. That is, it could convolve the left input to produce two ambient surround signals and the right channel to produce yet two more different ambient surround signals. Sony used digital signal processing chips that could process 256,000 events in the life of each input music sample. This was long enough to handle the reverberation of even the largest cathedrals.
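Two points in the description above lend themselves to a quick sketch. First, if the "256,000 events" figure is read as an impulse response 256,000 samples long (an interpretation, not something the source states outright), the tail length it can hold follows from the sample rate. Second, the two-in, four-out routing is simply four independent convolutions. The output names, the toy signals and the helper functions below are hypothetical.

```python
def tail_seconds(taps, sample_rate):
    """Duration of an impulse response of the given length."""
    return taps / sample_rate

def convolve(signal, impulse_response):
    """Direct convolution, adequate for a short illustration."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

def surround_feeds(left, right, routing):
    """routing maps an output name to (input channel, impulse response)."""
    inputs = {"L": left, "R": right}
    return {name: convolve(inputs[ch], ir) for name, (ch, ir) in routing.items()}

seconds_at_44k1 = tail_seconds(256_000, 44_100)   # a little under six seconds

# Toy responses standing in for four measured hall responses.
routing = {
    "left_side":  ("L", [0.5, 0.25]),
    "left_rear":  ("L", [0.0, 0.5]),
    "right_side": ("R", [0.5, 0.25]),
    "right_rear": ("R", [0.0, 0.5]),
}
feeds = surround_feeds([1.0], [2.0], routing)
```

Adding more surround speakers only means adding more entries to the routing table, which is why a general-purpose PC scales more easily than a fixed four-output unit.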

Four surround channels are nice but eight or more are even better. The sound from four DRE-S777s reproducing a symphony orchestra embraced in the ambience of the Konzerthaus, Berlin, via 16 surround speakers is overwhelming. But now, with modern PC processors, a single PC can convolve ambience signals for sixteen or more surround speakers.

Sony's Impulse Response Measuring Method

The usual way to measure impulse responses is to put a relatively small group of microphones at a desirable location and then aim pulses at them from various positions on the stage. In Ambisonics a coincident microphone with one omnidirectional microphone and three collocated figure eight microphones is used. However, the extraction of the ambient data using the Ambisonic approach is quite difficult compared to the six-microphone method used by JVC described above. The Sony approach was clearly related to the fact that it was the professional recording division that had been involved in this development. A preoccupation with the necessities of the 5.1 speaker arrangement was clearly in evidence. Sony used up to ten fairly widely spaced microphones to record impulse test patterns from left, right and center stage speakers. Not all ten microphones were used in every hall, but when all are present five of the microphones are omnidirectional and five are cardioids. The omnis form a rectangle with one of their number in the center. The rectangles vary with the hall but are typically 18 feet wide by 15 feet deep. While there does not seem to be any mathematical foundation for this arrangement, one can put surround speakers at the same positions around the home listening position and each loudspeaker will output the same ambience toward the listener from its location that the corresponding microphone picked up.

Unfortunately, a lot of directional information is theoretically lost using this technique compared to the JVC method. For example an early reflection coming from the rear in the hall will be aimed to the listening position from the rear/side 45-degree direction. Perhaps for this reason there is another rectangular array of five cardioid microphones. Cardioids are directional to the extent that they pick up mostly from the half sphere they face. The cardioids are arranged in a 5.1 rectangular pattern with three in front and two at the rear side corners. The cardioids are aimed at the four corners of the halls with the fifth one pointing directly front. They form a rectangle about 9 feet wide by 4 feet front to back. In this case, if a speaker is placed at say 45 degrees to the left side of the home listener and is fed the ambience picked up by this microphone from the left front hall arc, the directionality of the ambient field will be reasonably accurate. For best results one should convolve this ambience response with both the right and the left stage signals and perhaps use two speakers at the left front location or mix them together if this is more convenient. It is not clear to me why the spacing between the microphones used was so large. This appears to be a habit related to the way recording engineers have been trying to record ambience in the last few years. But with ten microphone locations to choose from it is hard to go too far wrong especially if you can afford to use all of them. Variety is the spice of ambience where concert halls are concerned. One can also argue that the ambient field at the center of either of these rectangles is probably not much different from the field near the edges since the halls are so large in comparison to the rectangles.

Some Noise Is Good Noise

When you are in a concert hall or church and the music stops, you are still in a concert hall. Even with your eyes closed you can sense a sort of ambient ambience, a murmur or acoustic dither that even without an audience present tells you what kind of acoustic space you occupy. By contrast, in the Ambiophonic hall, when the music stops you are abruptly transported from a lively, exciting space to a rather dead-sounding listening room. For CDs with many silent bands between short selections, this effect can be somewhat disconcerting. Perhaps in the future, recording engineers will avoid such quiet periods.