Music Versus Movie Recording
In contrast to movies, where the scene and the acoustic
venue change every ten seconds and planes fly overhead and the phone
rings in the back room, during a concert, musicians normally do not
move about the stage, and the acoustics of the auditorium remain stable
for the duration of the performance. Furthermore, there are no direct
sound sources off stage, and the stage width seldom subtends more than,
say, 150 degrees as seen from a good seat in the orchestra.
Thus, it makes little sense from an information theory
standpoint to make provision for capturing off stage direct sound or
recording the hall ambient response over and over for every note that
is played. If you measure the response of the hall at the best seat
in the house for several source locations on the stage, then it is not necessary
to measure and record any hall sound during the performance. One measures
the impulse response of the hall before or after the recording session,
with or without an audience, or uses a library of the impulse responses
of the great halls of the world. Then one uses a mathematical process,
called convolution, to operate on the direct sound to generate the surround
ambience that would be generated by that hall for the music being played.
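As a minimal sketch of the convolution step just described, the following plain Python applies a hall impulse response to a direct-sound signal. The function name convolve and the toy sample values are assumptions made for illustration only; a real convolver would work on measured impulse responses and would normally use FFT-based methods for speed.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Hypothetical toy data: two clicks of direct stage sound, and a made-up
# "hall" response whose first samples are zero (no direct sound),
# followed by two reflections.
direct = [1.0, 0.0, 0.0, 0.5]
hall_ir = [0.0, 0.0, 0.6, 0.3]

# Ambience signal for one surround speaker.
ambience = convolve(direct, hall_ir)
```

Because the leading samples of this assumed impulse response are zero, the generated ambience starts only after the first reflection delay, which is the sense in which no direct sound reaches the surround speaker.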
The Ambiovolver parcels out the early reflections and
the reverberant tails to the appropriate speakers in the domestic concert
hall. (Fig. 2) The process
is scalable, and the numbers, locations, and frequency responses of
the surround speakers are not critical. The process of convolution,
in contrast to microphones, ensures that no direct sound can
get into the surround speakers. Recording engineers do not have to worry
about the ratio of direct to reverberant sound in the hall or main microphones
and the main microphones, as we shall see, can be placed without regard
to the hall critical radius or the directionality of the bare microphone.
Convolution of a stored impulse response makes it unnecessary to use
microphones to record hall sound and also eliminates the need for media
(SACD, DVD) surround tracks.
In the case of Ambiophonics, as opposed to 5.1, it is
normally up to listeners to decide how great a hall they need to recreate
to be satisfied. They may select the hall and the number of ambience
speakers. Note that even a poor hall can seem as real as a good hall.
However, there is nothing to prevent recording engineers from providing
the impulse response of the hall they recommend or stating the address
of the hall in the eventual Internet library they wish the listener
to use. It is also possible or inevitable that the media player will
control the convolver in this regard. The process would then be transparent
to the unskilled user.
Another major advantage of hall ambience convolution over
trying to record and deliver multichannel surround ambience is that
both the locations and the number of the surround speakers in the home
system are flexible, scalable, and not critical. It is ludicrous to
think that two surround channels, as in 5.1, can emulate a concert hall
or provide even marginally acceptable envelopment. Damaske and Ando
[6] say five is the bare minimum, but bare minimums are not the
audiophile way. Research [7]
has shown that concert hall listeners are pleased when ambience comes
from lateral, rear, overhead and frontal directions in that order of
importance and that such ambience should be as uncorrelated or diverse
as possible. As a start in this direction, DVD-A can support four ambience
channels if 5.1 is not used, and, indeed, Chesky Records and MDG have
already released such discs.
I hope shortly to be able to report on experiments with
dual membrane electrostatic panels. Such a dual panel can emit both
ambience stimulated from the left of the stage and ambience stimulated
from the right side of the stage, convolved for the same angle, saving
speaker space and cost. Such membranes are transparent to sound and
so do not cause serious, erroneous early reflections. Alternatively,
the signals for the same surround angles can be mixed electronically
and applied to a single speaker to save on space, speakers and amplifiers.
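The electronic mixing option can be sketched as follows: the left and right stage channels are each convolved with the impulse response for one surround angle, and the two results are summed into a single speaker feed. All names and sample values here are hypothetical and chosen only to illustrate the idea.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mixed_surround_feed(left, right, ir_from_left, ir_from_right):
    """Sum the ambience excited by each stage channel into one speaker feed."""
    a = convolve(left, ir_from_left)
    b = convolve(right, ir_from_right)
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))  # zero-pad the shorter result
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


# Hypothetical toy data for one surround angle.
feed = mixed_surround_feed(
    left=[1.0, 0.0],
    right=[0.0, 1.0],
    ir_from_left=[0.0, 0.5],
    ir_from_right=[0.0, 0.25, 0.125],
)
```

The trade-off is the one named in the text: one speaker and amplifier are saved per angle, at the cost of the two ambience components no longer being radiated separately.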
But there is no known real-time microphone placement theory for four
surround speakers that can deliver the kind of realism that
audiophiles rightly demand, although Ambisonics comes close. However,
Ambisonics requires more than two media channels, does nothing for the
existing library of two channel LPs and CDs, and is more sensitive to
speaker location and response.
There is the minor question of reproducing surround applause
when concerts in front of audiences are recorded. The Ambiophonic method
will cause any rearward direct sound picked up by a microphone to come
from the front and after convolution to also correctly come from the
surround speakers. If this is considered to be a serious defect, then
it is possible for record producers to code discs in the future to mute
the front channels during applause intervals or whenever only a convolved
rearward direct sound effect is desired. Eventually disc codes would
also be able to control the Ambiovolver without having to use any additional
media channels to steer rear sound effects to specific surround speakers
without including hall reverberation.
The Psychoacoustic Basics
It is not necessary to understand precisely how the ear/brain
system works or how concert halls work if we simply deliver to the home
listener a reasonable replica of what that listener's eardrums would
have been exposed to if they were at the live concert recording session.
(Fig. 2) This, of course,
is the basic premise of binaural technology [4]
and defines Ambiophonics as a "you-are-there" methodology.
There are no shortcuts. If realism (or perhaps biological naturalness
is a better term in this context) is a priority, then you had
better ensure that in any recording/reproduction chain there will
be at least one and only one set of pinnae (your own) [1]
and one and only one head shadowing function (which need not be your
own).
Ambiopoles
It is no secret that the traditional stereo triangle encompasses
the psychoacoustic defects already mentioned above. Again, these include
a reliance on the shaky and non-linear phantom image illusion, a limitation
in stage width to the angle between the speakers, confusion of the pinna
direction finding mechanism due to comb filtering, [14]
erroneous head shadowing for central sources and localization contradictions
due to discrepancies between single pinna localization and interaural
localization cues. Details on these topics and references are available
in the lengthy but free book, downloadable from [1].
To avoid these psychoacoustic pitfalls, Ambiophonics uses
two speakers directly in front of the listener to reproduce all front
stage sound including any frontal, early reflections and proscenium
reverberant tails. The combination of these two speakers and the software
that drives them I call an Ambiopole.
(Fig. 1) An Ambiopole is designed to externalize the binaural effect
using loudspeakers. The Ambiopole reproduces only direct stage sound
and microphone captured frontal ambience. With the loudspeakers directly
in front of the listener and hopefully head spaced there is only negligible
head shadowing effect, no pinna angle error for the key central part
of the stage, no need for phantom imaging and no HRTF compensation required.
Since the signal delivered to the ears is truly normal binaural, there
is no serious limitation as to stage width up to about 150 degrees, at
which point the pinnae begin to say the stage is narrow despite the binaural
interaural time and intensity cues. [14]
Even if the source material is not acoustic, or is synthesized
using virtual reality methods and panning algorithms, it is still better
to reproduce such music without stereo psychoacoustic distortions such
as crosstalk and pinna angle error. Such electronic music can also be
convolved to set it in a pleasing, lively ambience.
Of course, as is inherent in earphone binaural listening,
some means must be used to keep the left and right signals separate
at each ear. A straightforward method proposed and tested in [1, 14]
was a simple mechanical
barrier, and for perfectionists this is still a valid way to go. But
now we have crosstalk cancelling software and fast DSP algorithms that
can do the job in real time. [10, 11] In contrast to earphone
binaural, no head tracking is required. In contrast to most earlier crosstalk
elimination methods, no HRTF filters are required. One is free to move
one's head without prejudice just as in a concert hall. One can also
get up and walk about within the circle of surround speakers (hopefully
horizontal line sources or panels) but still feel that one is in a hall
with a stage up front.
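To illustrate how crosstalk cancellation can work without HRTF filters, here is a minimal sketch of a generic recursive canceller: each output channel is its input minus a delayed, attenuated copy of the opposite output channel. This is an assumed textbook-style structure, not necessarily the software cited in [10, 11], and the delay and gain values are hypothetical placeholders that would in practice depend on speaker spacing and listening distance.

```python
def cancel_crosstalk(left, right, delay=3, gain=0.7):
    """Recursive delay-and-attenuate crosstalk canceller (illustrative).

    left and right must be sample lists of equal length. delay is in
    samples and gain is a linear attenuation; both are assumed values.
    """
    n = len(left)
    out_l = [0.0] * n
    out_r = [0.0] * n
    for i in range(n):
        # Feedback from the opposite output, delayed by `delay` samples.
        fb_l = out_r[i - delay] if i >= delay else 0.0
        fb_r = out_l[i - delay] if i >= delay else 0.0
        out_l[i] = left[i] - gain * fb_l
        out_r[i] = right[i] - gain * fb_r
    return out_l, out_r


# A single click in the left channel only.
click_l = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
click_r = [0.0] * 7
out_l, out_r = cancel_crosstalk(click_l, click_r, delay=2, gain=0.5)
```

In this toy run the left click produces an inverted, attenuated pulse in the right output two samples later (meant to cancel the left speaker's leakage at the right ear), which in turn is cancelled by a smaller pulse in the left output, and so on with decaying amplitude.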
The ideal Ambiopole uses speakers that are line or point
sources, that don't spray sound to the floor, rear wall or ceiling causing
bogus early reflections, and that, if two- or three-way, are time coherent
so as to make the crosstalk cancellation more accurate. With the speakers
so close together the sweet area is larger than the critics of crosstalk
cancellation usually suggest and is comparable to the size of the sweet
spot in a high-end stereo system. In experiments I have done, horizontally
omnidirectional Ambiopoles enlarge the sweet area enough to allow two
people to sit side by side, and one can move along the center line eight
feet or so without losing the stage, which is seldom possible using the
stereo triangle or LCR arrangements. I have also successfully used two-meter
tall, one-meter wide full range electrostatic panels that are slightly
concave. These focus sound at the listening area and behave like collimated
sound sources. It is likely that shaped NXT panels may also prove quite
practical in this application.
The Ambiophonics Institute also strongly recommends room
treatment and room/speaker correction. DSP based room correctors are
now widely available and can correct most speaker responses and eliminate
the worst of the bass room modes. At the high frequencies, absorbent
room treatment is useful to avoid erroneous early reflections of direct
sound. However, the presence of four (or hopefully more) convolved surround
ambience speakers mostly swamps the reverberation time (RT) of the room. The small room essentially
re-reverberates the hall tails, adding a few tenths of a second to the
convolved reverberation time, which is difficult to detect. However,
a similar case cannot be made for spurious early room reflections and
so room and recording studio treatment remains highly desirable.
Software add-ons to the Ambiopole software can also be
used to compensate for the microphone technique used to make the recording,
such as ORTF, spaced omnis, etc. [11]
Once you have heard what an Ambiopole, combined with room correction
and Ambiovolver surround ambience, can do for ordinary LPs, CDs, SACDs,
or DVDs, it is hard to be satisfied with stereo, 5.1, or 7.1 sound reproduction.
[11]
The Ambiophone
If you know that the hall ambience for the surround speakers
is going to be convolved from a real hall impulse response then it is
clear that one only needs to record the direct sound from the stage.
We also know that binaural technology obviates the need for a center
speaker and, in any case, the use of an Ambiopole makes recording such
a signal futile.
Remember the basic precept for "you are there" realism.
You want to deliver to the ears of the home listener the same sound
that he would have heard had he been at the position of the recording
microphone during the performance. In the case of a symphony orchestra
being recorded in a real concert hall there is no reason not to put
the recording microphone at the best seat in the hall, say fifth row
center. But what kind of main two-channel microphone are we talking
about?
First we do not want this microphone to pick up any sound
from the sides, rear or ceiling of the hall because the bulk of the
hall ambience will be recovered by convolution and we do not want any
off stage early reflections or reverberation tails coming from the front
speakers during reproduction. So let us assume that the microphone sitting
in the fifth row is reasonably baffled to the rear, sides and overhead
with high frequency sound absorbing material. If the microphone pair
is baffled this way, then we can use uncolored, high quality omnidirectional
microphones to good effect. We can also ignore such parameters as the
recording angle and indirect ambient pickup in deciding on placement.

We observe that in the case of central stage sound there
is no erroneous head shadow involved since the sound sources, the microphones
and the speakers are front, center and in line so that very little sound
is colored by having to go around the head to reach a pinna.
However, when a sound originates from the side of the
stage there will be no head shadow function (or interaural level differences,
in the case of omnis), and the rule is that there must be at least one
and only one. The answer is to mount the two omnidirectional microphones
on the surface of a head shaped ball so that as sound from the sides
of the stage impinges on the microphone position a head shadow will
be produced. (Fig. 3) On reproduction the sound
will still be coming directly from the front so there will not be a
second head shadow introduced. Such a microphone already exists and
was conceived by Günther Theile [15].
It is called the Theile Sphere or the Schoeps KFM-6. There is also no
reason why an Ambiophone or KFM-6 using cardioids would not also work.
Although their use could exaggerate the head shadow and thus distort
the far sides of the stage image this would be an area where recording
engineers could continue to express their artistic sensibilities.
While a microphone shaped less like a sphere and more
like an average head without outer ears might be preferable, head shadowing
is not as critical as pinna function. The reason seems to be that sound
passes around the head over the top, around the back, under the chin,
past the nose, and in many more different ways as the head is moved.
[4] Thus it is largely
the delay and attenuation that is significant in head shadowing, not
a particular individual's pattern of peaks and nulls as in the case of
the pinna. Remember, realism not absolutism.
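To give a feel for the size of the delay component of head shadowing on a sphere microphone, the sketch below uses the classic rigid-sphere (Woodworth) approximation for interaural time difference. This formula is not taken from the present article, and the head radius and speed of sound are assumed typical values. It covers delay only, not the attenuation part of the shadow.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference in seconds for a far source.

    Rigid-sphere (Woodworth) approximation, intended for azimuths from
    0 (straight ahead) to 90 degrees (fully to one side). The default
    radius and sound speed are assumed values, not measurements.
    """
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))


# Delay for a source straight ahead and for one fully to the side.
itd_center = woodworth_itd(0.0)
itd_side = woodworth_itd(90.0)
```

With these assumed values the delay is zero for a central source and roughly two thirds of a millisecond for a source fully to one side.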
Sounds that come from the extreme sides of the stage will, during
reproduction, reach the pinna from the wrong angle. Thus this microphone
and speaker technique cannot produce a stage width of a full 180 degrees
for live recordings. But up to 150 degrees or so, the naturalness of
the stage and the ease of localization are quite apparent. One can observe
that at the angles around 70 degrees to the side, the ear canals are
not much shadowed by the outer ear and the pinna responses are smoother.
[8, 9]
The pinnae seem to be meant to be somewhat more sensitive to sound from
the front and rear middle so, if there is going to be an error in the
system, it is better to have it at the front-side extremes, as in Ambiophonics,
rather than in the median plane, as in stereo.
The theoretically perfect Ambiophone is thus a baffled,
two-channel, pinnaless dummy head microphone placed at the best seat
in the house. With multichannel DVD-A or SACD it would be theoretically,
if not legally, possible to record with Ambiophones placed at, say, the
fifth, tenth, and twentieth rows so that the home listener can have
a choice of seats in the hall. (Figure 3) Ideally
the impulse responses from these same locations would be available to
recreate the precise ambience.
I appreciate that stereo recording engineers are hopelessly
addicted to spot microphones. However, in Ambiophonics it is difficult
to mix spots into the two main channels, and still maintain the same
fifth row center perspective and include the required head shadow. In
one of the Ambiophonic recordings produced so far, the use of a spot
microphone was painfully obvious. In the case of studio recordings
of relatively small pop ensembles the use of spots would make less difference.
As discussed below, the use of Ambiophonic studio monitoring systems
would likely reduce musicians' demand for spot microphones
considerably.
An Ambiophonic spot microphone method has been proposed
by Studer and Prof. Angelo Farina. One measures the impulse response
of the stage at each point where a directional spot microphone is to
be placed. This is done by launching a test signal from the stage to
the Ambiophone and measuring the two-eared impulse response. After the
recording is made the spot microphone signal can be convolved with each
impulse response and added in to the appropriate left and right main
Ambiophone channels.
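The spot microphone method just described can be sketched in a few lines: the spot signal is convolved with the left-ear and right-ear impulse responses measured from its stage position, and the results are added to the main Ambiophone channels. Function names, the level parameter, and the toy data are hypothetical and serve only to illustrate the steps.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def add_spot(main_left, main_right, spot, ir_left, ir_right, level=1.0):
    """Mix a spot microphone into the main pair via two-eared convolution."""
    wet_l = convolve(spot, ir_left)
    wet_r = convolve(spot, ir_right)
    n = max(len(main_left), len(main_right), len(wet_l), len(wet_r))

    def mix(main, wet):
        # Treat samples beyond the end of either list as silence.
        return [
            (main[i] if i < len(main) else 0.0)
            + level * (wet[i] if i < len(wet) else 0.0)
            for i in range(n)
        ]

    return mix(main_left, wet_l), mix(main_right, wet_r)


# Hypothetical spot at stage left: it reaches the left channel earlier
# and louder than the right channel.
new_left, new_right = add_spot(
    main_left=[0.0] * 4,
    main_right=[0.0] * 4,
    spot=[1.0, 0.0],
    ir_left=[0.0, 0.5],
    ir_right=[0.0, 0.0, 0.25],
)
```

Because the measured two-eared response already contains the delay and head shadow from that stage position, the added spot signal keeps the perspective of the main microphone rather than sounding pasted in.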
I should observe, however, that for concert hall realism
this added complexity should never be needed. If the conductor thinks
an instrument is not prominent enough during a concert or recording
session he/she has presumably done something about it by moving the
instrument forward or using risers. One could also elevate the Ambiophone
to get a better view of the stage than most of the audience has. In
general, trying to improve on concert hall practice is likely to please
only some of the listeners some of the time.
As is the case in Ambisonic recordings made with a Soundfield
microphone placed at the best seat in the house, an orchestral concert
recorded by Robin Miller of Filmaker Inc. from a 10th row seat in the
hall (well beyond the critical radius) using an Ambiophone has a normal
stage perspective. Both Ambisonic and Ambiophonic recordings demonstrate
that close-up main and spot microphones, which always give a conductor's
view, are not necessary, although they may be preferred in some situations.
Monitoring Ambiophonic Setups
Even if you only want to monitor stereo or 5.1 recording
and processing setups, an Ambiophonic layout can be advantageous. The
most efficacious Ambiopole arrangement is the use of a mechanical [14]
version of the stereo dipole. (Fig. 4) This is simply
two very small high-quality monitors, such as satellite speakers, placed
against either side of a three-foot-square, one-inch-thick wooden panel.
When you listen at the far edge of this panel to the front stage sound
it is very easy to hear what the perspective is, what the stage width
seems to be and any other anomalies such as echoes etc. The clarity
of such a monitoring arrangement makes it easier for musicians to judge
the quality of a trial playback. In the limited experience we have had
so far with this, the performers have been quite pleased and asked for
fewer changes than is normally the case in such situations.

Surround speakers should always be on and used when monitoring
the live microphone setup or a trial playback. This is a good practice
whether the recording is stereo, 5.1 or Ambiophonic. A simple four- or
six-channel hall surround convolver, driving small speakers in the control
room, makes it much easier to judge the quality of the sound being captured
and facilitates getting a stamp of approval for both musical
and engineering efforts. (Fig. 4)
I believe that once recording engineers become familiar with
the use of Ambiovolvers, they will find it much more realistic,
trouble free, and cost effective to derive the surround signals for
standard 5.1 or 6.0 recordings from the impulse response of the hall
than to site microphones and hope to capture hall sound during the live
session. Once the classical music public becomes used to programming
their own halls at home, two-channel live concert and opera recording
engineers will have much less to fret about.