Ambiophonics combines four technologies to produce realistic
sound fields and actually does it optimally via two-channel recording media.
The technologies are convolution for hall ambience, room/speaker
treatment/correction, front loudspeaker crosstalk and pinna angle error
elimination, and an optional superior recording microphone design and
placement. The basic tenet of Ambiophonics is to recreate at the listening
position an exact replica of the original concert hall sound field.
Ambiophonics does this by transporting the sound sources and the stage and
hall ambience to the listening room. In other words, Ambiophonics delivers an
externalized binaural effect, using, as in the binaural case, just two
recorded channels but with two front stage reproducing loudspeakers and eight
or so ambience loudspeakers in place of earphones. Ambiophonics generates
stage image widths up to 140° with an accuracy and realism that far exceeds
that of any other 2 channel or multi-channel recording/reproducing scheme.
While there are Ambiophonic ways to get a direct sound stage extending to 180
degrees, I, for one, have never experienced such a wide angled stage at a live
concert and so this aspect is not considered here.
We will now discuss how to reproduce the front stage of a two
channel recording without exposing our ears to comb filtering, phantom imaging
or major errors in the angle of sound incidence on the pinna and how best to
make recordings that take advantage of Ambiophonic binaural technology. At
this point you may want to review the material in the preface on the
psychoacoustic deficiencies inherent in the stereo triangle.
Making Good on the Promise of Binaural Technology
Since we have only two ears, it seems reasonable that only two
signals should need to be recorded. Indeed it was Blumlein's original idea
that he could externalize the earphone binaural effect using spaced
loudspeakers and some novel microphone arrangements. But once you give up
earphones for stereo loudspeakers, the interaural crosstalk and the arbitrary
speaker angle destroy the almost perfect, but internalized (within the skull),
binaural frontal stage image. And because all the hall ambience now comes
entirely from the front, it sounds unnatural. Binaural theory
says that if you sit in the concert hall with small microphones in your ear
canals, record the concert, and then later play it back with in-the-ear-canal
earphones you will experience an almost perfect "you are there"
recreation. The only flaw in this method would be that when you moved your
head, while listening or recording, the reproduced stage would rotate
unrealistically. But let us consider, briefly, why this recording method can
otherwise produce an awesome reality.
First of all, the sound from the stage and the hall during
such a personal binaural recording reaches your ear canals (and the embedded
microphones) after being filtered by your pinnae and your head shape. Since the
playback earphones we are using are an in-the-ear-canal type, the sound only
passes through the pinnae or around the head once. Also, the pinnae used to make
the recording are your own, not those on some dummy head carved in wood or
plastic. The two channels are kept separate throughout and the left ear
playback earphone signal never leaks into the right ear or vice-versa. Thus we
can state one of the basic rules of realistic binaural recording technology.
In any binaural recording or reproduction chain there should be one and only
one pinna function and it must be your own. There must also be one and only
one head shadowing entity, but in this case whose head it is is not critical. That
the head shadowing function is not as individual as the pinna function can be
understood when one realizes that sound passes around the head over the top,
under the chin, around the back, and varies as the head is tilted or rotated.
Thus the brain is not overly sensitive to the exact shape of a particular head
or the exact frequency response of the head shadowing function, within reason.
So let us see how we can make use of this knowledge. Let us
assume that we have a two-channel recording made using a dummy head that has
no pinna. This dual microphone is sitting tenth row center. Its signals are
then recorded and played back over two loudspeakers directly in front of the
home listener. Let us assume for the moment that these loudspeakers are like
laser beams so that their sound is aimed precisely at the proper ear. In this
case the listener hears what the corresponding microphone hears and the sound
impacts his own pinna with very little incident angle error for central stage
sources. For stage sources that are more to the side, the listener hears the
head response transfer function of the microphone head and for normal stage
widths this is quite realistic. But now the home listener can rotate his head
and the image is stable just as if he were in the concert hall. So this
technique is not only equal to but also superior to the earphone method
considered above. There is a pinna angle error for stage sources toward the
extreme left and right but fortunately these are the angles where direct sound
has a more or less clear shot at getting to the ear canal directly without
extreme pinna filtering and also where nature has compensated for the decrease
in pinna sensitivity by making the interaural head shadowing most pronounced
providing strong and natural horizontal plane localization. In practice, both
IMAX and Ambiophonics easily demonstrate that this binaural technology is
exceptionally realistic and does produce wide front stages that even allow the
cocktail party effect to be in evidence.
Ambiopoles
Now the question is how to make a pair of center front
speakers behave like sound lasers. There are two possibilities. One is to put
a physical wall or panel in front of the listener. This wall extends to within
a foot or so of the listener's head and keeps the left speaker from radiating
to the right ear and vice-versa. This technique works perfectly and if you are
an audiophile and want absolute fidelity without cables or extra processing
this is a very inexpensive way to go. You can try it first with a mattress on
end, if you want to experiment and have some fun.
While I appreciate that the use of a barrier will never find
universal acceptance, an understanding of how it works is necessary to an
appreciation of what a software version of such a crosstalk avoidance system
must accomplish. You can make a barrier out of sound absorbing panels with a
cutout at the end of it so that it is possible to sit comfortably at the end
of it. The thickness of the barrier is not critical, but should be about six
to eight inches wide so that when a listener is seated their right eye cannot
see the left speaker and vice versa. The wall extending back toward the space
between the speakers is, preferably, made with sound absorbing material. This
panel can be thought of as a collimator for most sound except the low bass. It
eliminates all stray rays from the right that might be heading left and those
from the left that might be heading right. A panel such as this is very
effective in damping higher frequency room reflections since it absorbs rays
coming from both room sides.
The use of an outdoor reflective barrier to eliminate
stereophonic crosstalk was described in 1986 by Timothy Bock and Don Keele Jr.
at the 81st Audio Engineering Society Convention. While Ambiophonics uses an
absorbent barrier, their results are still largely pertinent. They determined
that a listener can sit further back from the end of the barrier if the
barrier is wider, the speakers are closer together, and the listener is further
from the speakers. Stated as an equation:
L = X(H + T) / D
Where, in inches, L is the maximum distance a listener's head
can be from the barrier, X is the distance from the listening end of the
barrier to the position of the speakers, D is the distance between the centers
of the speakers, H is the distance between the ears, and T is the thickness of
the barrier. For a worst case scenario of a six-inch head, a six-inch thick
barrier, an eight-foot distance to the speakers, and a speaker separation of
three feet (too much) a listener could be as much as 32 inches, almost three
feet from the end of the barrier. Thus the use of a barrier does not in any
way make listening uncomfortable or claustrophobic.
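
The arithmetic in this worst-case example can be checked with a few lines of Python. This is only a sketch, reading the relation as L = X(H + T) / D, which is the form that reproduces the 32-inch figure quoted above. The function name and parameter names are illustrative and do not come from the Bock and Keele paper.

```python
def max_listener_distance(x, h, t, d):
    """Maximum head-to-barrier distance L = X(H + T) / D, all in inches.

    x: distance from the listening end of the barrier to the speakers
    h: distance between the ears
    t: barrier thickness
    d: distance between the speaker centers
    """
    if d <= 0:
        raise ValueError("speaker separation must be positive")
    return x * (h + t) / d


# Worst case from the text: six-inch head, six-inch barrier,
# eight feet to the speakers, speakers three feet apart.
worst_case = max_listener_distance(x=8 * 12, h=6, t=6, d=3 * 12)
print(f"{worst_case:.0f} inches")  # prints "32 inches"
```

Halving the speaker separation to 18 inches doubles the allowable distance in this formula, which is consistent with the advice to keep the speakers close together.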
Our own Ambiophonic barrier geometry allows one to be four
feet from the end of the barrier, but at the far end of this range one's head
must be more precisely centered. With a four-foot space, two in-line listeners
can enjoy the enhanced angular image separation at the same time and indeed
the front listener acts as a continuation of the barrier for the second
listener. If in doubt about the spacing, the eyeball method is very
conservative. As long as no part of the opposite loudspeaker is visible from
one eye, excellent separation is guaranteed. Sitting too close to the barrier
is not only unpleasant but results in a loss of high-frequency response if the
barrier is as wide as the head and absorptive.
However, the mainstream way is to use software and a computer
or digital signal processing system to eliminate the crosstalk. I call a pair
of speakers, designed for this purpose, that use the public domain software
that we have developed to do this, an Ambiopole.
First, although most speakers can be used to form an Ambiopole,
it is best if the speakers chosen are very directional and well matched. A
slightly concave electrostatic panel (called an Ambiostat) can actually focus
sound well enough that it almost behaves like the laser we have hypothesized.
Obviously, if the speakers are focused and time aligned, the software can do
its job much better. What the software does is generate slightly delayed
reversed polarity signals for the speakers to cancel the crosstalk
acoustically before it reaches the ear canal. The cancellation is an infinite
series process since each cancellation signal itself produces crosstalk at
the other ear, which must then be cancelled, and so on.
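
The infinite series can be made concrete with a deliberately simplified model. The sketch below assumes a symmetric setup in which crosstalk reaches the far ear as a copy of the speaker signal scaled by a single frequency-independent `gain` and delayed by a whole number of samples, `delay`. Real crosstalk is frequency dependent, so this illustrates the principle and is not the Ambiophonic Institute's software. All function names are hypothetical.

```python
def cancellation_series(gain, delay, length):
    """Speaker feeds for a unit impulse on the left input.

    Term k of the series sits at k * delay samples with amplitude
    (-gain) ** k. Even terms go to the left speaker; odd (inverted)
    terms go to the right speaker to cancel the previous term's crosstalk.
    """
    left = [0.0] * length
    right = [0.0] * length
    k = 0
    while k * delay < length:
        target = left if k % 2 == 0 else right
        target[k * delay] = (-gain) ** k
        k += 1
    return left, right


def at_ears(left_feed, right_feed, gain, delay):
    """Sum the direct and crosstalk paths at each ear (same simple model)."""
    n = len(left_feed)

    def crosstalk(feed):
        # Attenuated, delayed copy of the opposite speaker's signal.
        return [0.0] * delay + [gain * v for v in feed[: n - delay]]

    left_ear = [a + b for a, b in zip(left_feed, crosstalk(right_feed))]
    right_ear = [a + b for a, b in zip(right_feed, crosstalk(left_feed))]
    return left_ear, right_ear


# Assumed values for illustration only: crosstalk gain 0.8, delay 3 samples.
left_feed, right_feed = cancellation_series(gain=0.8, delay=3, length=64)
left_ear, right_ear = at_ears(left_feed, right_feed, gain=0.8, delay=3)
print(max(abs(v) for v in right_ear))  # residual crosstalk at the right ear
```

Under these assumptions the left ear receives only the original impulse and the right ear receives essentially nothing, because every term in the series cancels the crosstalk of the term before it.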
If the Ambiopoles were widely spaced, then the crosstalk would
have to go around the head and the correction signals would be very difficult
to calculate since they would be affected by head position and pinna shape.
Thus the front speaker pair should be as close together as possible with
ten-degrees or less between them so that both the main front speakers emit
directly to their onside ears.
Another way of looking at this process is to consider the
mechanical barrier again. The barrier works perfectly every time. If you put a
microphone at the ear position at the end of the barrier and measure the
crossed impulse response of the system and then convolve the main front
Ambiopole signals with this response you can create software that is useable
with that speaker type and speaker angle after the barrier is removed. Just as
it is obvious that a barrier will work better with close together speakers,
since speaker proximity makes it easier for the barrier to shadow the
appropriate ear, so crosstalk software works better if the speakers are closer
together.
Ambiopoles do have a sweet spot limitation although in my
experience the sweet spot is larger than that of most well focused stereo or
5.1 systems. But if the Ambiopoles are constructed using omni-directional
speakers then it is possible to enlarge the sweet spot enough to accommodate
two or even three listeners. Unfortunately there are few true omni-directional
speakers available, and so it has been difficult so far to perfect this
application and demonstrate that this variation works to audiophile standards.
Of course, using omni-directional speakers requires that the room be really
well sound treated to avoid the extra wall reflections that are generated by
such a speaker.
Sometime during 2001 the Ambiophonic Institute expects to have
crosstalk cancellation software available for downloading at no charge from
its web site. Eventually it is hoped that manufacturers will use this or
similar software in their products. It would also be possible to provide an
alternate track on a DVD-A to allow crosstalk free playback of music
recordings via an Ambiopole. As discussed below, Ambiopole software can be
tweaked to compensate for the various main and spot microphone or panning
techniques employed to make a particular stereo or three channel recording if
the simpler, optimum, Ambiophone has not been employed.
The Stereo Dipole, AES Preprint 4463
Among the pioneers in the field of crosstalk cancellation are
Ole Kirkeby and Philip A. Nelson of the University of Southampton and Hareo
Hamada of Tokyo Denki University, who developed an electronic version of the
panel in 1996. They have shown that the ideal speaker spacing for a crosstalk
cancellation system, be it mechanical or electronic, is about 10 degrees. They
refer to two speakers placed so close together as a "stereo dipole".
The electronic filters required to cancel crosstalk in this narrow speaker
arrangement are somewhat easier to design and are more effective since at the
narrower angle there is little diffraction around the head for the correction
signals and so HRTF correction is not necessary. Pinna angle distortion of the
correction signals is also not a major factor and so the crosstalk
cancellation can be allowed to operate over the full upper frequency range
without restricting the size of the listening area or generating the audible
phasiness effects that afflict electronic crosstalk cancellation schemes for
widely spaced loudspeakers.
They also show that, at narrow speaker angles, the path length
difference from a speaker to each ear is so small that the infinite series of
inverted crosstalk cancellation impulses is generated at a rate of over 10
kHz. This allows for very fine definition of the crosstalk cancellation
signals at higher frequencies and makes this process quite accurate using the
DSP power presently available.
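
A rough geometric estimate shows why the repetition rate is so high at narrow angles. The sketch below assumes a far-field, straight-line model with no diffraction around the head, a six-inch ear spacing, and a speed of sound of 343 m/s. These values and the function name are my assumptions for illustration and are not taken from the preprint.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at about 20 degrees C)
EAR_SPACING = 0.1524    # m, assumed six-inch spacing between the ears


def crosstalk_repetition_rate(span_degrees):
    """Approximate rate (Hz) at which successive cancellation impulses occur.

    span_degrees is the total angle between the two speakers as seen by
    the listener. The extra path to the far ear is approximated as
    ear spacing * sin(half the span).
    """
    if span_degrees <= 0:
        raise ValueError("speaker span must be positive")
    half_angle = math.radians(span_degrees / 2)
    path_difference = EAR_SPACING * math.sin(half_angle)
    delay_seconds = path_difference / SPEED_OF_SOUND
    return 1.0 / delay_seconds


for span in (10, 60):
    print(f"{span:>2} degree span: {crosstalk_repetition_rate(span):8.0f} Hz")
```

With these assumed numbers, a 10-degree pair gives a rate well above 10 kHz, while a conventional 60-degree stereo pair gives a rate several times lower.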
University of Parma Ambiopole Software
The Ambiophonic Institute in conjunction with the University
of Parma has developed an advanced version of the stereo dipole called the
Ambiopole. In their implementation, the crosstalk cancellation operation is
performed through the convolution of the two left and right front input
signals of the recording with a set of 4 inverse filters. Two of these filters
can be selected by the listener based on knowledge of the microphone employed
to make the recording. These inverse filters cancel out a great part of the
microphone-dependent spatial effects. The goal is to make recordings made
with other than the ideal Ambiophone described below sound as if they were so
recorded. In principle, any kind of two or three channel microphoning system
(such as ORTF, M/S, spaced Omnis, Soundfield, Dummy Head, Sphere, etc.) can be
compensated for including even a "virtual" one, as happens when the
stereo mix is obtained by the panning of monophonic sources. Thus this new
software is designed so that almost all two-channel recordings can benefit
from being reproduced Ambiophonically. In practice, an audiophile listener can
select from a menu of filters the one that makes a particular recording sound
most realistic.
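
The idea of convolving two input signals with a set of four filters can be sketched as a 2x2 filter matrix: each speaker feed is the sum of one filtered copy of each input. The code below is a minimal illustration in plain Python. The filter values are placeholders of my own, not the Parma inverse filters, and the names `h_ll`, `h_lr`, `h_rl`, and `h_rr` are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def mix(a, b):
    """Sample-wise sum of two lists, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def apply_filter_matrix(left_in, right_in, h_ll, h_lr, h_rl, h_rr):
    """Feed two inputs through four filters.

    h_xy is the filter from input x to speaker y (illustrative naming).
    """
    left_out = mix(convolve(left_in, h_ll), convolve(right_in, h_rl))
    right_out = mix(convolve(left_in, h_lr), convolve(right_in, h_rr))
    return left_out, right_out


# Placeholder filters: unity direct paths, inverted and delayed cross paths.
direct = [1.0]
cross = [0.0, 0.0, -0.5]
left_spk, right_spk = apply_filter_matrix(
    [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], direct, cross, cross, direct
)
print(left_spk)
print(right_spk)
```

Choosing a different recording microphone from the menu would, in this picture, amount to swapping in a different set of filter coefficients while the processing structure stays the same.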
The University of Parma Ambiopole is realized by means of a
single DSP processor programmed with mathematical entities called "warped
finite impulse response" functions or filters. The warping is essentially
a mathematical weighting algorithm that makes it possible to compute the
required crosstalk cancellation signals in real time, i.e., while the music is
playing, without falling behind or making errors. It is hoped that those
reading this book in the near future will be able to purchase Ambiophonic
system processors that include this software as well as the software for hall
convolution and room correction. Until then, Ambiophonics will remain a
do-it-yourself technology for audiophile computer experts only.
Bass Response of Ambiopoles
Since Ambiophonics is a binaural based system, it does not
provide the Blumlein loudspeaker crosstalk signal that furnishes the lowest
frequency phase shift localization cues for those few recordings made with a
coincident microphone arrangement such as the Soundfield mic or crossed figure
eight mics in the M/S (mid-side) Blumlein configuration. (See Appendix A
for a detailed analysis of the Blumlein patent and technology.) However, it
should be understood that at very low bass frequencies, the barrier (depending
on its size and absorbency) and its electronic cousins lose their
effectiveness allowing increasing crosstalk as the frequency declines and
therefore amplifying LF phase cues for coincident microphone recordings. This
is basically a non-issue. Remember that the ear's ability to localize bass
frequencies at 80 Hz and below is virtually non-existent. The pinna certainly
has no capability in this frequency range and the head is too small to
attenuate signals with wavelengths measured in tens of feet. Thus the only
localization method available to the brain at very low frequencies is the few
degrees of phase shift between the ears. There is no evidence that the brain
can detect such small phase shifts and thus worrying about crosstalk
elimination at very low frequencies to improve front stage imaging is not
productive.
Indeed, impulse response measurements on the mechanical
crosstalk barrier show that crosstalk cancellation begins to decline starting
at 400 Hz. To be on the safe side the software can go somewhat lower in
frequency before rolling off, but at very low frequencies the power required
to produce crosstalk cancellation becomes excessive, and cancellation there
is not necessary.
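
The scale of the low-frequency quantities discussed in this section can be estimated with a simple calculation. The sketch below assumes a six-inch ear spacing, a speed of sound of 343 m/s, and a straight-line path model that ignores diffraction around the head. The function names and the 30-degree source angle are illustrative choices of mine.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING = 0.1524    # m, assumed six-inch spacing between the ears
METERS_PER_FOOT = 0.3048


def wavelength_feet(frequency_hz):
    """Wavelength in feet of a tone at the given frequency."""
    return SPEED_OF_SOUND / frequency_hz / METERS_PER_FOOT


def interaural_phase_degrees(frequency_hz, source_angle_degrees):
    """Phase difference between the ears for a source off center.

    The extra path to the far ear is approximated as
    ear spacing * sin(source angle).
    """
    extra_path = EAR_SPACING * math.sin(math.radians(source_angle_degrees))
    return 360.0 * frequency_hz * extra_path / SPEED_OF_SOUND


print(f"wavelength at 80 Hz: {wavelength_feet(80):.1f} ft")
print(f"interaural phase, 80 Hz, source 30 degrees off center: "
      f"{interaural_phase_degrees(80, 30):.1f} degrees")
```

Under these assumptions an 80 Hz tone has a wavelength of roughly 14 feet, and a source 30 degrees off center produces an interaural phase difference of only a few degrees, in line with the argument above.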
The Ambiophone
Once we know that playback will be Ambiophonic, the question
arises as to whether there is an ideal recording method that can take
advantage of the fact that surround ambience will be derived via convolution,
that the Ambiopole will eliminate crosstalk and phantom imaging, and that the
listening room is sound treated. But I still want to emphasize that, although
Ambiophone microphone arrangements can make the Ambiophonic approach to
realism even more effective, Ambiophonics works quite well with most of the
microphone setups used in classical music or audiophile caliber jazz
recordings and, as indicated above, there are software ways to correct existing
recordings if one is really fanatical.
One can heighten the accuracy, if not gild the lily of
realism, of an Ambiophonic reproduction system by taking advantage, in the
microphone arrangement, of the knowledge that in playback, the rear/side half
of the hall ambience is convolved, that there is no crosstalk, that listening
room reflections are minimized and that the front loudspeakers are relatively
close together. Earlier we considered the binaural model where microphones are
inserted in the ear canal of an ideally situated listener. But now the
situation is different. We are going to reproduce the hall ambience by
convolution so we do not want our binaural listener to pick up any hall
ambience from the rear, the extreme sides, or the ceiling. So let us put sound
absorbing material just behind his head and above him as well so that he has a
sonic view of only the stage in front of him.
Now we know that upon reproduction the Ambiopole speaker sound
will pass by his pinna on the way to the eardrum. Thus we do not want any
pinna at the recording site. The human listener is therefore excused from the
recording site and we are left with a pair of baffled, head-spaced omni or
cardioid microphones sitting at the best seat in the house. But the rule
stated earlier said there must be one and only one head shadow in the
recording/reproduction chain and so, since the home listener is directly in
front of the Ambiopole, it is up to the Ambiophone to provide a head shadow. So
let us put a head shaped oval between the two microphones at this best seat in
the house. So our Ambiophone boils down to an oval shaped two capsule assembly
baffled to the rear and above comfortably ensconced at the best seat in the
house or studio.
Nothing New Under the Sun
After completing the above derivation of the ideal Ambiophone,
I began to search for recordings that played back realistically
Ambiophonically to see if they had anything consistent or unusual about them.
Not being a recording engineer or a microphone aficionado, it took me a while
to notice that many of the best CDs in my collection were made with something
called a Schoeps KFM-6. A picture of this microphone in a PGM Recordings
promotional flyer showed a head sized but spherical ball with two
omnidirectional microphones one recessed on each side of the ball where ear
canals would be if we had an exactly round head. The PGM flyer also included a
reference to a paper by Günther Theile describing the microphone,
entitled "On the Naturalness of Two-Channel Stereo Sound," J. Audio Eng. Soc.,
Vol. 39, No. 10, October 1991.
Although Theile would probably object to my characterization
of his microphone, his design is essentially a simplified dummy head without
external ears. He states, "It is found that simulation of depth and space are
lacking when coincident microphone and panpot techniques are applied. To
obtain optimum simulation of spatial perspective it is important for two
loudspeaker signals to have interaural correlation that is as natural as
possible ... Music recordings confirm that the sphere microphone combines
favorable imaging characteristics with regard to spatial perspective accuracy
of localization and sound color ..." Later he states, "The coincident microphone
signal, which does not provide any head-specific interaural signal
differences, fails not only in generating a head-referred presentation of the
authentic spatial impression and depth, but also in generating a
loudspeaker-referred simulation of the spatial impression and depth ... it is
important that, as far as possible, the two loudspeaker signals contain
natural interaural attributes rather than the resultant listener's ear signals
in the playback room."
What Theile did not appreciate is that, for signals coming
from the side, the sphere acts as a sort of filter for the shorter wavelengths
just as the head does. When this side sound comes from side stereo speakers
the listener's head again acts as a filter resulting in HRTF squared. The
solution, of course, is to use the mechanical or software Ambiopole barrier
and listen to the Theile sphere without the second head response function.
Theile also "generates artificial reflections and reverberation from
spot-microphone signals." He uses the word artificial in the sense that
the spot microphone signals will be coming from the front stereo loudspeakers
instead of from the rear, the sides, or overhead. While Theile's results rest
as much on empirical subjective opinion as they do on psychoacoustic precepts,
they certainly are consistent with the premises of Ambiophonics both in
recording and reproduction. Making new recordings using the Schoeps KFM-6
version of the Theile Sphere and evaluating existing recordings made with this
microphone show that the theory is correct since such recordings yield
exceptionally realistic front stages with normal concert-hall perspectives and
proscenium ambience.
Realistic Reproduction of Depth
It is axiomatic that a realistic music reproduction system
should render depth as accurately as possible. Fortunately, front stage
distance cues are easier to record and/or recreate realistically than most
other parameters of the concert-hall sound field. Assuming that the recording
microphones are placed at a reasonable distance from the front of the stage,
then the high frequency roll-off due to distance and the general attenuation
of sound with distance remain viable distance cues in the recording. Depth of
discrete stage sound sources is, however, more strongly evidenced in
concert-halls by the amplitude and delay of the early reflections and the ear
finds it easier to sense this depth if there is a diversity of such
reflections. In Ambiophonics, convolved early reflections from the surround
speakers make the stage as a whole seem more interesting, but it is only the
recorded early reflections coming from the front speakers that provide the
reflections that allow depth differentiation between individual instruments.
This is why anechoic recordings sound so flat when played back
stereophonically or even Ambiophonically, despite the presence of an added
ambient field. In ordinary stereo, depth perception will suffer if early side
and rear hall reflections wrap around to the front speakers or in the anechoic
case, are completely missing. Since it is easy to make Ambiophonic recordings
that include just proscenium ambience, why not do so and save on convolver
processing power and preserve, undistorted, the depth perception cues?
There remains the issue of perspective, however. When making a
live performance recording of an opera or a symphony orchestra the recording
microphones are likely to be far enough away from the sound sources to produce
an image at home that is not so close as to be claustrophobic. There are many
recordings, however, that produce a sense of being at or just behind the
conductor's podium. This effect does not necessarily impact realism but you
must like to sit in the front row to be comfortable with this perspective.
Turning down the volume and adding ambience can compensate for this, but with
a loss in realism. This problem becomes more serious in the case of solo piano
recordings or small jazz combos. For example, if a microphone pair is placed
three feet from an eight-foot piano, then that piano is going to be an
overwhelming close-up presence in the listening room and a
"They Are Here" instead of a "You Are There" effect is
unavoidable. This will be very realistic especially with the Ambiopole, but
adding real hall ambience doesn't help much since the direct sound is so
overwhelming. The major problem with this type of recording is that you have
to like having these people so close in a small home listening room. You may
notice that demonstrators of high resolution playback systems in show rooms or
at shows overwhelmingly use close-mic'ed recordings of small ensembles, solo
guitar, single vocalists, etc. to demonstrate the lifelike qualities of their
products and that these demonstrations are mostly of the "They Are
Here" variety.
These depth and perspective problems are easily solved by
simply placing an Ambiophone at a seat that has a reasonable view of the
performers.