On December 14th, 1931, the EMI sound engineer Alan Dower
Blumlein filed British Patent Specification 394325, entitled
"Improvements in and relating to Sound-transmission, Sound-recording and
Sound-reproducing systems." In the arcane language common to most
patent applications, Blumlein's invention "consists in a system of sound
transmission wherein the sound is picked up by a plurality of microphone
elements and reproduced by a plurality of loud speakers, comprising two or more
directionally sensitive microphones and/or an arrangement of elements in the
transmission circuit or circuits whereby the relative loudness of the loud
speakers is made dependent upon the direction from which the sounds arrive at
the microphones."
Blumlein did not use the word "stereophonic" anywhere
in his patent, but he did use the word "binaural." It was well known
during the fifty years before Blumlein that two microphones, spaced the width
of the human head and feeding a remote pair of headphones, produced very realistic
sound images with solid, stable, directional attributes. The problem was that
the sound sources all seemed to lie within one's head or, in psychoacoustic
parlance, to be internalized. What Blumlein sought to do was to externalize this
binaural effect using loudspeakers. Externalizing the binaural effect over a
full 360-degree sphere is still the Holy Grail of acoustics, particularly among
those designing virtual reality video systems that also require an audio
counterpart. The latest IMAX large screen 3D movie system uses earphones placed
about an inch out in front of the ears, as well as speakers behind the
screen, behind the audience, and above and below the screen, to produce a full (periphonic)
acoustic sphere. If home video watchers are prepared to wear earphones as well
as have loudspeakers in their home movie theaters, this is a very effective
technology, but one that is not necessary to realistically reproduce staged
musical events as opposed to movies.
Other attempts to externalize the binaural effect over a full
sphere, or just a circle, include Ambisonics, surround sound, and the work of the
plethora of computer companies generating the virtual reality sound fields for the
multimedia applications referred to above. Fortunately, our problem is, and
Blumlein's was, less complex since we need only consider a relatively small part
of this sphere and we can assume that all direct sound sources originate on a
single flat stage in front of us. In fact, Blumlein's first priority was to
provide a better front stage sound for movies shown in theaters.
Blumlein was officially awarded his patent, covering what we now call
stereophonic sound reproduction, on June 14th, 1933. Thus the basic
stereo listening triangle is over 65 years old, and just as Einstein's theory of
relativity eventually refined Newtonian physics, it may be time to reexamine and
modify the bedrock concepts upon which Blumlein imaging is based. And what
better place to start than with Blumlein himself? Suppose one looked through
Newton's treatises and found cryptic comments by Newton hinting that he knew his
laws of matter, acceleration and gravity were not fully accurate at very high
velocities and masses. We would then be justified in concluding that Newton had
some insight into relativity but chose not to confound his contemporaries who
had enough to deal with in distinguishing between mass and weight and who in any
case found his formulas were always accurate enough to do jobs like getting
rockets off the ground. Newton's laws still work very well today despite
relativity if you are not too fussy.
So it is with Blumlein. Blumlein's patent is salted with
innuendoes and hints of things to come. Blumlein knew that his
reproduction method using two widely spaced loudspeakers was flawed, but the
improvement in sound reproduction over mono was so apparent that there was no
need to point out in detail its theoretical imperfections, and in any case he
wanted his patent to be awarded and his invention used. However, he seemingly
felt compelled to indicate to his technical posterity that he really did know
precisely what was right and what was wrong with the stereophonic reproduction
method he was proposing. (On the recording side, he had fewer problems and
proposed the coincident stereo microphone and what we now call the Blumlein
shuffler, both concepts later elaborated on in Ambisonics.) Thus in a paragraph
discussing the difference between low frequency phase differences and high
frequency intensity differences in providing directional cues, he writes:
"It can be shown, however, that phase differences necessary at the ears for
low frequency directional sensation are not produced solely by phase differences
at two loudspeakers (both of which communicate with both ears) [parentheses
Blumlein's] but that intensity differences at the speakers are necessary to give
an effect of phase difference."
What Blumlein was doing here was indicating that an unavoidable
defect could be a virtue in one case. That is, he could not prevent both
loudspeakers from having equal access to both ears at low frequencies (or
a less predictable access at all higher frequencies), so he came up with
a recommended coincident microphone arrangement that counted on this low
frequency loudspeaker crosstalk to provide for localization in the relatively
narrow low frequency band where the ear can localize only on the basis of
interaural phase differences. Thus crosstalk became a necessary evil in the
coincident microphoning case. What Blumlein was really saying was that if your
microphones produce signals at low frequencies that don't have any phase
differences (as is the case with any coincident microphones), then the
loudspeaker crosstalk could save the day, but at a cost in higher-frequency,
intensity-based localization. Blumlein himself was aware of that cost but could not
fully appreciate it because of the limited frequency response of the equipment he
had to work with. The way the loudspeaker crosstalk helps in the low frequency
case is as follows. At low frequencies it can be assumed that any sound from one
speaker will produce the same sound pressure at both ears since the head is not
an effective barrier to long wavelength sounds. But the signal will be slightly
delayed in getting to the more remote ear. If there is now a second loudspeaker
emitting the same low frequency signal, then its pair of sound waves combines
with the first pair at each ear to form a new sound wave.
When two waveforms, that have the same shape, but differ in amplitude and also
have a fixed time delay between them, are added together, the result is a new
wave shifted in phase. At one ear the louder signal combines with the delayed
softer signal. At the other ear the softer signal combines with the delayed
louder signal. The result is identical amplitudes but different phase shifts
at the two ears, and thus an interaural phase difference is created
that is roughly proportional to the original intensity difference between the
microphones. Of course, if you use a more common, non-coincident microphone
technique, such as a head-spaced array, this crosstalk can cause localization
blurring. That Blumlein understood that this unavoidable crosstalk caused
imaging problems at higher frequencies is clear from some of the other quotes
below. He seems to have been preoccupied with this issue as he prepared his text. In
point of fact, we know today that this loudspeaker communication with both ears
makes it impossible for standard stereo or its surround sound relatives to
create a fully realistic and lifelike stage image. But wait. There is much more
to be gleaned from Blumlein.
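The low frequency summing argument above can be checked numerically. What
follows is only a minimal Python sketch, not anything taken from Blumlein's
patent. It assumes an idealized head that casts no shadow at low frequencies,
two in-phase loudspeaker signals that differ only in level, and a fixed extra
delay to the more remote ear. The function names and the 0.25 millisecond
delay (a rough guess for a speaker 30 degrees off center) are assumptions made
purely for illustration.

```python
import cmath
import math

# Assumed extra travel time to the far ear for a loudspeaker about
# 30 degrees off center (illustrative value, not from the patent).
CROSSTALK_DELAY_S = 0.00025


def ear_phasors(gain_left, gain_right, freq_hz, delay_s=CROSSTALK_DELAY_S):
    """Sum direct and crosstalk sound at each ear as complex phasors.

    Each ear hears its near speaker directly and the far speaker at the
    same level (no head shadow assumed) but delayed by delay_s.
    """
    lag = cmath.exp(-1j * 2 * math.pi * freq_hz * delay_s)
    left_ear = gain_left + gain_right * lag
    right_ear = gain_right + gain_left * lag
    return left_ear, right_ear


def interaural_phase_difference(gain_left, gain_right, freq_hz,
                                delay_s=CROSSTALK_DELAY_S):
    """Left-ear phase minus right-ear phase, in radians.

    A positive value means the left ear leads.
    """
    left_ear, right_ear = ear_phasors(gain_left, gain_right, freq_hz, delay_s)
    return cmath.phase(left_ear * right_ear.conjugate())


if __name__ == "__main__":
    # Same 200 Hz tone from both speakers, with three level differences.
    for g_left, g_right in [(1.0, 1.0), (1.0, 0.5), (1.0, 0.0)]:
        left_ear, right_ear = ear_phasors(g_left, g_right, 200.0)
        ipd = interaural_phase_difference(g_left, g_right, 200.0)
        print(f"gains {g_left:.1f}/{g_right:.1f}: "
              f"|left|={abs(left_ear):.3f} |right|={abs(right_ear):.3f} "
              f"phase difference={math.degrees(ipd):.2f} degrees")
```

Under these assumptions the two ears always receive equal amplitudes, equal
speaker levels give no phase difference (a centered image), and the phase
difference grows as the level difference grows. For small delays it is close
to the angular frequency times the delay times (gain_left - gain_right) /
(gain_left + gain_right), which is the sense in which an intensity difference
at the speakers turns into a phase difference at the ears.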
Blumlein's hints to his audiophile posterity continue with:
"the sense of direction of the apparent sound source will only be conveyed
to a listener for the full frequency range for positions lying between the
loudspeakers." Thus Blumlein certainly understood that the width of the
stage he could create with loudspeakers was limited by crosstalk to the space
between those loudspeakers, a serious defect, but one that was not crucial to
Blumlein since he was largely concerned with widely spaced loudspeakers in large
movie theaters or halls that had fairly narrow screens or stages in comparison
to the depth of the theater. In the context of a patent application, however,
this is not the sort of observation one would ordinarily include.
It is easy to understand why the maximum width of the
stereophonic sound image is limited to the angle the speakers subtend at the
listening position. Let us assume that a single sound source such as a trumpet
is located stage right at 80 degrees. Let us further assume that under these
circumstances the sound reaching the left microphone in a stereo recording setup
is negligible and therefore no audible sound comes from the left speaker during
playback. The trumpet sound blares forth from the right loudspeaker at normal
intensity. If the right speaker is at the usual 30-degree angle from the
centerline of the normal stereo playback triangle, then the trumpet will appear
to be sounding from that position instead of from 80 degrees. This is, of course,
just the everyday real-life situation in which we easily locate the source of any
discrete sound that reaches both ears without impediment; here that source is
the right loudspeaker itself.
Many of us have, however, heard recordings played on stereo systems
that do sometimes produce images that come from beyond the speakers, and some
audiophiles believe that if they could only get perfect recordings, speakers,
cables and electronics, the image would open out. Blumlein was also loath to
admit defeat on this point. He writes: "but if it is desired to convey the
impression that the sound source has moved to a position beyond the space
between the loudspeakers the modifying networks may be arranged to reverse the
phase of that loudspeaker remote from which the source is desired to appear, and
this will suffice to convey the desired impression for the low frequency
sound." (Hang on to that word "low.") This suggestion makes sense in a
particular movie scene where you could briefly reverse the phase of one speaker
to move dialog or a sound effect off screen, but we know that leaving one
speaker out-of-phase all the time does not work for music reproduction via the
stereo triangle. What Blumlein was suggesting is a primitive form of logic
steering, thus foreshadowing Dolby Pro Logic. But he has also explained why images
sometimes do appear beyond the position of the loudspeakers. Any inadvertent phase
reversal of a spot microphone in recording, an out-of-phase driver, a large
phase shift in the crossover network of a three or four way loudspeaker system,
or a reflection from the wall behind a dipole loudspeaker can convince even
experienced listeners that wider stages can be achieved, somehow, using normal
stereo technology. Unfortunately, logic steering, surround coding and even
multi-channel recording methods cannot achieve the binaural ideal that Blumlein
was striving for.
So far, Blumlein himself has told us that the stereophonic
reproduction method has two inherent flaws. There is a third problem that
Blumlein seems to have been aware of because of his use of the word
"low" in the last quote. This is the image position distortion caused
by higher frequency sounds that hit the pinnae from angles that do not
correspond to the actual angles of the recorded source. Thus, perhaps Blumlein
had trouble moving a birdcall off stage using his phase reversal trick. A
related issue is the question of recorded ambience, and here Blumlein appears to
be struggling with the problem of reproducing such recorded hall ambience from
the proper direction: "The reflected sound waves which arise during
recording will be reproduced with a directional sense and will sound more
natural than they would with a non-directional system. If difficulties arise in
reproduction, they may be overcome by employing a second pair of loudspeakers
differently spaced and having a different modifying network from the first
pair." While the vocabulary may be a bit different, this is a pretty good
description of surround sound or Ambisonics and is also the basic starting point
for the ambience and imaging system I have called Ambiophonics.