
Audio Engineering Society
Convention Paper

Transforming Ambiophonic + Ambisonic 3D Surround Sound to & from ITU 5.1/6.1
Robert E. (Robin) Miller III ©2003

Page 5 -- Other Conversions for 5 or 8 Media Channels?

The discussion above suggests "PanAmbio 5.1": a 2D hybrid of Ambiophonic and WXY-only Ambisonic signals, transformed to the five full-range channels of AC-3 and, when a decoder is provided, reconstituted to the original Ambio + Ambisonic WXY-only signals.  If the original recording included all four B-format signals, tilting would be possible prior to 5.1 mastering, but it would permanently alter the ambience space, because upon replay no reconstituted Z is available for "untilting."  The result is equivalent to sitting in the balcony, where one might naturally tilt one's head toward the sound source, although in this case without a visual cue.
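The irreversibility of tilting without Z can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it assumes that tilt is a plain rotation of the B-format X and Z components about the left-right (Y) axis. The function name `tilt_b_format` and the sign convention are hypothetical.

```python
import math

def tilt_b_format(w, x, y, z, tilt_deg):
    """Rotate one B-format sample about the left-right (Y) axis.

    Assumption: a positive tilt_deg raises the front of the sound field.
    W (omni) and Y (left-right) are unaffected by this rotation.
    """
    t = math.radians(tilt_deg)
    x_t = x * math.cos(t) - z * math.sin(t)
    z_t = x * math.sin(t) + z * math.cos(t)
    return (w, x_t, y, z_t)

# A source straight ahead on the horizontal plane (W uses a 1/sqrt(2) gain).
front = (1 / math.sqrt(2), 1.0, 0.0, 0.0)

# Tilt by 30 degrees before mastering.
tilted = tilt_b_format(*front, 30.0)

# With all four signals, the opposite rotation restores the original.
restored = tilt_b_format(*tilted, -30.0)

# With WXY only, Z is lost (treated as zero), so untilting cannot recover it.
w, x, y, _ = tilted
flattened = tilt_b_format(w, x, y, 0.0, -30.0)
```

In this sketch `restored` returns to X = 1, Z = 0 (within rounding), whereas `flattened` does not, which is the sense in which a tilt applied before WXY-only mastering is permanent.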

If 8 full-range channels are available (such as 8-channel Dolby E or 8 of the 48 channels specified in MPEG4/AAC), then 3D "Pan/PerAmbio 8.0" can be realized with channels to spare: three extra in the 2D form above, or two in 3D.  These two or three extra channels might be used for L,R or L,C,R spot mics, three second-order Ambisonic signals, second (third, fourth) language(s), additional effects speakers, etc.

Stability of center voices; the "Sweet Spot"

For 5.1/6.1 replay, superimposition of the Ambiophonic image with the Ambisonic contribution stabilizes important central voices by anchoring them with a hard Center channel derived from the B-format at 0° azimuth and tilted or not depending on the 2D transformation mode chosen.  Unlike phantom images in 2-speaker stereo or surround mixing that ignores the advantage of a hard center channel, in the PerAmbio 2D transformation central soloists do not toggle to the nearer speaker as one moves around the listening space, but remain a stable central image.

Reconstituted to PerAmbio 3D, the hybrid approach largely solves Ambiophonics' main disadvantage: by providing a large listening area, it removes the absolute need to sit on the median plane for enjoyable sound.  In a modest home theater, six music listeners or movie viewers can be accommodated with very plausible surround sound, although the middle two listeners on the median plane will benefit from the frontal localization accuracy of the Ambio pair, as shown in Fig. 8.

Ambisonic transformation in the horizontal plane uses a regular hexagon, creating virtual speakers at 0°, ±60°, ±120°, and 180°.
L & R are virtualized wider than the standard ITU angle of 60° in order to match the 120°+ reproduction angle of Ambiophonics, with its inherent frontal image accuracy of ±5° [4,7].
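As an illustration of the hexagonal layout, the sketch below derives six virtual speaker feeds from horizontal B-format (W, X, Y). It is a basic first-order decode in which each feed behaves like a virtual cardioid microphone aimed at its speaker. It is not the decoder used in this paper. The 1/sqrt(2) gain on W, the counterclockwise-positive azimuth convention, and the names `encode_wxy` and `decode_hexagon` are assumptions made for the example.

```python
import math

# Virtual speaker azimuths of the regular hexagon, in degrees.
# Assumption: 0 is straight ahead and positive angles are to the left.
HEX_AZIMUTHS_DEG = [0, 60, 120, 180, -120, -60]

def encode_wxy(azimuth_deg, gain=1.0):
    """Encode a horizontal plane-wave source into W, X, Y."""
    a = math.radians(azimuth_deg)
    return (gain / math.sqrt(2), gain * math.cos(a), gain * math.sin(a))

def decode_hexagon(w, x, y):
    """Return one feed per hexagon speaker (virtual cardioid decode)."""
    n = len(HEX_AZIMUTHS_DEG)
    feeds = []
    for az in HEX_AZIMUTHS_DEG:
        a = math.radians(az)
        feeds.append((math.sqrt(2) * w + x * math.cos(a) + y * math.sin(a)) / n)
    return feeds

# A source straight ahead drives the 0-degree speaker hardest and leaves
# the 180-degree speaker silent; the +/-60 and +/-120 pairs are symmetric.
feeds = decode_hexagon(*encode_wxy(0.0))
```

Practical decoders usually add shelf filtering and other psychoacoustic refinements that this sketch leaves out.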

Vertical ambience transformations are necessarily not coplanar, as described above.  However, vertical acuity of human hearing relies on learned pinna response and is much less than horizontal acuity, which relies on HRTF level and time differences (ILD and ITD) of our two ears.  Note that conventional recordings are often made with room and spot microphones placed far from the main microphone and likely not coplanar with it.

Fig.8 - Compatible PerAmbio 6.0.10 full sphere 3D surround layout can accommodate audiences up to six.  For 5.1, viewers sit back 26% of the speaker diameter, where the angles meet ITU standards (with DSP changes in levels/delays).  For 3D music appreciation, one or two listeners sit at the focus of 10 speakers, plus subwoofer(s).

ITU 6.1 TRANSFORMATION MODES: i j k i' j' k'

80 combinations (= 3^4 - 1) were considered for transformation matrices to encode 3D directionality into 6 full-range ITU media channels for reconstituting full periphony, but only about a dozen proved useful.  If metadata permitted unlimited flags to command the user's processor, all 80 could be available to the recording engineer.  Each conversion matrix is at most a 6x6 array of coefficients for each mode, accessible to the DSP by a table lookup.  It is also possible for the user to download newly developed coefficients from the Internet to a decoder's FLASH memory.
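The table-lookup idea can be sketched as follows. The actual coefficients are not given on this page, so every mode below maps to a placeholder identity matrix. The names `MODE_TABLE` and `transform` are hypothetical, and the use of the mode letter as the metadata flag value is an assumption.

```python
# Number of candidate combinations quoted in the text.
TOTAL_COMBINATIONS = 3**4 - 1

NUM_CHANNELS = 6

def identity_matrix(n):
    """Placeholder coefficients: passes every channel through unchanged."""
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

# One 6x6 matrix per mode flag. Real coefficients would replace the
# placeholders, e.g. after being downloaded to the decoder.
MODE_TABLE = {
    flag: identity_matrix(NUM_CHANNELS)
    for flag in ("i", "j", "k", "i'", "j'", "k'")
}

def transform(mode_flag, channels):
    """Look up the matrix for mode_flag and apply it to one 6-channel sample."""
    matrix = MODE_TABLE[mode_flag]
    return [sum(c * s for c, s in zip(row, channels)) for row in matrix]

sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
out = transform("j'", sample)
```

Because the matrix is selected by a flag at run time, adding a mode amounts to adding one table entry, with no change to the processing code.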

This paper presents six "modes" for a six-channel main microphone array in six common applications for music recording, cinema ambience, and multichannel broadcasting (see Fig.6 & 7).  Work is ongoing to refine these choices.

The three basic modes, i, j, & k, are so designated as mnemonics that describe their function.  "i" has both C and SC "inclined" upward with respect to the four "corner" channels L, R, SL, & SR, which are inclined downward.  "j" "juxtaposes" C with L,R and SC with SL & SR - and is the only letter designation with a descender, which reminds us that C tilts downward.  "k" lying on its back represents C and SC angling upward from the corner channels, which lie flat.

The three tilted variant modes rotate C, SC, SL, and SR with respect to L,R by any practical angle, e.g. -30°, in order to raise the microphone (suspended or on a high stand).  The output of the sphere microphone does not vary with height incidence, but that of the baffled "ambiophone" does, so physical tilting may be appropriate for the FL, FR channels.  The same would apply if an ORTF microphone were used for FL,FR.

The choice of mode may be made either during recording or post-production.  If in post it is desired to change the basic mode or tilt, the original PerAmbio channels may be reconstituted and a new mode and tilt transformation made.  A raised (suspended) microphone perspective is irreversible.  During mastering, a flag is set in metadata of the dual-format recording in order for users' replay equipment to reconstitute 3D without loss.

From experience, most recording engineers can identify applications that spawned the modes below (cf. Fig. 6, 7).  Or, even without hearing the hall in Fig. 9, which of the following modes would you choose (keeping in mind that you can change it in post)?

i    The microphone array is placed at source level (L, R), below acoustic shell reflections (C), e.g. an outdoor amphitheater event with audience low and behind (SL, SR) and raked upward (SC).

i'   The microphone array is on a high stand or hanging in an opera house or orchestra hall, with orchestra widely spaced in a pit or strings downstage (L, R), singers or winds upstage (C), and hall ambience back (SL, SR) and up (SC).

j    The array is more closely placed before a small ensemble at source level for direct sound and early floor and side wall reflections (L, R), higher direct solo and ceiling reflections (C), and hall ambience from back-up (SL, SR) and back-down (SC).

j'   The array is hanging closer to a proscenium to pick up downstage event sounds (L, R), upstage drama (C), high-back hall ambience (SL, SR), and additional audience pickup (SC).

k    The microphone array is in an arena with sports play-action or musical instruments at microphone level (L, R), and with good high-front (C) and back (SC) crowd sounds or ceiling ambience.

k'   The array is on a high stand or hanging in a cathedral with upstage choir (C) and front-of-church organ divisions and floor reflections (L, R), organ antiphonal and congregation in back (SL, SR), and organ trumpet directly overhead (SC).

Fig.9 - The PerAmbio 6.1.10+main microphone array in a recital hall for experimental recordings to test dual format 2D/3D.  Mode j' ("juxtaposed", high and tilted) was preferred for ITU 6.1 transformation.
