The Science of Domestic Concert Hall Design  
by Ralph Glasgal 



Ambiophonic Principles for the Recording and Reproduction of Surround Sound for Music, Part 3
Angelo Farina, Ralph Glasgal, Enrico Armelloni, Anders Torger

3. CROSSTALK CANCELLATION

The approach employed here is derived from the formulation originally developed by Kirkeby and Nelson [5], with refinement from one of the authors [6]. The following fig. 3 shows the crosstalk phenomenon in the reproduction space:
The four crosstalk-canceling filters f, which are convolved with the original binaural material, have to be designed so that the signals collected at the ears of the listener are identical to the original signals. Imposing p_{l}=x_{l} and p_{r}=x_{r} gives a 4x4 linear equation system. Its solution yields (with ⊗ denoting convolution):

f_{ll} = h_{rr} ⊗ InvFilter(h_{ll} ⊗ h_{rr} − h_{lr} ⊗ h_{rl})
f_{lr} = −h_{lr} ⊗ InvFilter(h_{ll} ⊗ h_{rr} − h_{lr} ⊗ h_{rl})
f_{rl} = −h_{rl} ⊗ InvFilter(h_{ll} ⊗ h_{rr} − h_{lr} ⊗ h_{rl})
f_{rr} = h_{ll} ⊗ InvFilter(h_{ll} ⊗ h_{rr} − h_{lr} ⊗ h_{rl})

The problem is the computation of the InvFilter (the inverse of the common denominator), as its argument is generally a mixed-phase function. In the past, the authors attempted [7] to perform such an inversion with the approximate methods suggested by Neely & Allen [8] and Mourjopoulos [9], but the Kirkeby-Nelson frequency-domain regularization method is now preferred for its speed and robustness. A further improvement over the original method is the adoption of a frequency-dependent regularization parameter.

In practice, the denominator is computed directly in the frequency domain, where the convolutions are simply multiplications:

C(ω) = FFT(h_{ll}) · FFT(h_{rr}) − FFT(h_{lr}) · FFT(h_{rl})

Then its complex inverse is taken, adding a small, frequency-dependent regularization parameter ε(ω):

InvDen(ω) = Conj[C(ω)] / (Conj[C(ω)] · C(ω) + ε(ω))

In practice, ε(ω) is given a small, constant value within the useful frequency range of the loudspeakers employed for reproduction (80 Hz to 16 kHz in this case) and a much larger value outside that range. A smooth, logarithmic transition between the two values is interpolated over a transition band of 1/3 octave. Fig. 4 shows the user interface of the software developed for computing the crosstalk-canceling filters.
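Independently of the software tool shown in fig. 4, the frequency-domain computation described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the authors' plugin: the function names, the FFT length, the two ε values, the placement of the 1/3-octave transition just outside the useful band, and the half-length circular shift used to make the filters causal are all choices made for this example. Each filter is taken as one measured response (sign-inverted for the cross terms) multiplied by the regularized inverse of the denominator h_{ll}·h_{rr} − h_{lr}·h_{rl}.

```python
import numpy as np

def regularization(freqs, f_lo=80.0, f_hi=16000.0, eps_in=1e-3, eps_out=1.0):
    """Frequency-dependent epsilon: eps_in inside [f_lo, f_hi], eps_out outside,
    with a log-interpolated 1/3-octave transition on each side (assumed layout)."""
    width = 2.0 ** (1.0 / 3.0)            # 1/3 octave expressed as a frequency ratio
    f = np.maximum(freqs, 1e-9)           # avoid log(0) at DC
    t_lo = np.clip(np.log(f_lo / f) / np.log(width), 0.0, 1.0)
    t_hi = np.clip(np.log(f / f_hi) / np.log(width), 0.0, 1.0)
    t = np.maximum(t_lo, t_hi)            # 0 in band, 1 beyond the transition
    return eps_in * (eps_out / eps_in) ** t

def crosstalk_filters(h_ll, h_lr, h_rl, h_rr, fs, n_fft=4096,
                      eps_in=1e-3, eps_out=1.0):
    """Return the four time-domain cancellation filters as a dict."""
    H_ll, H_lr, H_rl, H_rr = (np.fft.rfft(h, n_fft)
                              for h in (h_ll, h_lr, h_rl, h_rr))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    C = H_ll * H_rr - H_lr * H_rl         # denominator: convolutions become products
    eps = regularization(freqs, eps_in=eps_in, eps_out=eps_out)
    inv_den = np.conj(C) / (np.abs(C) ** 2 + eps)   # regularized complex inverse
    F = {"ll": H_rr * inv_den, "lr": -H_lr * inv_den,
         "rl": -H_rl * inv_den, "rr": H_ll * inv_den}
    # Circular shift by half the length so the (non-causal) inverse becomes causal.
    return {k: np.roll(np.fft.irfft(v, n_fft), n_fft // 2) for k, v in F.items()}

if __name__ == "__main__":
    # Toy symmetrical setup: unit direct path, attenuated and delayed cross path.
    direct = np.zeros(64); direct[0] = 1.0
    cross = np.zeros(64); cross[3] = 0.5
    filters = crosstalk_filters(direct, cross, cross, direct, fs=44100.0)
    print({k: v.shape for k, v in filters.items()})
```

Because the whole computation is a handful of FFTs, the two ε values can be varied and the result auditioned repeatedly, which is the trial-and-error workflow the text describes.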
This software tool was implemented as a plugin for CoolEdit [10]. It can directly process either a stereo impulse response (assuming a symmetrical setup, so that h_{ll}=h_{rr} and h_{lr}=h_{rl}) or a complete 2x2 impulse response set, given as the binaural IR coming from the left loudspeaker followed in time by the binaural IR coming from the right loudspeaker. In both cases, the output inverse filters are in the same format as the input IRs. The computation is so fast (less than 100 ms) that the optimal values of the regularization parameters are easily found by trial and error.

3.1 Real-time implementation of crosstalk canceling through Warped FIR filters

The filters described in the previous section are in the form of standard FIR filters. As they have to implement substantial boost and fine detail in the low-frequency region, they have to be quite long (typically more than 4096 taps at 44.1 kHz). It is therefore almost impossible to implement them on standard DSP boards in the basic time-domain form. Although a frequency-domain implementation, as described below, can easily resolve this problem of running audiophile-quality crosstalk-canceling filters on currently available DSP boards, another possible approach is the use of Warped FIR (WFIR) structures. A WFIR filter features a variable resolution in the frequency domain and is therefore an effective variation on FIR filter design. Let us consider the following bilinear transformation:

ζ^{-1} = (z^{-1} − λ) / (1 − λ z^{-1})

where the parameter λ, referred to as the warping coefficient, can vary between −1 and 1. This transformation is the basis of the frequency warping technique. It remaps the complex plane, so that the z plane is changed into a new ζ complex plane. The transformation is represented graphically in fig. 5 as a function of λ.
Applying this transformation to the spectrum of an audio signal stretches the spectrum so that it becomes approximately logarithmic and thus more consistent with a psychoacoustic frequency scale such as the Bark scale [11]. The main advantage is that the transformed signal is more consistent with human hearing capabilities: warped filters have higher accuracy at low frequencies, where the human ear has a higher sensitivity, and lower accuracy at high frequencies.
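The stretching of the frequency axis can be made concrete with a small Python sketch. It assumes the standard first-order allpass D(z) = (z^{-1} − λ)/(1 − λ z^{-1}) as the warping element; the function name `warped_frequency`, the sample rate and the test frequencies are hypothetical choices for this illustration. The warped frequency is taken as the phase lag of that allpass at each original frequency.

```python
import numpy as np

def warped_frequency(omega, lam):
    """Phase lag (rad/sample) of the allpass D(z) = (z^-1 - lam) / (1 - lam z^-1)
    evaluated at z = exp(j*omega); used here as the warped frequency."""
    return omega + 2.0 * np.arctan2(lam * np.sin(omega), 1.0 - lam * np.cos(omega))

if __name__ == "__main__":
    fs = 44100.0          # assumed sample rate
    lam = 0.8             # assumed warping coefficient
    for f_hz in (100.0, 1000.0, 10000.0):
        omega = 2.0 * np.pi * f_hz / fs
        nu = warped_frequency(omega, lam)
        # Share of the 0..Nyquist axis below f_hz, before and after warping.
        print(f"{f_hz:8.0f} Hz: {omega / np.pi:.4f} of the axis -> {nu / np.pi:.4f} after warping")
```

For a positive λ the low frequencies occupy a much larger share of the warped axis, which is where the extra low-frequency resolution of a warped filter comes from.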
In a classical FIR filter the frequency resolution is constant over the entire frequency range. Since the frequency resolution of human hearing is about one third of an octave, the equalization is unnecessarily fine at high frequencies and too coarse at low frequencies, so very long FIR filters are required to obtain good results over the entire frequency range. A warped filter based on the Bark scale provides a more efficient equalization at low frequencies. Specifically, a warped FIR filter can be implemented with about one tenth of the taps of a conventional FIR filter while still providing the same low-frequency equalization. Its real-time implementation, however, requires more computational power.

The Warped FIR structure is derived from the traditional FIR by replacing the unit delays with the allpass operators D_{1}(z):

D_{1}(z) = (z^{-1} − λ) / (1 − λ z^{-1})

Unfortunately, this structure is not suitable for real-time processing. An equivalent explicit structure, shown in fig. 6, was therefore developed, which allows for efficient implementation [12].

The introduction of the D_{1}(z) allpass block distorts the complex plane. Analysis of the warped z-plane shows that points on the unit circle stay on it, points inside stay inside, and points outside stay outside. Therefore an unstable system cannot become stable, while a stable system remains stable. This means that a warped FIR filter is always stable, even though it is no longer a "finite response" filter, as the network shown in fig. 6 contains loops.
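The idea of replacing each unit delay with an allpass section can be illustrated offline with the following Python sketch. It is not the explicit real-time structure of fig. 6: it processes the whole signal one allpass stage at a time, assumes the standard first-order allpass D(z) = (z^{-1} − λ)/(1 − λ z^{-1}), and uses hypothetical function names (`allpass`, `wfir_filter`) chosen for this example.

```python
import numpy as np

def allpass(x, lam):
    """First-order allpass D(z) = (z^-1 - lam) / (1 - lam z^-1), i.e.
    y[n] = -lam*x[n] + x[n-1] + lam*y[n-1]."""
    y = np.empty(len(x), dtype=float)
    x_prev = 0.0
    y_prev = 0.0
    for n, xn in enumerate(x):
        y[n] = -lam * xn + x_prev + lam * y_prev
        x_prev, y_prev = xn, y[n]
    return y

def wfir_filter(x, taps, lam):
    """Warped FIR: the tap-weighted sum of the outputs of a chain of allpasses.
    With lam = 0 every allpass is a unit delay and this is an ordinary FIR."""
    stage = np.asarray(x, dtype=float)
    y = taps[0] * stage
    for w in taps[1:]:
        stage = allpass(stage, lam)   # signal after one more "warped delay"
        y = y + w * stage
    return y

if __name__ == "__main__":
    impulse = np.zeros(16)
    impulse[0] = 1.0
    taps = [1.0, 0.5, 0.25]           # arbitrary example coefficients
    print(wfir_filter(impulse, taps, lam=0.0))   # finite response
    print(wfir_filter(impulse, taps, lam=0.8))   # response no longer finite
```

Comparing the two printed responses shows the point made above: with λ different from zero the impulse response does not end after the last tap, because every allpass section contains a feedback path.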
It can be shown that for points near +1 the distance from the unit circle increases, whilst it decreases for points near −1. The time-domain behavior of a warped signal is therefore remarkably changed. As an example, let us consider a simple system whose z-transform features only one pole α on the real axis. Its expressions in the z-domain and in the time domain are, respectively:

H(z) = 1 / (1 − α z^{-1}),   h(n) = α^{n}

The time constant τ is defined as the time necessary for the system output to fall to 36.8% of its maximum value, i.e. to the fraction 1/e of the maximum. If α=0.99, the time constant is about 100 samples. If the system is warped with λ=0.8, this pole (0.99) is remapped to 0.9135. On the other hand, a high-frequency pole near the Nyquist frequency, e.g. α=−0.99, would be remapped to −0.9989. This means that the time constant of the low-frequency pole becomes just 12 samples, whilst that of the high-frequency pole becomes about 900 samples. In other words, when an impulse response is warped with a positive λ, the low-frequency information is compressed into the first samples of the warped impulse response, while the high-frequency components are stretched toward the last samples. Thus, a warped impulse response can be truncated after a few samples without losing low-frequency information. This property holds especially for high values of λ.
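The pole-remapping figures quoted above can be checked with a short Python sketch. The mapping ζ = (z − λ)/(1 − λz) is an assumption inferred from those figures (it sends 0.99 to about 0.9135 for λ=0.8), and the function names are hypothetical. The time constant is computed from the exact 1/e definition for a response of the form α^{n}, so the results are close to, though not identical with, the rounded values in the text.

```python
import math

def warp_point(z, lam):
    """Map a real point of the z plane to the warped plane (assumed mapping)."""
    return (z - lam) / (1.0 - lam * z)

def time_constant(pole):
    """Samples needed for |pole|**n to decay to 1/e of its initial value."""
    return -1.0 / math.log(abs(pole))

if __name__ == "__main__":
    lam = 0.8
    for pole in (0.99, -0.99):        # low-frequency pole, near-Nyquist pole
        warped = warp_point(pole, lam)
        print(f"pole {pole:+.4f} (tau {time_constant(pole):6.1f} samples) -> "
              f"{warped:+.4f} (tau {time_constant(warped):6.1f} samples)")
```

The two poles start with the same time constant, and after warping the low-frequency one decays roughly an order of magnitude faster while the high-frequency one decays roughly an order of magnitude slower, which is why the warped response can be truncated early without losing low-frequency information.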