
Perceptual Noise Weighting

The general form of the noise-shaping filter in CELP coding is the weighting function

$$W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}, \qquad 0 < \gamma_2 < \gamma_1 \le 1,$$

where $A(z)$ is the LPC polynomial. The effect of $\gamma_1$ or $\gamma_2$ is to move the roots of $A(z)$ towards the origin, de-emphasizing the spectral peaks of $1/A(z)$. With $\gamma_1$ and $\gamma_2$ as in the equation above, the response of $W(z)$ has valleys (anti-formants) at the formant locations, and the inter-formant areas are emphasized. In addition, the overall spectral roll-off is reduced compared to the speech spectral envelope given by $1/A(z)$.

Noise is less audible if it shares the same spectral band with a high-level, tone-like signal. The spectral dynamic range of wideband speech is considerably higher than that of telephone speech, and the amplitudes of the 3400-7000 Hz components are usually near the bottom of this dynamic range. The unweighted SNR in CELP coding tends to be negative at high frequencies. The auditory system is quite sensitive in this region, and the quantization distortions are clearly audible in the form of crackling and hiss. Noise weighting is therefore crucial in wideband CELP, and the balance between low- and high-frequency fidelity is quite delicate.

The filter $W(z)$ defined above is inherently limited in its ability to model the formant structure and the spectral tilt concurrently. The spectral tilt is more or less controlled by the difference between $\gamma_1$ and $\gamma_2$. The tilt is global in nature, and it is not possible to emphasize it separately at high frequencies. Also, changing the tilt affects the shape of the formants of $W(z)$: a pronounced tilt is obtained along with higher and wider formants, which puts too much noise at low frequencies and in between the formants. The formant and tilt problems need to be decoupled. The approach taken was to use $W(z)$ only for formant modeling and to add another section that controls the tilt only. The general form of the new filter is

$$W_p(z) = W(z)\,P(z),$$

where $P(z)$ is responsible for the tilt only. Various forms of $P(z)$ were studied, and the following two-pole section was proposed based on listening tests:

$$P(z) = \frac{1}{1 + \sum_{i=1}^{2} p_i\,\delta^i z^{-i}}.$$

The coefficients $p_i$ are found by applying the standard LPC algorithm to the first three correlation coefficients of the current-frame LPC inverse filter ($A(z)$) sequence $a_i$. The parameter $\delta$ is used to adjust the spectral tilt of $P(z)$. The values tex2html_wrap_inline345 , tex2html_wrap_inline347 , tex2html_wrap_inline349 were found to yield the best perceptual performance.
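The procedure for the tilt section can be sketched roughly as follows. This is only an illustration under stated assumptions, not the coder's actual implementation: the LPC sequence is taken to include its leading 1, the denominator is written as 1 + sum of p_i delta^i z^-i, and the function names (`autocorr`, `levinson`, `tilt_section`) and the numeric value of `delta` in the usage lines are hypothetical.

```python
def autocorr(seq, max_lag):
    """First max_lag + 1 correlation coefficients of a finite sequence."""
    return [sum(seq[n] * seq[n + k] for n in range(len(seq) - k))
            for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion.

    Returns [1, c_1, ..., c_order] for the prediction-error filter
    1 + sum_i c_i z^-i (sign convention assumed for this sketch).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a

def tilt_section(lpc, delta):
    """Denominator coefficients [1, p_1*delta, p_2*delta^2] of P(z).

    lpc: coefficients of A(z), assumed here to start with 1.
    delta: tilt-control parameter (value chosen by the caller).
    """
    r = autocorr(lpc, 2)                    # first three correlation coefficients
    p = levinson(r, 2)                      # second-order LPC fit to the sequence
    return [p[i] * delta ** i for i in range(3)]

# Usage with a toy first-order A(z) = 1 - 0.9 z^-1 and a placeholder delta.
den = tilt_section([1.0, -0.9], delta=0.5)
```

Because the second-order fit is made to the coefficient sequence of $A(z)$ itself, the resulting section follows only the coarse slope of the LPC spectrum, which is what lets it adjust tilt without reshaping individual formants.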

The figure demonstrates the effect of the enhanced noise-shaping filter. The broken curve shows a typical spectrum of a conventional inverse filter $1/W(z)$. The solid curve is the spectrum of an enhanced inverse filter $1/W_p(z) = 1/(W(z)P(z))$ for the same LPC filter. As seen, for the same general tilt, the enhanced filter has less pronounced formants, especially the lowest and highest ones. Compared to the conventional filter, the enhanced filter attenuates noise by about 5 dB at the lowest and highest formants, while achieving the right overall spectral tilt.

The CELP coder with the noise weighting described above was implemented with an LPC predictor order of 32, a coding block size of 5, a codebook size of 1024, and without a pitch loop.

Figure: Effect of the enhanced noise-shaping filter. Dashed line: spectrum of a conventional inverse filter $1/W(z)$. Solid line: spectrum of an enhanced inverse filter $1/W_p(z)$.
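A comparison of this kind can be approximated numerically by evaluating the two inverse filters on the unit circle. The sketch below is a hedged illustration rather than the procedure used to produce the figure: it assumes the weighting function has the usual bandwidth-expanded form A(z/g1)/A(z/g2), takes the tilt-section denominator as an optional input, and uses hypothetical names (`expand`, `poly_at`, `inverse_weighting_db`) and toy parameter values.

```python
import cmath
import math

def expand(coeffs, g):
    """Bandwidth expansion: coefficients of A(z/g) given those of A(z)."""
    return [c * g ** i for i, c in enumerate(coeffs)]

def poly_at(coeffs, w):
    """Evaluate sum_i c_i z^-i at z = exp(j*w)."""
    return sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(coeffs))

def inverse_weighting_db(lpc, g1, g2, w, tilt_den=None):
    """Magnitude in dB of the inverse weighting filter at frequency w (radians).

    Without tilt_den: 1/W(z) = A(z/g2) / A(z/g1).
    With tilt_den:    1/(W(z)P(z)), where P(z) = 1 / tilt_den(z).
    """
    h = poly_at(expand(lpc, g2), w) / poly_at(expand(lpc, g1), w)
    if tilt_den is not None:
        h *= poly_at(tilt_den, w)           # multiplying by 1/P(z)
    return 20.0 * math.log10(abs(h))

# Usage: sample both curves on a coarse frequency grid for a toy A(z).
lpc = [1.0, -0.9]                           # toy A(z) = 1 - 0.9 z^-1
grid = [math.pi * k / 8 for k in range(9)]
conventional = [inverse_weighting_db(lpc, 0.9, 0.6, w) for w in grid]
enhanced = [inverse_weighting_db(lpc, 0.9, 0.6, w, tilt_den=[1.0, 0.3, 0.1])
            for w in grid]
```

Plotting `conventional` and `enhanced` against `grid` gives curves analogous to the dashed and solid lines, though the toy values here are not expected to reproduce the 5 dB difference reported for the actual coder.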



Esin Darici Haritaoglu
Wed Jun 18 22:26:24 EDT 1997