The general form of the noise shaping filter in CELP coding is the weighting function
\[ W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}, \qquad 0 < \gamma_2 < \gamma_1 \le 1, \]
where A(z) is the LPC polynomial. The effect of $\gamma_1$ or $\gamma_2$ is to move
the roots of A(z) towards the origin, de-emphasizing the spectral
peaks of 1/A(z). With $\gamma_1$ and $\gamma_2$ constrained as above, the response of W(z) has
valleys (anti-formants) at the formant locations and the inter-formant
areas are emphasized. In addition, the amount of overall spectral
roll-off is reduced, compared to the speech spectral envelope as given
by 1/A(z). Noise is less audible if it shares the same spectral band
with a high-level tone-like signal.
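
The root-scaling effect described above can be checked numerically. The following is a minimal Python sketch, assuming the weighting filter has the common form W(z) = A(z/g1)/A(z/g2) with A(z) = sum_i a_i z^-i and a_0 = 1. The second-order A(z) and the values g1 = 0.9, g2 = 0.6 are illustrative assumptions, not values taken from the text.

```python
import cmath
import math

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[i] * gamma**i (roots scaled by gamma)."""
    return [c * gamma ** i for i, c in enumerate(a)]

def freq_response(coeffs, omega):
    """Evaluate sum_i coeffs[i] * z^-i at z = exp(j*omega)."""
    return sum(c * cmath.exp(-1j * omega * i) for i, c in enumerate(coeffs))

def weighting_response_db(a, gamma1, gamma2, omega):
    """Magnitude in dB of W(z) = A(z/gamma1) / A(z/gamma2) at frequency omega."""
    num = freq_response(bandwidth_expand(a, gamma1), omega)
    den = freq_response(bandwidth_expand(a, gamma2), omega)
    return 20.0 * math.log10(abs(num) / abs(den))

# Hypothetical A(z) whose roots sit at radius 0.95 and angle pi/4,
# i.e. 1/A(z) has a single formant at omega = pi/4.
r, theta = 0.95, math.pi / 4
a = [1.0, -2.0 * r * math.cos(theta), r * r]

# Illustrative (assumed) parameter values.
valley_db = weighting_response_db(a, 0.9, 0.6, theta)   # at the formant
far_db = weighting_response_db(a, 0.9, 0.6, math.pi)    # far from the formant
print(f"W at the formant: {valley_db:.1f} dB, away from it: {far_db:.1f} dB")
```

In this sketch the roots of A(z/g) lie at g times the radius of the roots of A(z), and W(z) dips at the formant frequency relative to frequencies far from it.
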
The spectral dynamic range of wideband speech is considerably higher
than that of telephone speech, and the amplitudes of the 3400-7000 Hz
components are usually near the bottom of this dynamic range. The
unweighted SNR in CELP coding tends to be negative at high
frequencies. The auditory system is quite sensitive in this region and
the quantization distortions are clearly audible in the form of
crackling and hiss. Noise weighting is therefore crucial in
wideband CELP and the balance of low and high frequency fidelity is
quite delicate.
The filter W(z) defined above has an inherent limitation in modeling the
formant structure and the spectral tilt concurrently. The spectral
tilt is more or less controlled by the difference between the two
parameters of W(z). The tilt
is global in nature and it is not possible to emphasize it separately
at high frequencies. Also, changing the tilt affects the shape of the
formants of W(z). A pronounced tilt is obtained along with higher and
wider formants, which puts too much noise at low frequencies and in
between the formants. The formant and tilt problems need to be
decoupled. The approach taken was to use W(z) only for formant
modeling and to add another section for controlling the tilt
only. The general form of the new filter is the cascade W(z)P(z),
where P(z) is responsible for the tilt only. Various forms of P(z) were studied, and the following two-pole section was proposed based on listening tests:
\[ P(z) = \frac{1}{1 + \sum_{i=1}^{2} p_i \delta^i z^{-i}} \]
The coefficients $p_i$ are found by applying the standard LPC
algorithm to the first three correlation coefficients of the current
frame LPC inverse filter (A(z)) sequence $a_i$. The parameter $\delta$ is
used to adjust the spectral tilt of P(z). The values
,
,
were found to yield the best perceptual performance.
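
The computation of the tilt section can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: it assumes the two-pole section has the form P(z) = 1/(1 + p1*d*z^-1 + p2*d^2*z^-2), that A(z) = sum_i a_i z^-i with a_0 = 1, and that the "standard LPC algorithm" is the order-2 Levinson-Durbin recursion. The function names, the example A(z), and the value d = 0.7 are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[k] = sum_n x[n] * x[n+k] for k = 0..max_lag."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(max_lag + 1)]

def two_pole_lpc(r):
    """Order-2 Levinson-Durbin recursion on r = [r0, r1, r2].

    Returns (p1, p2) of the prediction-error polynomial
    1 + p1*z^-1 + p2*z^-2 (assumed sign convention).
    """
    r0, r1, r2 = r
    k1 = -r1 / r0                    # first reflection coefficient
    e1 = r0 * (1.0 - k1 * k1)        # prediction error after order 1
    k2 = -(r2 + k1 * r1) / e1        # second reflection coefficient
    return k1 * (1.0 + k2), k2

def tilt_section(a, delta):
    """Denominator [1, p1*delta, p2*delta**2] of the assumed P(z)."""
    p1, p2 = two_pole_lpc(autocorr(a, 2))
    return [1.0, p1 * delta, p2 * delta ** 2]

# Hypothetical current-frame A(z) coefficients and an assumed delta.
a = [1.0, -1.2, 0.8]
print(tilt_section(a, 0.7))
```

Scaling the i-th coefficient by d^i pulls the two poles of P(z) towards the origin, so in this sketch a smaller d gives a flatter P(z) and d = 1 gives the full second-order fit.
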
The figure demonstrates the effect of the enhanced noise shaping
filter. The broken curve in the figure shows a typical spectrum of a
conventional inverse filter 1/W(z). The solid curve is the spectrum
of an enhanced inverse filter 1/(W(z)P(z)) for the same LPC
filter. As seen, for the same general tilt, the enhanced filter has
less pronounced formants, especially at the lowest and highest
formants. Compared to the conventional filter, the enhanced filter
attenuates noise by about 5 dB at the lowest and highest formants,
while achieving the right overall spectral tilt.
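
Curves of the kind compared above could be computed along the following lines. This is a rough Python sketch that does not attempt to reproduce the figure. It assumes W(z) = A(z/g1)/A(z/g2) and a two-pole P(z) = 1/(1 + p1*d*z^-1 + p2*d^2*z^-2) obtained from an order-2 LPC fit to the A(z) coefficient sequence. All function names, the example A(z), and the parameter values are hypothetical.

```python
import cmath
import math

def expand(a, g):
    """Coefficients of A(z/g): a[i] * g**i."""
    return [c * g ** i for i, c in enumerate(a)]

def mag_db(coeffs, omega):
    """20*log10 of |sum_i coeffs[i] * exp(-j*omega*i)|."""
    h = sum(c * cmath.exp(-1j * omega * i) for i, c in enumerate(coeffs))
    return 20.0 * math.log10(abs(h))

def tilt_denominator(a, delta):
    """Denominator [1, p1*delta, p2*delta**2] of the assumed two-pole P(z)."""
    r0, r1, r2 = (sum(a[n] * a[n + k] for n in range(len(a) - k))
                  for k in range(3))
    k1 = -r1 / r0
    k2 = -(r2 + k1 * r1) / (r0 * (1.0 - k1 * k1))
    return [1.0, k1 * (1.0 + k2) * delta, k2 * delta ** 2]

def inverse_filter_db(a, gamma1, gamma2, omega, delta=None):
    """Spectrum in dB of 1/W(z), or of 1/(W(z)P(z)) when delta is given."""
    db = mag_db(expand(a, gamma2), omega) - mag_db(expand(a, gamma1), omega)
    if delta is not None:
        # 1/P(z) is simply the denominator polynomial of P(z).
        db += mag_db(tilt_denominator(a, delta), omega)
    return db

# Hypothetical A(z) and assumed parameter values, evaluated on a coarse grid.
a = [1.0, -1.2, 0.8]
omegas = [math.pi * k / 8 for k in range(9)]
conventional = [inverse_filter_db(a, 0.9, 0.6, w) for w in omegas]
enhanced = [inverse_filter_db(a, 0.9, 0.6, w, delta=0.7) for w in omegas]
```

Because P(z) is a separate section, its parameter can be varied in this sketch without touching the two parameters that shape the formants of W(z), which is the decoupling the text argues for.
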
The CELP coder with the noise weighting described above was
implemented with an LPC predictor order of 32, a coding block size of
5, a codebook size of 1024, and without a pitch loop.
Figure: Effect of the enhanced noise-shaping filter. Dashed line:
spectrum of a conventional inverse filter.
Solid line: spectrum of an enhanced inverse filter.