Speech coding with a bandwidth wider than that offered in telephony
results in a major improvement in the quality of the represented speech [4, 5]. The CCITT
G.722 wideband speech coding algorithm supports bit rates of 64, 56
and 48 kbps. The codec can be integrated on one chip and its overall
delay is around 3 ms, small enough to cause no echo problems in
telecommunication networks. In addition, the codec provides acceptable
performance (maintained intelligibility) for transmission bit error
rates up to
. This requirement ensures that performance degrades
gently even under the worst transmission conditions that one may
encounter on the telecommunication network.
Figure: Block Diagram of the 64-kbit/s (7 kHz) Audio Coder
In the CCITT wideband speech coder, a sub-band split based on two
identical finite impulse response quadrature mirror (bandpass) filters
(QMF), as shown in the figure, divides the 16 kHz sampled, 14-bit PCM
representation of the wideband input signal into two critically
subsampled (8 kHz sampled) components, called the low sub-band and the
high sub-band, as shown in the figure.
The filters overlap, and aliasing will
occur because of the subsampling of each of the components; the
synthesis QMF filterbank at the receiver ensures that the aliasing products
are cancelled. However, quantization error components in the two
sub-bands will not be eliminated. Therefore, 24-tap QMF filters with a
stop-band attenuation of 60 dB are employed [3].
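The alias cancellation described above can be illustrated with a minimal sketch. It does not use the 24-tap filters of the standard: the 2-tap filter pair, the function names, and the test signal are all assumptions chosen so that the arithmetic can be checked by hand. The high-pass filter is the mirror of the low-pass one, and the synthesis pair is chosen so that the aliasing terms of the two branches cancel.

```python
# Toy two-band QMF bank (2-tap stand-in filters, NOT the 24-tap G.722 filters).
H0 = [0.5, 0.5]     # analysis low-pass
H1 = [0.5, -0.5]    # analysis high-pass, mirror image of H0
G0 = [1, 1]         # synthesis low-pass  (2 * H0)
G1 = [-1, 1]        # synthesis high-pass (-2 * H1)

def fir(x, h):
    """Causal FIR filter with zero initial state: y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def analysis(x):
    """Split x into two critically subsampled components (keep every other sample)."""
    return fir(x, H0)[::2], fir(x, H1)[::2]

def upsample(v):
    """Insert a zero after every sample to return to the original rate."""
    out = []
    for s in v:
        out.extend([s, 0])
    return out

def synthesis(low, high):
    """Recombine the sub-bands; the aliasing of the two branches cancels."""
    a = fir(upsample(low), G0)
    b = fir(upsample(high), G1)
    return [p + q for p, q in zip(a, b)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
low, high = analysis(x)
print(synthesis(low, high))            # the input delayed by one sample
print(synthesis(low, [0] * len(high))) # high band discarded: aliasing remains
```

With unmodified sub-band signals the output is the input delayed by one sample. Once a sub-band is altered (here, discarded entirely), the cancellation no longer holds, which is the same mechanism by which quantization error in the sub-bands survives the synthesis filterbank.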
Figure: Frequency response of the QMF filters
The coding of the sub-band signals is based on a modified version of the 32 kbps CCITT G.721 ADPCM speech coder. Input samples are adaptively predicted, and the prediction error signal is quantized and transmitted. The predictor is backward adaptive, i.e. the predictor coefficients are updated sample by sample under the control of the already coded difference signal, which is also available at the decoder. The predictor uses a pole-zero structure with six zeros and two poles. It combines good prediction gain (with an equivalent gain in overall SNR) and simple stability control. The quantizer is also backward adaptive and can rapidly adapt itself to the changing statistics of the speech signal. After transmission errors, the predictor and quantizer states of the encoder and decoder converge (in the long term) to identical values once no more transmission errors are observed.
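The backward-adaptive principle can be sketched in a few lines. This is a simplified illustration, not the G.722 or G.721 algorithm: it uses only a six-zero predictor section (the two-pole section is omitted), a uniform mid-rise quantizer, and a simple multiplicative step-size rule. The class name, the adaptation constants, and the step-size limits are assumptions made for the example. The point it shows is that every adaptation is driven by the transmitted code alone, so the decoder can repeat it without side information.

```python
import math

def sgn(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

class AdpcmState:
    """State shared by encoder and decoder; updated only from transmitted codes."""

    def __init__(self, bits=4):
        self.lo = -(1 << (bits - 1))      # most negative code
        self.hi = (1 << (bits - 1)) - 1   # most positive code
        self.step = 16.0                  # quantizer step size (illustrative start)
        self.b = [0.0] * 6                # six zero-section coefficients
        self.dq = [0.0] * 6               # last six quantized differences

    def predict(self):
        return sum(b * d for b, d in zip(self.b, self.dq))

    def quantize(self, diff):
        code = math.floor(diff / self.step)
        return max(self.lo, min(self.hi, code))

    def update(self, code):
        """Adapt predictor and step size from the code; return the quantized difference."""
        dq = (code + 0.5) * self.step     # mid-rise reconstruction level
        # Sign-sign gradient update with leakage (constants are illustrative).
        self.b = [(1 - 2 ** -8) * b + 2 ** -7 * sgn(dq) * sgn(past)
                  for b, past in zip(self.b, self.dq)]
        self.dq = [dq] + self.dq[:-1]
        # Grow the step after outer-level codes, shrink it after inner-level codes.
        magnitude = code if code >= 0 else -code - 1
        factor = 1.5 if magnitude >= (self.hi + 1) // 2 else 0.9
        self.step = min(4096.0, max(1.0, self.step * factor))
        return dq

def encode(samples, bits=4):
    state, codes, recon = AdpcmState(bits), [], []
    for x in samples:
        pred = state.predict()
        code = state.quantize(x - pred)
        codes.append(code)
        recon.append(pred + state.update(code))
    return codes, recon

def decode(codes, bits=4):
    state, out = AdpcmState(bits), []
    for code in codes:
        pred = state.predict()
        out.append(pred + state.update(code))
    return out

signal = [1000.0 * math.sin(2 * math.pi * n / 40) for n in range(400)]
codes, local = encode(signal)
print(decode(codes) == local)   # decoder output matches the encoder's local reconstruction
```

Because the encoder and decoder run the same state update on the same codes, their reconstructions agree sample for sample in the absence of transmission errors. The leakage factor in the coefficient update is what lets the two states drift back together after an error, as the text describes.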
High quality coding with the G.722 wideband speech coder is provided by a fixed bit allocation, where the low and high sub-band ADPCM coders use a 6 bit/sample and a 2 bit/sample quantizer, respectively. In the low sub-band the signal resembles the narrow-band speech signal in most of its properties, and a high SNR in the lower band is perceptually more important than in the higher band. The dynamic range of the low-band signal quantizer was set to 51 dB and that of the high-band signal quantizer to 66 dB, mostly to accommodate music signals. To prevent the all-zero code from appearing even in the 4-bit data representation, only 15 quantizer levels are used. This also restricts the 5-bit and 6-bit data representations to 30 and 60 quantizer levels.
An advantage of a design that uses two equally wide sub-bands is that each component is subsampled to 8 kHz, so the total transmission rate may be reduced in 8 kbit/s steps by reducing the number of bits assigned to the samples of one or the other band.
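The arithmetic behind these figures is simple enough to check directly. The short sketch below uses hypothetical helper names; the level formula is just the pattern of the three values quoted above (15, 30, 60), not a general rule taken from the standard.

```python
SUBBAND_RATE_HZ = 8000  # each sub-band is sampled at 8 kHz

def bit_rate_kbps(low_bits, high_bits=2):
    """Total rate when each sub-band sample carries the given number of bits."""
    return SUBBAND_RATE_HZ * (low_bits + high_bits) // 1000

def quantizer_levels(bits):
    """Levels used for a 4-, 5- or 6-bit code when the 4-bit core uses only 15."""
    return 15 << (bits - 4)

for low_bits in (6, 5, 4):
    print(low_bits, bit_rate_kbps(low_bits), quantizer_levels(low_bits))
# 6 64 60
# 5 56 30
# 4 48 15
```

Each bit removed from one band's samples lowers the total by 8000 bit/s, which gives the 64, 56 and 48 kbps modes when the low band carries 6, 5 or 4 bits per sample.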
Embedded encoding is used in the lower sub-band ADPCM coding, i.e. the adaptations of predictor and quantizer are always based only on the four most significant bits of each ADPCM codeword. Hence stripping off one or two least significant bits from the ADPCM codewords does not affect the adaptation processes and cannot lead to the mistracking effects otherwise caused by different decoding processes in transmitter and receiver. By stripping off one or two least significant bits, it is possible to support transmission of auxiliary data at rates of 8 and 16 kbps [5].
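The bit manipulation involved can be sketched as follows. The octet layout (two high-band bits above six low-band bits) and all function names are assumptions made for this illustration; the sketch only shows that overwriting the low-order bits with auxiliary data leaves the four core bits untouched.

```python
def pack_octet(high_code, low_code):
    """Assumed layout: 2 high-band bits on top, 6 low-band bits below."""
    return ((high_code & 0x3) << 6) | (low_code & 0x3F)

def core_bits(octet):
    """The four most significant low-band bits, which drive the adaptation."""
    return (octet & 0x3F) >> 2

def insert_aux(octet, aux, n_bits):
    """Overwrite the n_bits least significant low-band bits with auxiliary data."""
    mask = (1 << n_bits) - 1
    return (octet & ~mask & 0xFF) | (aux & mask)

def aux_rate_kbps(n_bits, octet_rate_hz=8000):
    """Auxiliary channel rate when n_bits are taken from every octet."""
    return n_bits * octet_rate_hz // 1000

octet = pack_octet(0b10, 0b110101)
with_aux = insert_aux(octet, 0b10, 2)
print(bin(octet), bin(with_aux))              # 0b10110101 0b10110110
print(core_bits(octet), core_bits(with_aux))  # 13 13
print(aux_rate_kbps(1), aux_rate_kbps(2))     # 8 16
```

Since the adaptation in both transmitter and receiver reads only the core bits, the two remain in step whether or not the low-order bits carry speech or auxiliary data.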