
The CCITT G.722 Wideband Speech Coding Standard

Speech coding with a bandwidth wider than that offered in telephony results in a major improvement in the quality of the represented speech [4, 5]. The CCITT G.722 wideband speech coding algorithm supports bit rates of 64, 56 and 48 kbps. The codec can be integrated on one chip, and its overall delay is around 3 ms, small enough to cause no echo problems in telecommunication networks. In addition, the codec provides acceptable performance (it maintains intelligibility) for transmission bit error rates up to tex2html_wrap_inline315. This requirement ensures that performance degrades gently even under the worst transmission conditions that one may encounter on the telecommunication network.

Figure: Block Diagram of the 64-kbit/s (7kHz) Audio Coder

In the CCITT wideband speech coder, a sub-band split based on two identical finite impulse response quadrature mirror (bandpass) filters (QMF) divides the 16 kHz sampled, 14-bit PCM representation of the wideband input signal into two critically subsampled (8 kHz sampled) components, called the low sub-band and the high sub-band (see the block diagram and the QMF frequency response figures). The filters overlap, and aliasing occurs because each component is subsampled; the synthesis QMF filter bank at the receiver ensures that the aliasing products are cancelled. However, the quantization error components in the two sub-bands are not eliminated. To limit the leakage of this error between the bands, 24-tap QMF filters with a stop-band attenuation of 60 dB are employed [3].
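The split-and-recombine idea can be sketched with a toy example. The code below is only an illustration: it uses a two-tap (Haar) filter pair instead of the 24-tap filters of the standard, and the names `analysis` and `synthesis` are hypothetical. It shows that two critically subsampled bands can be recombined into the original signal when nothing is quantized in between.

```python
import math

# Toy two-tap (Haar) QMF pair; an assumption for illustration only.
# G.722 itself uses 24-tap filters.
S = 1 / math.sqrt(2)
H0 = [S, S]    # low-pass prototype
H1 = [S, -S]   # high-pass mirror: h1[n] = (-1)^n * h0[n]

def analysis(x):
    """Split x (even length) into low and high sub-bands at half the rate."""
    low, high = [], []
    for k in range(0, len(x), 2):
        low.append(H0[0] * x[k] + H0[1] * x[k + 1])
        high.append(H1[0] * x[k] + H1[1] * x[k + 1])
    return low, high

def synthesis(low, high):
    """Recombine the two sub-bands into a full-rate signal."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) * S)
        out.append((l - h) * S)
    return out

x = [3.0, 1.0, -2.0, 4.0, 0.5, 0.5]
low, high = analysis(x)    # each band holds len(x) / 2 samples
y = synthesis(low, high)   # matches x up to floating-point rounding
```

If `low` and `high` are quantized before `synthesis`, the reconstruction error no longer vanishes, which is the situation the longer filters of the standard are meant to contain.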

Figure: Frequency response of the QMF filters

The coding of the sub-band signals is based on a modified version of the 32 kbps CCITT G.721 ADPCM speech coder. Input samples are adaptively predicted, and the prediction error signal is quantized and transmitted. The predictor is backward adaptive, i.e. the predictor coefficients are updated sample by sample under the control of the already coded difference signal, which is also available at the decoder. The predictor uses a pole-zero structure with six zeros and two poles. It combines good prediction gain (with an equivalent gain in overall SNR) and simple stability control. The quantizer is also backward adaptive and can rapidly adapt itself to the changing statistics of speech signals. After transmission errors, both predictor and quantizer converge (in the long term) to identical values in encoder and decoder once no more transmission errors are observed.
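A minimal sketch of backward adaptation follows. It is not the G.722 algorithm: the six-zero, two-pole predictor is replaced by a single coefficient adapted with a sign-sign rule, and the class name `ToyADPCM` and all constants (0.01, 1.6, 0.9, the step limits) are assumptions chosen for illustration. The point it demonstrates is that every state update depends only on the transmitted code, so a decoder that sees the same codes tracks the encoder exactly.

```python
def sign(v):
    return (v > 0) - (v < 0)

class ToyADPCM:
    """Predictor and quantizer state, updated only from transmitted codes."""
    def __init__(self, bits=4):
        self.top = (1 << (bits - 1)) - 1   # codes lie in [-top, top]
        self.step = 16.0                   # quantizer step size
        self.coef = 0.0                    # single predictor coefficient
        self.prev = 0.0                    # previous reconstructed sample

    def predict(self):
        return self.coef * self.prev

    def update(self, code):
        """Backward adaptation step; returns the reconstructed sample."""
        dq = code * self.step              # dequantized difference
        recon = self.predict() + dq
        # Sign-sign nudge of the coefficient, clamped for stability.
        self.coef += 0.01 * sign(dq) * sign(self.prev)
        self.coef = max(-0.95, min(0.95, self.coef))
        # Grow the step after large codes, shrink it after small ones.
        factor = 1.6 if abs(code) > self.top // 2 else 0.9
        self.step = max(1.0, min(2048.0, self.step * factor))
        self.prev = recon
        return recon

def encode(samples, bits=4):
    st = ToyADPCM(bits)
    codes, recon = [], []
    for x in samples:
        d = x - st.predict()               # prediction error
        code = max(-st.top, min(st.top, round(d / st.step)))
        codes.append(code)
        recon.append(st.update(code))      # same update the decoder runs
    return codes, recon

def decode(codes, bits=4):
    st = ToyADPCM(bits)
    return [st.update(c) for c in codes]
```

Because `encode` and `decode` call the same `update` with the same codes, the decoder output equals the encoder's local reconstruction when no transmission errors occur.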

High quality coding with the G.722 wideband speech coder is provided by a fixed bit allocation, where the low and high sub-band ADPCM coders use a 6 bit/sample and a 2 bit/sample quantizer, respectively. In the low sub-band the signal resembles the narrow-band speech signal in most of its properties, and a high SNR in the lower band is perceptually more important than in the higher band. The dynamic range of the low-band signal quantizer was set to 51 dB and that of the high-band signal quantizer to 66 dB, mostly to accommodate music signals. To prevent the all-zero code from appearing even in the 4-bit data representation, only 15 quantizer levels are used. This also restricts the 5-bit and 6-bit data representations to 30 and 60 quantizer levels.

An advantage of a design that uses two equally wide sub-bands is that each component is subsampled to 8 kHz, so the total transmission rate can be reduced in 8 kbit/s steps by reducing the number of bits assigned to the samples of one or the other band.
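The rate arithmetic can be checked with a few lines. This is a sketch with hypothetical names (`SUBBAND_RATE_HZ`, `total_rate_bps`); it assumes the high band keeps its 2 bits per sample while the low band drops from 6 to 5 or 4 bits, which matches the three bit rates quoted for the codec.

```python
SUBBAND_RATE_HZ = 8000   # each sub-band is sampled at 8 kHz
HIGH_BITS = 2            # bits per high-band sample

def total_rate_bps(low_bits, high_bits=HIGH_BITS):
    """Total audio bit rate for a given per-sample bit allocation."""
    return SUBBAND_RATE_HZ * (low_bits + high_bits)

# Removing one bit per sample from a band lowers the rate by 8 kbit/s.
rates = {low_bits: total_rate_bps(low_bits) for low_bits in (6, 5, 4)}
```

Here `rates` maps 6, 5 and 4 low-band bits to 64000, 56000 and 48000 bit/s.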

Embedded encoding is used in the lower sub-band ADPCM coding, i.e. the adaptations of predictor and quantizer are always based only on the four most significant bits of each ADPCM codeword. Hence stripping off one or two least significant bits from the ADPCM codewords does not affect the adaptation processes and cannot lead to the mistracking effects otherwise caused by different decoding processes in transmitter and receiver. By stripping off one or two least significant bits, it is possible to support transmission of auxiliary data at rates of 8 and 16 kbps [5].
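The effect of bit stripping on the adaptation input can be illustrated as follows. This is a sketch under stated assumptions: the low-band codeword is treated as a plain 6-bit integer with the core bits at the top, and the helper names `core`, `strip_lsbs` and `insert_aux` are hypothetical, not taken from the standard.

```python
CORE_BITS = 4        # bits that drive predictor and quantizer adaptation
FULL_BITS = 6        # low-band codeword width in the 64 kbps mode
SUBBAND_RATE_HZ = 8000

def core(codeword, width):
    """Four most significant bits of a width-bit low-band codeword."""
    return codeword >> (width - CORE_BITS)

def strip_lsbs(codeword, n):
    """Drop the n least significant bits (n = 1 or 2)."""
    return codeword >> n

def insert_aux(codeword, aux, n):
    """Overwrite the n least significant bits with auxiliary data bits."""
    mask = (1 << n) - 1
    return (codeword & ~mask) | (aux & mask)

def aux_rate_bps(n):
    """Auxiliary data rate freed by giving up n bits per low-band sample."""
    return n * SUBBAND_RATE_HZ

cw = 0b101101
same_core = core(strip_lsbs(cw, 2), FULL_BITS - 2) == core(cw, FULL_BITS)
```

For any codeword, the core bits seen by the adaptation logic are unchanged after stripping or overwriting up to two least significant bits, and `aux_rate_bps` gives 8000 and 16000 bit/s for one and two bits.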


Esin Darici Haritaoglu
Wed Jun 18 22:26:24 EDT 1997