In each of the 27 lowest subbands used in the encoding, blocks of 12 decimated samples are formed and block companded, i.e., divided by a scale factor such that the sample of largest magnitude is unity. Each block corresponds to 12 * 32 = 384 input samples, i.e., 8 ms of audio at a sampling rate of 48 kHz. The choice of block length is governed by two conflicting requirements: longer blocks reduce the side-information bit rate, whereas pre-masking is effective only with short blocks. Each spectral component is quantized, with the number of quantizer levels for each component obtained from a dynamic bit allocation rule that is controlled by a psychoacoustic model. The model uses an FFT to compute the global masking threshold, and from it the signal-to-mask ratio (SMR), for each 12-sample block. The bit allocation algorithm then selects one uniform midtread quantizer out of a set of available quantizers such that both the bit rate requirement and the masking requirement are met. The quantized spectral subband components are then transmitted to the receiver together with the scale factor and bit allocation information. Note that the psychoacoustic model is needed only in the encoder, which makes the decoder less complex, a desirable feature for audio playback and audio broadcasting applications.
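The block companding and midtread quantization steps described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the procedure of the standard: the scale factor is taken as the exact block maximum rather than drawn from a table of permitted values, the quantizer is a plain symmetric uniform midtread quantizer with an odd number of levels, and all function names are hypothetical.

```python
import math

BLOCK_LEN = 12            # decimated samples per subband block (from the text)
DECIMATION = 32           # factor implied by 12 * 32 = 384 input samples
SAMPLE_RATE_HZ = 48_000   # sampling rate used in the text's example


def block_duration_ms(block_len=BLOCK_LEN, decimation=DECIMATION,
                      sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of the input audio covered by one block, in milliseconds."""
    return 1000.0 * block_len * decimation / sample_rate_hz


def compand_block(block):
    """Divide a block by its largest magnitude so the peak becomes unity.

    Assumption: the scale factor is the exact block maximum; a real coder
    would pick it from a fixed table of allowed values.
    """
    scale = max(abs(x) for x in block)
    if scale == 0.0:
        return 0.0, [0.0] * len(block)
    return scale, [x / scale for x in block]


def quantize_midtread(normalized, levels):
    """Uniform midtread quantizer on [-1, 1]; returns integer indices.

    An odd number of levels gives a reconstruction level at exactly zero.
    """
    if levels < 3 or levels % 2 == 0:
        raise ValueError("levels must be odd and at least 3")
    half = (levels - 1) // 2
    # floor(v + 0.5) rounds to the nearest index, with ties rounded up
    return [math.floor(x * half + 0.5) for x in normalized]


def dequantize(indices, levels, scale):
    """Decoder side: map indices back to amplitudes using the scale factor."""
    half = (levels - 1) // 2
    return [q / half * scale for q in indices]
```

With 15 levels, for example, the reconstruction error of each sample is bounded by `scale / 14`, half the quantizer step after rescaling. In a complete coder the number of levels per subband would come from the bit allocation driven by the psychoacoustic model, which this sketch does not attempt to model.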
Figure 9: Block structure of ISO/MPEG audio encoder and decoder, Layers I and II