Contributions to improved hard- and soft-decision decoding in speech and audio codecs
Source coding is an essential part of digital communications. In error-prone transmission conditions, bit errors may still occur even with the help of channel coding, which normally introduces delay. A single bit error can result in significant distortion, so a robust source decoder is desirable for adverse transmission conditions. Compared with traditional hard-decision (HD) decoding and error concealment, soft-decision (SD) decoding offers higher robustness by exploiting the residual redundancy of the source and the bit-wise channel reliability information. Moreover, the quantization codebook index can be mapped either to a fixed number of bits using fixed-length (FL) codes or to a variable number of bits using variable-length (VL) codes. The codebook entries can be either fixed over time or time-variant. However, a fixed scalar quantization codebook yields the same performance for correlated and uncorrelated processes.
This thesis aims to improve the performance of speech and audio codecs with FL and VL codes. It is divided into three main parts.
Firstly, the concept of FL/SD decoding is applied to the Adaptive Multi-Rate Narrowband (AMR-NB) and AMR Wideband (AMR-WB) speech codecs, which are widely used in mobile speech communications. In addition, new approaches exploiting both interframe and intraframe redundancy of the spectral envelope parameters in both codecs are proposed. The speech quality is significantly improved for both AMR-NB and AMR-WB.
Secondly, the links between the FL/SD and VL/SD decoding algorithms are derived, and the tradeoffs between the two SD decoding approaches are discussed. Both the FL/SD and VL/SD decoding methods are applied to High-Efficiency Advanced Audio Coding (HE-AAC), which is optimized for low-bit-rate applications such as mobile music streaming and digital radio broadcasting. Subjective listening tests confirm a considerable improvement in audio quality.
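As a rough illustration of the FL/SD decoding idea described above, the following Python sketch combines bit-wise reliability information with a priori index probabilities to form a minimum mean-square error estimate of a scalar parameter. It is a minimal sketch under stated assumptions and not the algorithm of the thesis: the function names (`index_bits`, `sd_decode`, `hd_decode`), the natural binary index mapping, the memoryless channel with a known per-bit error probability, and the use of zeroth-order a priori knowledge only (no interframe or intraframe dependencies) are all chosen here for illustration.

```python
def index_bits(index, num_bits):
    """Natural binary mapping of a codebook index (assumed for illustration)."""
    return [(index >> (num_bits - 1 - k)) & 1 for k in range(num_bits)]


def hd_decode(codebook, rx_bits):
    """Hard-decision decoding: trust the received bits and look up the entry."""
    index = 0
    for bit in rx_bits:
        index = (index << 1) | bit
    return codebook[index]


def sd_decode(codebook, prior, rx_bits, bit_error_prob):
    """Soft-decision decoding sketch (MMSE estimate).

    prior[i]          : a priori probability of codebook index i
    rx_bits[k]        : received hard bit k
    bit_error_prob[k] : assumed probability that bit k was flipped
    """
    num_bits = len(rx_bits)
    weights = []
    for i in range(len(codebook)):
        w = prior[i]
        for sent, received, p in zip(index_bits(i, num_bits), rx_bits, bit_error_prob):
            # Channel transition probability for this bit.
            w *= (1.0 - p) if sent == received else p
        weights.append(w)
    total = sum(weights)
    posterior = [w / total for w in weights]
    # The MMSE estimate is the posterior-weighted mean of the codebook entries.
    estimate = sum(p * c for p, c in zip(posterior, codebook))
    return estimate, posterior
```

With perfectly reliable bits (error probability 0) this estimate coincides with the hard-decision table lookup, while for completely unreliable bits (error probability 0.5) it falls back to the a priori mean of the codebook.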
Finally, a new decoding approach is proposed to improve the scalar quantization performance for correlated processes. By exploiting the source correlation with only a predictor at the receiver side, a time-variant quantization codebook can be generated. The proposed approach is beneficial in both error-free and error-prone transmission conditions, with either HD or SD decoding. It has also been applied to the G.726 and G.722 Adaptive Differential Pulse Code Modulation (ADPCM) speech codecs, which are used in cordless and IP telephony, where improved speech quality is observed.
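To make the notion of a receiver-side, time-variant codebook more concrete, the sketch below shows one simple way a predictor could move the reconstruction value inside each quantization cell. It is a hypothetical illustration and not the method proposed in the thesis: the function name `time_variant_decode`, the first-order predictor with coefficient `rho`, and the rule of clipping the prediction to the decoded cell are all assumptions made here.

```python
def time_variant_decode(indices, boundaries, rho):
    """Decode scalar quantizer indices with a prediction-dependent codebook.

    indices    : received cell indices, one per time instant
    boundaries : sorted cell edges; cell i spans boundaries[i]..boundaries[i + 1]
    rho        : assumed first-order prediction coefficient (illustrative)
    """
    previous = 0.0
    decoded = []
    for i in indices:
        lower, upper = boundaries[i], boundaries[i + 1]
        prediction = rho * previous
        # Reconstruct with the point of the decoded cell closest to the
        # prediction, so the effective codebook entry changes over time.
        value = min(max(prediction, lower), upper)
        decoded.append(value)
        previous = value
    return decoded
```

The transmitter is left untouched in this sketch: it still uses a fixed scalar quantizer, and only the receiver's mapping from index to reconstruction value depends on past decoded samples.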