High-definition telephony over heterogeneous networks
Today, the lion’s share of worldwide fixed and mobile telephone connections is still restricted to audio frequencies below 4 kHz, which produces the familiar sound character of “telephone speech.” Meanwhile, several coding standards for “High-Definition” (HD) telephony are available that offer significantly better audio quality and speech intelligibility. However, the costly and time-consuming modifications they require in the existing network equipment have turned out to be a major obstacle to their introduction. Consequently, a long transition period from today’s plain old telephony to future HD voice networks can be expected.

To account for this situation, this thesis investigates, evaluates, and compares concepts, methods, and algorithms that facilitate a major audio quality upgrade of existing speech communication systems while maintaining backwards compatibility with the installed infrastructure. The following principal scenarios are addressed:

• Bandwidth Extension for Embedded Speech and Audio Coding: Two new bandwidth extension (BWE) algorithms are discussed which have been developed in the context of recent ITU-T standardization projects for embedded speech and audio coding.

• Artificial Bandwidth Extension without Auxiliary Information: Additional audio frequencies can be estimated from the received, band-limited signal alone. A consistent quality improvement is obtained, but the quality does not reach the level of the embedded codec.

• Bandwidth Extension with Steganographic Parameter Transmission: Data hiding techniques are used to deliver the BWE information to the receiving terminal without altering the bitstream format of the legacy speech codec. The inaudibility of the hidden information is ensured by a joint source encoding and data hiding procedure. As a practically relevant application, this concept is applied to ACELP (Algebraic Code Excited Linear Prediction) codecs as used in GSM and UMTS mobile telephony. The key advantage of the proposed solution is its full backwards compatibility with the legacy codec standard, i.e., the existing network infrastructure can be kept and used without any modifications.
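The idea of joint source encoding and data hiding can be illustrated with a toy example. The sketch below is not the ACELP scheme developed in the thesis; it assumes a generic vector quantizer and a simple parity rule, and all function names are hypothetical. The encoder restricts its codebook search to indices whose parity equals the bit to be hidden, so the hidden bit is accounted for during quantization rather than being forced into the index afterwards. A legacy decoder uses the index as usual, while an upgraded decoder additionally reads the parity.

```python
import random


def distortion(target, codevector):
    """Squared error between a target vector and a codebook entry."""
    return sum((t - c) ** 2 for t, c in zip(target, codevector))


def encode_with_hidden_bit(target, codebook, bit):
    """Return the best index among those whose parity equals `bit`.

    Assumption: one hidden bit per index, carried by the index parity.
    """
    candidates = [i for i in range(len(codebook)) if i % 2 == bit]
    return min(candidates, key=lambda i: distortion(target, codebook[i]))


def legacy_decode(index, codebook):
    """A legacy decoder ignores the hidden data and decodes normally."""
    return codebook[index]


def extract_hidden_bit(index):
    """An upgraded decoder recovers the hidden bit from the index parity."""
    return index % 2


# Small demonstration with a random 16-entry codebook of 4-dimensional vectors.
rng = random.Random(0)
codebook = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(16)]
targets = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(5)]
hidden_bits = [1, 0, 1, 1, 0]

indices = [encode_with_hidden_bit(t, codebook, b)
           for t, b in zip(targets, hidden_bits)]
recovered = [extract_hidden_bit(i) for i in indices]
```

The bitstream format is unchanged because the transmitted value is still an ordinary codebook index. The price is a possibly larger quantization error, since only half of the codebook is searched for each frame.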