Model-based speech enhancement exploiting temporal and spectral dependencies
Mobile telephony has become an integral part of everyday life for billions of people around the world. The exchange of information via speech is now possible from almost any place at any time. However, even though the vision of permanent reachability and connectivity has by now been realized nearly worldwide, there is still room for improvement when it comes to the transmission of speech under noisy conditions. The performance of any speech communication system may deteriorate significantly when the speech signal is disturbed by ambient interference such as traffic noise or office noise, possibly leading to poor speech quality and intelligibility.
In this thesis, a novel model-based speech enhancement system is presented which performs single-channel noise reduction of degraded speech signals. In contrast to state-of-the-art noise suppression techniques, the developed algorithms explicitly exploit temporal and spectral dependencies of speech and noise signals. To account for the temporal correlation, a modified Kalman filter is derived in the frequency domain. As its main novelties, the proposed solution performs complex-valued prediction of speech and noise DFT coefficients and uses SNR-dependent MMSE estimators which are adapted to measured statistics of the input signal. In order to incorporate the spectral dependencies of speech signals, a new wideband speech enhancement system is presented which utilizes techniques known from artificial bandwidth extension. The developed method re-uses the processed and enhanced signal from lower frequencies to improve the results of a conventional noise suppression technique at higher frequencies. In addition, this work proposes effective countermeasures to reduce the occurrence of musical noise and provides a novel solution for the suppression of rapidly time-varying harmonic noise.
All speech enhancement techniques developed in this thesis are thoroughly evaluated by means of instrumental measurements and auditory judgments. It turns out that the proposed algorithms achieve distinctly better results than state-of-the-art approaches with respect to noise attenuation and speech distortion. The novel model-based system is not restricted to mobile phones. It can also be used to improve the speech quality of hands-free devices, conferencing systems or digital hearing aids.
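To make the idea of exploiting temporal correlation in the frequency domain more concrete, the following is a minimal sketch of a scalar complex-valued Kalman filter that tracks a single DFT bin across frames. It is not the modified Kalman filter derived in the thesis: it assumes a first-order complex prediction model for speech with a known coefficient, treats the noise as white across frames (the thesis also predicts the noise), and omits the SNR-dependent MMSE estimators. All function names and parameters (`kalman_track_bin`, `simulate_bin`, `a`, `q`, `r`) are hypothetical and chosen only for illustration.

```python
import random


def kalman_track_bin(noisy, a, q, r):
    """Track one DFT bin over frames with a scalar complex Kalman filter.

    noisy: list of noisy complex DFT coefficients, one per frame
    a:     complex first-order prediction coefficient (assumed known here)
    q:     variance of the speech prediction error
    r:     variance of the additive noise in this bin
    """
    estimates = []
    s_prev = 0j      # previous speech estimate
    p_prev = q       # previous estimation error variance
    for y in noisy:
        # Prediction step: propagate estimate and error variance.
        s_pred = a * s_prev
        p_pred = abs(a) ** 2 * p_prev + q
        # Update step: correct the prediction with the new observation.
        gain = p_pred / (p_pred + r)
        s_prev = s_pred + gain * (y - s_pred)
        p_prev = (1.0 - gain) * p_pred
        estimates.append(s_prev)
    return estimates


def complex_gauss(rng, variance):
    """Draw a circular complex Gaussian sample with the given variance."""
    sigma = (variance / 2.0) ** 0.5
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def simulate_bin(n_frames, a, q, r, seed=0):
    """Generate a synthetic clean/noisy pair following the assumed model."""
    rng = random.Random(seed)
    clean, noisy = [], []
    s = 0j
    for _ in range(n_frames):
        s = a * s + complex_gauss(rng, q)
        clean.append(s)
        noisy.append(s + complex_gauss(rng, r))
    return clean, noisy


def mse(x, ref):
    """Mean squared error between two equally long complex sequences."""
    return sum(abs(u - v) ** 2 for u, v in zip(x, ref)) / len(ref)


if __name__ == "__main__":
    a = complex(0.9, 0.2)  # illustrative value, not taken from the thesis
    clean, noisy = simulate_bin(2000, a, q=0.1, r=1.0, seed=1)
    enhanced = kalman_track_bin(noisy, a, q=0.1, r=1.0)
    print("noisy MSE:   ", mse(noisy, clean))
    print("enhanced MSE:", mse(enhanced, clean))
```

On synthetic data that follows the assumed model, the filtered coefficients should lie closer to the clean ones than the noisy input does. In a real system the prediction coefficient and the variances would have to be estimated from the signal, which this sketch leaves out.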