Contributions to statistical modeling for minimum mean square error estimation in speech enhancement
More about the book
This doctoral dissertation addresses minimum mean square error (MMSE) speech enhancement in the short-time Fourier transform (STFT) domain, with a focus on statistical models for speech and the corresponding estimators. The thesis starts with an overview of state-of-the-art MMSE speech enhancement approaches. Next, considering MMSE speech enhancement under speech presence uncertainty (SPU), a new a posteriori speech presence probability (SPP) estimator is derived from a novel statistical model for speech, followed by a synopsis of approaches to consistent MMSE estimation under SPU. In the context of recursive MMSE estimation, links between speech enhancement and error concealment are discussed. Furthermore, the thesis provides a new statistical framework for recursive MMSE speech enhancement, which allows the improved statistical models from classical, non-recursive speech enhancement to be applied to the recursive case. As a specific enhancement scheme, recursive MMSE estimation is extended to take SPU into account. Finally, a new reference-free signal-to-noise ratio (SNR) measurement approach is proposed. It aims to estimate the SNR of a speech signal distorted by car noise as closely as possible to the reference-based measurement approach according to ITU-T Recommendation P.56, but without requiring a reference signal.
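To make the idea of MMSE estimation under SPU concrete, the following minimal sketch may help. It is not the estimator derived in the thesis: it assumes the classical complex Gaussian model for speech and noise in a single STFT bin, a Wiener gain as the estimator under speech presence, and zero speech under speech absence. The function names, the default prior `prior_h1`, and the example values are illustrative assumptions only.

```python
import math


def wiener_gain(xi):
    """Wiener gain for an a priori SNR xi (linear scale, not dB)."""
    return xi / (1.0 + xi)


def speech_presence_probability(gamma, xi, prior_h1=0.5):
    """A posteriori SPP under the classical complex Gaussian model (assumed here).

    gamma    -- a posteriori SNR, |Y|^2 divided by the noise variance
    xi       -- a priori SNR assumed under speech presence
    prior_h1 -- a priori probability of speech presence (illustrative default)
    """
    prior_ratio = (1.0 - prior_h1) / prior_h1
    # Likelihood ratio p(Y | speech absent) / p(Y | speech present)
    # for zero-mean complex Gaussian speech and noise coefficients.
    absence_over_presence = (1.0 + xi) * math.exp(-gamma * xi / (1.0 + xi))
    return 1.0 / (1.0 + prior_ratio * absence_over_presence)


def enhance_bin(y, noise_var, xi, prior_h1=0.5):
    """MMSE estimate of the clean coefficient under SPU for one STFT bin.

    The estimate under speech presence (Wiener gain times y) is weighted
    by the SPP; the estimate under speech absence is zero.
    """
    gamma = abs(y) ** 2 / noise_var
    spp = speech_presence_probability(gamma, xi, prior_h1)
    return spp * wiener_gain(xi) * y


if __name__ == "__main__":
    # Hypothetical noisy coefficient, noise variance, and a priori SNR.
    print(enhance_bin(2.0 + 0.0j, noise_var=1.0, xi=1.0))
```

In this sketch the SPP acts as a soft weight: bins with a low a posteriori SNR are attenuated more strongly than the Wiener gain alone would attenuate them. A different statistical model for speech, such as the one proposed in the thesis, would change the likelihood ratio and therefore the SPP.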