A microscopic model of speech recognition for listeners with normal and impaired hearing
Degraded speech intelligibility is one of the most frequent complaints of listeners with sensorineural hearing impairment, both in noisy and in quiet situations. Understanding the effect of hearing impairment on speech intelligibility is therefore of great interest, particularly for developing new hearing-aid algorithms for rehabilitation. However, sensorineural hearing impairment is often found to be highly individual in terms of the functional deficits of the inner ear and the entire auditory system. Important individual factors to be considered when modeling the effect of sensorineural hearing impairment on speech intelligibility include the audibility of the speech signal, different compressive properties, and different active processes in the inner ear. The latter two can be termed supra-threshold factors, since they affect the processing of speech well above the individual absolute threshold. For ethical reasons, it is not possible to measure and study the influence of these supra-threshold factors on human speech recognition (HSR) directly (i.e., invasively). However, computer models of HSR can provide insight into how these factors may influence speech recognition performance. This dissertation presents a microscopic model of human speech recognition. It is microscopic in the sense that, first, the recognition of single phonemes rather than the recognition of whole sentences is modeled. Second, the particular spectro-temporal structure of speech is processed in a way that is presumably very similar to the processing that takes place in the human auditory system. This contrasts with other models of HSR, which usually use the spectral structure only. The microscopic model is capable of predicting phoneme recognition in normal-hearing listeners in noise, along with important aspects of consonant recognition in normal-hearing and hearing-impaired listeners in quiet.
Furthermore, an extension of this model for the prediction of word recognition rates in whole German sentences is capable of predicting speech reception thresholds of normal-hearing and hearing-impaired listeners as accurately as a standard speech intelligibility model. Parameters reflecting supra-threshold auditory processing are assessed in normal-hearing and hearing-impaired listeners using indirect psychoacoustical measurement techniques such as a forward masking experiment and categorical loudness scaling. Finally, the influence of including supra-threshold auditory processing deficits (assessed using the aforementioned measurement techniques) in modeling speech recognition is investigated; these deficits are primarily realized as a loss of cochlear compression. The results show that implementing supra-threshold processing deficits (as found in hearing-impaired listeners) in a microscopic model of human speech recognition improves prediction accuracy. However, the advantage of taking these additional supra-threshold processing parameters into account is marginal in comparison to predicting speech intelligibility directly from audiometric data.