Modelling speech intelligibility in adverse conditions.

Søren Jørgensen, Torsten Dau

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

Abstract

Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of using the reduction of temporal modulation energy as the intelligibility metric, as the STI does, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be key to predicting the intelligibility of reverberant speech as well as of noisy speech processed by spectral subtraction. The key role of the SNRenv metric is further supported here by the ability of a short-term version of the sEPSM to predict speech masking release for different speech materials and modulated interferers. However, the sEPSM cannot account for speech subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation index (STMI) (Elhilali et al., Speech Commun 41:331-348, 2003), which assumes an explicit analysis of the spectral "ripple" structure of the speech signal. However, since the STMI applies the same decision metric as the STI, it fails to account for spectral subtraction. The results from this study suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is "realized" in the auditory system remains unresolved.
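
To make the idea of a signal-to-noise ratio in the envelope domain concrete, the following is a minimal sketch and not the sEPSM itself. It assumes that envelope power is the AC power of a temporal envelope normalized by its squared mean, and that the speech envelope power can be estimated by subtracting the noise envelope power from that of the noisy speech. The function names, the `floor` parameter, and the synthetic sinusoidal envelopes are illustrative assumptions; the full model additionally uses auditory and modulation filterbanks, which are omitted here.

```python
import numpy as np


def envelope_power(envelope):
    """Normalized AC power of a temporal envelope: variance / mean**2."""
    envelope = np.asarray(envelope, dtype=float)
    dc = envelope.mean()
    return envelope.var() / dc ** 2


def snr_env(env_noisy_speech, env_noise, floor=1e-3):
    """Envelope-domain SNR from two envelopes (simplified, broadband).

    `floor` is a hypothetical lower limit that keeps both powers positive;
    its value is an assumption made for this sketch.
    """
    p_mix = envelope_power(env_noisy_speech)
    p_noise = max(envelope_power(env_noise), floor)
    # Estimate the speech envelope power as the excess over the noise alone.
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / p_noise


# Synthetic envelopes (assumed for illustration): one second at 1 kHz.
t = np.arange(1000) / 1000.0
env_noise = 1.0 + 0.1 * np.sin(2 * np.pi * 8 * t)  # shallow fluctuations
env_mix = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)    # deep, speech-like fluctuations

print(10 * np.log10(snr_env(env_mix, env_noise)), "dB")
```

For these synthetic envelopes the normalized powers are 0.125 and 0.005, so the ratio is about 24. A purely energy-based metric would not distinguish a processing scheme that raises the noise envelope power along with the speech envelope power, which is the kind of case the envelope-domain ratio is meant to capture.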
Original language: English
Title of host publication: Basic Aspects of Hearing: Advances in Experimental Medicine and Biology
Volume: 787
Publisher: Springer
Publication date: 2013
Pages: 343-351
Publication status: Published - 2013
Series: Advances in Experimental Medicine and Biology
ISSN: 0065-2598
