Abstract
The speech-based envelope power spectrum model (sEPSM) presented by Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] estimates the envelope signal-to-noise ratio (SNRenv) after modulation-frequency selective processing. This approach accurately predicts the speech intelligibility for normal-hearing listeners in conditions with additive stationary noise, reverberation, and nonlinear processing with spectral subtraction. The latter condition represents a case in which the standardized speech intelligibility index and the speech transmission index fail. However, the sEPSM is limited to conditions with stationary interferers due to the long-term estimation of the envelope power and cannot account for the well-known phenomenon of speech masking release. Here, a short-term version of the sEPSM is described [Jørgensen and Dau, 2012, in preparation], which estimates the SNRenv in short temporal segments. Predictions obtained with the short-term sEPSM are compared to data from Kjems et al. [(2009). J. Acoust. Soc. Am. 126 (3), 1415-1426] where speech is mixed with four different interferers, including speech-shaped noise, bottle noise, car noise, and a highly non-stationary cafe noise. The model accounts well for the differences in intelligibility observed for the stationary and non-stationary interferers, demonstrating further that the SNRenv is crucial for speech comprehension.
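As a rough illustration of the short-term idea described above, the following Python sketch computes an envelope SNR per temporal segment. It is not the authors' implementation. It assumes the envelopes of the noisy mixture and of the noise alone are already available for one modulation channel, so the auditory filterbank, envelope extraction, and modulation-frequency selective filtering are omitted. It also assumes the SNRenv takes the form (P_mix − P_noise) / P_noise, with envelope power normalized by the squared mean envelope. The function names, the segment length, and the power floor are hypothetical choices made for this example.

```python
import math


def envelope_power(env):
    """Normalized AC power of an envelope segment: variance / mean^2."""
    mean = sum(env) / len(env)
    if mean == 0:
        return 0.0
    var = sum((x - mean) ** 2 for x in env) / len(env)
    return var / (mean ** 2)


def short_term_snr_env(mix_env, noise_env, seg_len, floor=1e-3):
    """Envelope SNR in consecutive, non-overlapping segments.

    mix_env   : envelope samples of speech + noise (assumed given)
    noise_env : envelope samples of the noise alone (assumed given)
    seg_len   : segment length in samples (illustrative choice)
    floor     : lower limit on envelope power (illustrative choice)
    """
    n = min(len(mix_env), len(noise_env))
    snr_values = []
    for start in range(0, n - seg_len + 1, seg_len):
        p_mix = max(envelope_power(mix_env[start:start + seg_len]), floor)
        p_noise = max(envelope_power(noise_env[start:start + seg_len]), floor)
        # Assumed form: speech envelope power is the excess of the
        # mixture's envelope power over that of the noise alone.
        p_speech = max(p_mix - p_noise, floor)
        snr_values.append(p_speech / p_noise)
    return snr_values


# Synthetic example: 2 s of envelope sampled at 100 Hz with a 4 Hz modulation.
fs = 100
mix = [1 + 0.5 * math.sin(2 * math.pi * 4 * i / fs) for i in range(2 * fs)]
noise = [1 + 0.1 * math.sin(2 * math.pi * 4 * i / fs) for i in range(2 * fs)]
print(short_term_snr_env(mix, noise, seg_len=fs))
```

With a fluctuating interferer, the per-segment values would vary over time, so segments falling in dips of the masker contribute a higher envelope SNR than a single long-term estimate would show.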
Original language | English |
---|---|
Title of host publication | Proceedings of Acoustics 2012 |
Number of pages | 6 |
Publication date | 2012 |
Publication status | Published - 2012 |
Event | Acoustics 2012 Hong Kong - Hong Kong Convention and Exhibition, Hong Kong, Hong Kong. Duration: 13 May 2012 → 18 May 2012 |
Conference
Conference | Acoustics 2012 Hong Kong |
---|---|
Location | Hong Kong Convention and Exhibition |
Country/Territory | Hong Kong |
City | Hong Kong |
Period | 13/05/2012 → 18/05/2012 |