
A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a structure similar to that of the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], which was developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The model was further tested in conditions with noisy speech subjected to reverberation and spectral subtraction. Good agreement between predictions and data was found in all cases. For spectral subtraction, an analysis of the model's internal representation of the stimuli revealed that the predicted decrease of intelligibility was caused by the estimated noise envelope power exceeding that of the speech. The classical concept of the speech transmission index fails in this condition. The results strongly suggest that the signal-to-noise ratio at the output of a modulation-frequency-selective process provides a key measure of speech intelligibility. © 2011 Acoustical Society of America.
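
As a rough illustration of the speech-to-noise envelope power ratio described in the abstract, the sketch below computes the envelope power in one modulation band for a noisy-speech envelope and a noise-alone envelope, then forms their ratio. It is not the published implementation: the function names, the rectangular 2-8 Hz band, the `floor` value, and the assumption that the speech envelope power can be estimated as the mixture power minus the noise power are all illustrative choices made here.

```python
import math

def envelope_power(env, fs, f_lo, f_hi):
    """AC power of an envelope in [f_lo, f_hi) Hz, normalized by its squared mean.

    Uses a direct DFT over the bins inside the band (a rectangular band is an
    assumption of this sketch, standing in for one modulation filter).
    """
    n = len(env)
    mean = sum(env) / n
    power = 0.0
    for k in range(1, n // 2):
        f = k * fs / n
        if f_lo <= f < f_hi:
            re = sum(e * math.cos(2 * math.pi * k * i / n) for i, e in enumerate(env))
            im = sum(e * math.sin(2 * math.pi * k * i / n) for i, e in enumerate(env))
            # One-sided spectrum: factor 2 accounts for the negative-frequency bin.
            power += 2.0 * (re * re + im * im) / (n * n)
    return power / (mean * mean)

def snr_env(p_mix, p_noise, floor=1e-3):
    """Envelope power ratio, assuming speech power = mixture power - noise power.

    `floor` is a hypothetical lower limit that keeps the ratio finite and positive.
    """
    p_speech = max(p_mix - p_noise, floor)
    return p_speech / max(p_noise, floor)

# Synthetic envelopes: 1 s at 100 Hz sampling, 4 Hz modulation.
fs, n = 100, 100
mix_env = [1 + 0.8 * math.cos(2 * math.pi * 4 * i / fs) for i in range(n)]
noise_env = [1 + 0.2 * math.cos(2 * math.pi * 4 * i / fs) for i in range(n)]

p_mix = envelope_power(mix_env, fs, 2, 8)
p_noise = envelope_power(noise_env, fs, 2, 8)
ratio = snr_env(p_mix, p_noise)
ratio_db = 10 * math.log10(ratio)
```

In this sketch, a processing step that raises the noise envelope power relative to the speech envelope power lowers the ratio, which is the mechanism the abstract reports for spectral subtraction.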
Original language: English
Journal: Journal of the Acoustical Society of America
Publication date: 2011
Volume: 130
Issue: 3
Pages: 1475-1487
ISSN: 0001-4966
State: Published
Citations: Web of Science® Times Cited: 19

Keywords

  • Speech intelligibility, Reverberation, Acoustic noise, Voice communication, Acoustic signal processing

ID: 5790081