The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction

Alexandre Chabot-Leclerc, Søren Jørgensen, Torsten Dau

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated by comparing predictions from models based on the signal-to-noise envelope power ratio, SNRenv, and the modulation transfer function, MTF. The models were evaluated in conditions of noisy speech (1) subjected to reverberation, (2) distorted by phase jitter, or (3) processed by noise reduction via spectral subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined with a measure of across (audio) frequency variability at the output of the auditory preprocessing. A complex spectro-temporal modulation filterbank might therefore not be required for speech intelligibility prediction.

Original language: English
Journal: Journal of the Acoustical Society of America
Volume: 135
Issue number: 6
Pages (from-to): 3502–3512
ISSN: 0001-4966
Publication status: Published - 2014
