Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility

Søren Jørgensen, Remi Julien Blaise Decorsière, Torsten Dau

    Research output: Contribution to journal > Journal article > Research > peer-review

    Abstract

    Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475–1487] suggested a metric for speech intelligibility prediction based on the signal-to-noise envelope power ratio (SNRenv), calculated at the output of a modulation-frequency selective process. In the framework of the speech-based envelope power spectrum model (sEPSM), the SNRenv was demonstrated to account for speech intelligibility data in various conditions with linearly and nonlinearly processed noisy speech, as well as for conditions with stationary and fluctuating interferers. Here, the relation between the SNRenv and speech intelligibility was investigated further by systematically varying the modulation power of either the speech or the noise before mixing the two components, while keeping the overall power ratio of the two components constant. A good correspondence between the data and the corresponding sEPSM predictions was obtained when the noise was manipulated and mixed with the unprocessed speech, consistent with the hypothesis that SNRenv is indicative of speech intelligibility. However, discrepancies between data and predictions occurred for conditions where the speech was manipulated and the noise left untouched. In these conditions, distortions introduced by the applied modulation processing were detrimental for speech intelligibility, but not reflected in the SNRenv metric, thus representing a limitation of the modeling framework.
    Original language: English
    Journal: Journal of the Acoustical Society of America
    Volume: 137
    Issue number: 3
    Pages (from-to): 1401–1410
    ISSN: 0001-4966
    Publication status: Published - 2015
