A correlation metric in the envelope power spectrum domain for speech intelligibility prediction

  • Helia Relano Iborra (Guest lecturer)

    Activity: Talks and presentations > Conference presentations


    A speech intelligibility model, named sEPSMcorr, is presented. It combines the modulation-frequency-selective processing of the (multi-resolution) speech-based envelope power spectrum model (mr-sEPSM; Jørgensen et al., 2013) with a cross-correlation-based back end inspired by the short-time objective intelligibility measure (STOI; Taal et al., 2011). The model can accurately predict data obtained with normal-hearing (NH) listeners for a broad range of listening conditions, including effects of stationary and fluctuating additive interferers as well as effects of non-linear distortions, such as spectral subtraction, phase jitter and ideal binary mask (IBM) processing. The model has greater predictive power than both the original mr-sEPSM (which fails in the phase-jitter and IBM conditions) and STOI (which fails to predict the influence of fluctuating interferers).
    However, the sEPSMcorr preprocessing does not provide a flexible framework for predicting individual speech intelligibility data from hearing-impaired listeners. Thus, the back end of the sEPSMcorr was combined with a more realistic auditory preprocessing front end adopted from the computational auditory signal processing and perception model (CASP; Jepsen et al., 2008). This preprocessing consists of outer- and middle-ear filtering and a non-linear auditory filterbank (DRNL; López-Poveda and Meddis, 2001), followed by inner hair-cell transduction, adaptation and a modulation filterbank.
    The predictions of the sEPSM-based and the CASP-based models were compared against data measured with NH listeners in conditions of additive masking noise, phase-jitter distortions, reverberation and noise-reduction algorithms. The effects of the back end and of the different preprocessing stages on the predicted results were analyzed. The resulting modelling framework could be useful for the design and evaluation of, e.g., speech transmission algorithms or hearing-instrument algorithms.
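    The general idea behind a cross-correlation-based back end can be illustrated with a minimal sketch. This is not the sEPSMcorr or STOI implementation: the function names, frame length and hop size below are hypothetical, and the inputs are assumed to be envelope samples from a single audio and modulation channel that have already passed through some front end. The sketch computes the Pearson correlation between clean and degraded envelopes in short frames and averages the result.

```python
import math


def frame_correlation(x, y):
    """Pearson correlation coefficient between two equal-length frames."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A constant frame has no defined correlation; return 0.0 by convention here.
    return numerator / denominator if denominator > 0 else 0.0


def mean_short_time_correlation(clean_env, degraded_env, frame_len=100, hop=50):
    """Average frame-wise correlation between clean and degraded envelopes.

    frame_len and hop are in samples and are illustrative values only.
    """
    if len(clean_env) != len(degraded_env):
        raise ValueError("envelopes must have the same length")
    if len(clean_env) < frame_len:
        raise ValueError("envelopes are shorter than one frame")
    starts = range(0, len(clean_env) - frame_len + 1, hop)
    values = [
        frame_correlation(clean_env[s:s + frame_len], degraded_env[s:s + frame_len])
        for s in starts
    ]
    return sum(values) / len(values)


# Usage: an envelope compared with itself gives a value of approximately 1.0.
clean = [math.sin(0.1 * i) ** 2 for i in range(400)]
score_identical = mean_short_time_correlation(clean, clean)
```

    In a complete model, such a value would be computed per channel, combined across channels, and mapped to percent correct through a fitted transformation; those stages are outside the scope of this sketch.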
    Event title: ARCHES/ICANHEAR 2016: Audiological Research Cores in Europe (ARCHES) meeting and Improved Communication through Applied Hearing Research (ICanHear) conference
    Event type: Conference
    Location: Zurich, Switzerland