Sources of Variability in Consonant Perception and Implications for Speech Perception Modeling

Johannes Zaar, Torsten Dau

    Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

    Abstract

    The present study investigated the influence of various sources of response variability in consonant perception. A distinction was made between source-induced variability and receiver-related variability. The former refers to perceptual differences induced by differences in the speech tokens and/or the masking noise tokens; the latter describes perceptual differences caused by within- and across-listener uncertainty. Consonant-vowel combinations (CVs) were presented to normal-hearing listeners in white noise at six different signal-to-noise ratios. The obtained responses were analyzed with respect to the considered sources of variability using a measure of the perceptual distance between responses. The largest effect was found across different CVs. For stimuli of the same phonetic identity, the speech-induced variability across and within talkers and the across-listener variability were substantial and of similar magnitude. Even time-shifts in the waveforms of white masking noise produced a significant effect, which was well above the within-listener variability (the smallest effect). Two auditory-inspired models in combination with a template-matching back end were considered to predict the perceptual data. In particular, an energy-based and a modulation-based approach were compared. The suitability of the two models was evaluated with respect to the source-induced perceptual distance and in terms of consonant recognition rates and consonant confusions. Both models captured the source-induced perceptual distance remarkably well. However, the modulation-based approach showed a better agreement with the data in terms of consonant recognition and confusions. The results indicate that low-frequency modulations up to 16 Hz play a crucial role in consonant perception.
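
    The abstract does not define the perceptual distance measure. Purely as an illustration, one way to compare two response patterns is the angle between their response-count vectors, normalized so that identical response proportions give 0 and completely non-overlapping responses give 1. The function name, the normalization, and the example counts below are assumptions made for this sketch and are not taken from the chapter.

```python
import math


def response_distance(counts_a, counts_b):
    """Normalized angular distance between two response-count vectors.

    Hypothetical helper: each vector holds how often each response
    alternative was chosen for one stimulus condition. Counts are
    non-negative, so the angle lies in [0, pi/2] and the result in [0, 1].
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("response vectors must have the same length")
    if any(c < 0 for c in list(counts_a) + list(counts_b)):
        raise ValueError("response counts must be non-negative")

    dot = sum(a * b for a, b in zip(counts_a, counts_b))
    norm_a = math.sqrt(sum(a * a for a in counts_a))
    norm_b = math.sqrt(sum(b * b for b in counts_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("response vectors must contain at least one response")

    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = min(1.0, max(-1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine) / (math.pi / 2)


# Made-up response counts over three alternatives for two conditions.
same_pattern = response_distance([8, 2, 0], [4, 1, 0])   # same proportions
disjoint = response_distance([10, 0, 0], [0, 0, 10])     # no shared responses
```

    Under this assumed definition, only the proportions of responses matter: scaling all counts in one condition leaves the distance unchanged.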
    Original language: English
    Title of host publication: Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing
    Editors: P. van Dijk, D. Başkent, E. Gaudrain, E. de Kleine, A. Wagner, C. Lanting
    Publisher: Springer
    Publication date: 2016
    Pages: 437-446
    ISBN (Print): 978-3-319-25472-2
    DOI: 10.1007/978-3-319-25474-6_46
    Publication status: Published - 2016
    Series: Advances in Experimental Medicine and Biology
    Volume: 894
    ISSN: 0065-2598

    Bibliographical note

    © The Author(s) 2016.
    P. van Dijk et al. (eds.), Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing, Advances in Experimental Medicine and Biology 894, DOI 10.1007/978-3-319-25474-6_46
