Publication: Research - peer-review › Article in proceedings – Annual report year: 2009
The spectro-temporal coding of Danish consonants was investigated using an information-theoretic approach. Listeners were asked to identify eleven different consonants spoken in a CV[l] syllable context (where C refers to the initial consonant, V refers to one of three vowels, [I, a, u], and [l] refers to the syllable-final liquid segment). Each syllable was processed so that only a portion of the original audio spectrum was present. Narrow (three-quarter octave) bands of speech, with center frequencies of 750 Hz, 1500 Hz and 3000 Hz, were presented individually and in combination with each other. The modulation spectrum of each band was low-pass filtered at 24, 12, 6 and 3 Hz. Confusion matrices of the consonant-identification data were computed, and from these the amount of information transmitted for each of three phonetic feature dimensions – voicing, manner and place of articulation – was calculated for each condition. This form of analysis provides a simple means of determining whether information associated with each phonetic feature dimension combines linearly across the audio spectrum and, if it does not, offers a method for characterizing the (non-linear) nature of information integration. In addition, the analysis provides a means to associate specific portions of the modulation spectrum with phonetic feature properties. Such analyses indicate that:
(1) Accurate, robust decoding of place-of-articulation information requires broadband cross-spectral integration.
(2) Place-of-articulation information is associated most closely with the modulation spectrum above 6 Hz, with the most significant contribution coming from the region above 12 Hz.
(3) Place-of-articulation information is crucial for accurate consonant recognition. Hence, consonant decoding requires cross-spectral integration of the modulation spectrum above 8 Hz.
(4) Voicing is mainly associated with the modulation spectrum between 3 and 6 Hz (with a smaller contribution made by the region above 12 Hz).
(5) Manner of articulation is most closely associated with the portion of the modulation spectrum above 12 Hz.
This form of information-theoretic analysis can be used to delineate those parts of the speech signal of greatest importance for encoding phonetic features associated with intelligibility and speech understanding.
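The abstract does not spell out how the transmitted information was computed from the confusion matrices. As a rough illustration only, the sketch below assumes the standard mutual-information calculation between stimulus and response, applied after pooling consonants that share a feature value. The function names, the four-consonant inventory and all counts are hypothetical and are not taken from the study.

```python
import math

def transmitted_information(confusions):
    """Mutual information in bits between stimulus (rows) and response (columns).

    Assumption: this is the usual confusion-matrix estimate, not necessarily
    the exact procedure used in the study.
    """
    total = sum(sum(row) for row in confusions)
    row_sums = [sum(row) for row in confusions]
    col_sums = [sum(col) for col in zip(*confusions)]
    info = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count > 0:
                # p(i,j) * log2( p(i,j) / (p(i) * p(j)) )
                ratio = count * total / (row_sums[i] * col_sums[j])
                info += (count / total) * math.log2(ratio)
    return info

def collapse_by_feature(confusions, labels, feature_of):
    """Pool a consonant confusion matrix into a feature-level confusion matrix."""
    values = sorted(set(feature_of[c] for c in labels))
    index = {v: k for k, v in enumerate(values)}
    pooled = [[0] * len(values) for _ in values]
    for i, stim in enumerate(labels):
        for j, resp in enumerate(labels):
            pooled[index[feature_of[stim]]][index[feature_of[resp]]] += confusions[i][j]
    return pooled

# Hypothetical toy data: four consonants, ten presentations each.
labels = ["p", "b", "t", "d"]
counts = [
    [8, 1, 1, 0],  # stimulus p
    [1, 8, 0, 1],  # stimulus b
    [1, 0, 8, 1],  # stimulus t
    [0, 1, 1, 8],  # stimulus d
]
voicing = {"p": "voiceless", "b": "voiced", "t": "voiceless", "d": "voiced"}

voicing_matrix = collapse_by_feature(counts, labels, voicing)
voicing_bits = transmitted_information(voicing_matrix)
print(voicing_matrix, round(voicing_bits, 3))
```

Running the same two functions with a place or manner mapping instead of `voicing` would give the corresponding feature scores for one listening condition. Comparing the scores of single-band and multi-band conditions is then one way to ask whether a feature's information combines linearly across bands.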
|Title of host publication||Linguistic Theory and Raw Sound|
|Number of pages||260|
|State||Published - 2009|
|Conference||Linguistic Theory and Raw Sound|
|Period||01/01/2009 → …|
|Name||Copenhagen Studies in Language|