Consonant-Order Reversals in the McGurk Combination Illusion

Juan Camilo Gil Carvajal, Torsten Dau, Tobias Andersen

Humans can integrate auditory and visual information when perceiving speech. This is
evident in the McGurk effect, in which, for example, auditory /aba/ paired with visual /aga/
leads to the audiovisually fused percept /ada/. With the pairing of auditory /aga/ and visual
/aba/, however, the illusion takes the form of a combination percept of either /abga/ or /agba/.
Here, we investigated how audiovisual timing influences the perceived order of the
consonants in the McGurk combination. Stimuli were recorded with the consonants /g/ and
/b/ using vowel-consonant-vowel (VCV) utterances with two syllabic contexts. First, the
“internal timing” was studied by articulating the consonant to emphasize either the closing
phase (VC-V) or the opening phase (V-CV). This produced cross-modally asynchronous
consonants while maintaining synchrony of the vowels. Auditory /ag_a/ dubbed onto visual
/a_ba/ was mostly heard as /agba/ whereas auditory /a_ga/ dubbed onto visual /ab_a/ was
mostly heard as /abga/. Hence, syllabic context largely determined the perceived consonant
order. Second, the effect of audiovisual stimulus onset asynchrony (SOA) was examined at
five different SOAs, ranging from 200 ms auditory lead to 200 ms visual lead. SOA had no
effect on the perceived consonant order but did influence the strength of the illusion.
Furthermore, we found that the window of integration is highly
asymmetric for combination illusions and that the direction of the asymmetry depends on
the perceived consonant order. We interpret the results as indicative of feature-based
audiovisual integration where formant transitions and aspirations are integrated separately.
