When attending to a speech source in acoustic environments with many talkers, low-frequency activity in auditory cortex is known to be selectively synchronized with slow amplitude fluctuations in the attended speech signal. In everyday communication, a listener can typically also see the face of the attended talker, but it remains unclear how attention-driven speech processing is influenced by visual information. Here, we investigated the impact of visual information on a closed-loop system that decodes the attended talker from scalp EEG and then amplifies the acoustic speech signal of that talker. To decode attention in real time from scalp EEG, we used canonical correlation analysis (CCA) to relate multichannel EEG to a model of the audio-visual (AV) speech stimulus. First, we investigated a model of the temporal envelope of the acoustic speech signals passed through a modulation filtering stage mimicking the auditory midbrain. We found higher attention decoding accuracy and faster attention switching of the closed-loop system for listeners trained with audio-visual speech, compared to listeners presented with audio alone. We also observed an earlier response to the acoustic envelope with audio-visual speech than with audio-only speech. Next, we found that the attended talker could be decoded from a CCA model of visual features alone, using a measure of optical flow. Finally, combining audio and visual features in a CCA model improved accuracy further compared to models based on either auditory or visual features alone.
Publication status: Published - 2019
Event: Bernstein Conference 2019, Berlin, Germany
Duration: 17 Sep 2019 → 20 Sep 2019