Audio-visual scene analysis in reverberant multi-talker environments

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


Abstract

Normal-hearing subjects are accurate in localizing sound sources even in reverberant multi-talker environments (e.g., Kopčo, 2010; Weller, 2016). Weller et al. (2016) showed that subjects can accurately analyse reverberant multi-talker scenes with up to four simultaneous talkers. While multi-talker scene analysis has mainly been investigated with only auditory information, the addition of visual information might influence the subjects’ perception. To investigate the visual influence, audio-visual scenes with a varying number of talkers and degrees of reverberation were considered in the present study. The acoustic information was provided using a spherical loudspeaker array and the visual information was provided using head-tracked virtual reality glasses. The visual information represented various possible talker locations and the subjects were asked to identify the number of talkers and their specific locations. For the identification of talkers, subjects had to label visual locations with headlines from the talker’s speech topic. It was hypothesized that the addition of visual information improves subjects’ ability to analyse complex auditory scenes, while the amount of reverberation impairs the overall performance.

Original language: English
Title of host publication: Proceedings of the 23rd International Congress on Acoustics
Publisher: Deutsche Gesellschaft für Akustik e.V.
Publication date: 2019
Pages: 3890-3896
ISBN (Print): 978-3-939296-15-7
Publication status: Published - 2019
Event: 23rd International Congress on Acoustics, Eurogress, Aachen, Germany
Duration: 9 Sep 2019 – 13 Sep 2019
http://www.ica2019.org/


Keywords

  • Auditory Scene Analysis
  • Speech Perception
  • Virtual Reality

Cite this

Ahrens, A., Lund, K. D., & Dau, T. (2019). Audio-visual scene analysis in reverberant multi-talker environments. In Proceedings of the 23rd International Congress on Acoustics (pp. 3890-3896). Deutsche Gesellschaft für Akustik e.V.