Characterizing auditory and audio-visual perception in virtual environments

Research output: Book/Report › Ph.D. thesis › Research


Abstract

One of the challenges in hearing research is to explain the human ability to understand speech in complex, noisy environments, commonly referred to as a cocktail-party scenario. To gain a better understanding of how the auditory system performs in complex acoustic environments, one approach is to reproduce such listening situations in the laboratory. By applying spatial audio reproduction techniques, sound fields can be reproduced, which may be well-suited for bringing more realistic sound scenes into the laboratory. However, physical limitations affect the reproduction methods and might also affect perception. In addition to acoustic information, auditory perception can be influenced by visual information. Virtual reality glasses might be a promising tool to add visual information to virtual acoustic scenarios. However, a perceptual characterization of virtual audio-visual reproductions is lacking.
This thesis focused on three aspects of perception in virtual auditory and audio-visual environments: (i) the accuracy of the reproduction of a virtual acoustic room in terms of speech intelligibility, (ii) the relation between source size and speech intelligibility, and (iii) the role of visual information and the impact of virtual reality glasses on sound localization. It is demonstrated that an acoustic reproduction based on impulse responses measured with a microphone array provides the closest match to a reverberant reference room in terms of speech intelligibility, while a reproduction based on room acoustic simulations yields significantly different results from the reference room. The differences in speech intelligibility can be accounted for by a computational speech intelligibility model. Furthermore, it is shown that speech intelligibility is worse when the energy of the target and interfering speech is spatially spread than when both are point-like sources. The relationship between the energy spread and speech intelligibility can be described with a computational model that uses a better-ear listening strategy. Finally, it is demonstrated that virtual reality glasses disturb the acoustic field around the head, which can decrease sound localization accuracy. When virtual visual information is presented, sound source localization accuracy improves to an extent comparable to that shown in realistic environments.
Overall, this thesis shows that virtual reality glasses and loudspeaker-based virtual sound environments are powerful tools for reproducing realistic scenarios and contribute to a better understanding of auditory processing and perception in cocktail-party-like scenarios.
Original language: English
Publisher: DTU Health Technology
Number of pages: 129
Publication status: Published - 2019

Cite this

Characterizing auditory and audio-visual perception in virtual environments. / Ahrens, Axel.

DTU Health Technology, 2019. 129 p.

