Exploiting deep neural networks and head movements for binaural localisation of multiple speakers in reverberant conditions

Ning Ma, Guy J. Brown, Tobias May

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review


Abstract

This paper presents a novel machine-hearing system that exploits deep neural networks (DNNs) and head movements for binaural localisation of multiple speakers in reverberant conditions. DNNs are used to map binaural features, consisting of the complete cross-correlation function (CCF) and interaural level differences (ILDs), to the source azimuth. Our approach was evaluated using a localisation task in which sources were located in a full 360-degree azimuth range. In this setting, front-back confusions often occurred due to the similarity of binaural features in the front and rear hemifields. To address this, a head movement strategy was incorporated into the DNN-based model to help reduce front-back errors. Our experiments show that, compared to a system based on a Gaussian mixture model (GMM) classifier, the proposed DNN system substantially reduces localisation errors under challenging acoustic scenarios in which multiple speakers and room reverberation are present.
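To make the feature front end described in the abstract concrete, below is a minimal Python sketch of how a per-frame binaural feature vector (a CCF restricted to a plausible lag range plus an ILD) might be assembled before being passed to a DNN azimuth classifier. The function name, lag range, normalisation, and the example 72-bin azimuth grid are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def binaural_features(left, right, fs=16000, max_itd_ms=1.0):
    """Hypothetical CCF + ILD feature vector for one time frame.

    The paper maps the complete cross-correlation function and the
    interaural level difference to azimuth; the exact windowing and
    normalisation used here are assumptions.
    """
    max_lag = int(fs * max_itd_ms / 1000)          # e.g. 16 lags at 16 kHz
    full = np.correlate(left, right, mode="full")  # cross-correlation over all lags
    mid = len(full) // 2                           # zero-lag index
    ccf = full[mid - max_lag: mid + max_lag + 1]   # keep lags within +/- 1 ms
    ccf = ccf / (np.sqrt(np.sum(left**2) * np.sum(right**2)) + 1e-12)
    ild = 10 * np.log10((np.sum(left**2) + 1e-12) / (np.sum(right**2) + 1e-12))
    return np.concatenate([ccf, [ild]])

# Toy frame: a sinusoid arriving earlier and louder at the left ear.
fs = 16000
t = np.arange(0, 0.02, 1 / fs)
left = np.sin(2 * np.pi * 500 * t)
right = 0.7 * np.sin(2 * np.pi * 500 * (t - 0.0003))
x = binaural_features(left, right, fs)

# A DNN would map x to a posterior over azimuth classes, e.g. 72 bins
# covering the full 360-degree range in 5-degree steps (an assumed grid);
# here only the feature dimensionality is illustrated.
print(x.shape)  # (2 * max_lag + 1) CCF values + 1 ILD value
```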
Original language: English
Title of host publication: Proceedings of Interspeech 2015
Number of pages: 5
Publisher: ISCA
Publication date: 2015
Pages: 3302-3306
Publication status: Published - 2015
Event: INTERSPEECH 2015: Speech beyond Speech - Dresden, Germany
Duration: 6 Sep 2015 - 10 Sep 2015

Conference

Conference: INTERSPEECH 2015
Country: Germany
City: Dresden
Period: 06/09/2015 - 10/09/2015

Keywords

  • Binaural source localisation
  • Deep neural networks
  • Head movements
  • Machine hearing
  • Reverberation
