Abstract
This paper presents a novel machine-hearing system that exploits
deep neural networks (DNNs) and head movements for binaural
localisation of multiple speakers in reverberant conditions. DNNs
are used to map binaural features, consisting of the complete cross-correlation
function (CCF) and interaural level differences (ILDs),
to the source azimuth. Our approach was evaluated using a localisation
task in which sources were located in a full 360-degree azimuth
range. As a result, front-back confusions often occurred due
to the similarity of binaural features in the front and rear hemifields.
To address this, a head movement strategy was incorporated in the
DNN-based model to help reduce the front-back errors. Our experiments
show that, compared to a system based on a Gaussian mixture
model (GMM) classifier, the proposed DNN system substantially
reduces localisation errors under challenging acoustic scenarios in
which multiple speakers and room reverberation are present.
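
To illustrate the two feature types named in the abstract, the Python sketch below computes a normalised cross-correlation function (CCF) and an interaural level difference (ILD) for a single frame of a two-channel signal. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `binaural_features`, the `max_lag` parameter, and the broadband single-frame treatment are hypothetical choices made here for illustration.

```python
import numpy as np


def binaural_features(left, right, max_lag=16):
    """Return (ccf, ild_db) for one frame of a binaural signal.

    Hypothetical helper for illustration only; the names and the
    broadband treatment are assumptions, not taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    # Remove the mean so the correlation reflects waveform similarity.
    left = left - left.mean()
    right = right - right.mean()

    # Full cross-correlation; zero lag sits at index len(right) - 1.
    full = np.correlate(left, right, mode="full")
    centre = len(right) - 1
    ccf = full[centre - max_lag: centre + max_lag + 1]

    # Normalise by the geometric mean of the channel energies.
    energy_left = np.sum(left ** 2)
    energy_right = np.sum(right ** 2)
    norm = np.sqrt(energy_left * energy_right)
    if norm > 0:
        ccf = ccf / norm

    # ILD in decibels; eps guards against division by zero on silence.
    eps = 1e-12
    ild_db = 10.0 * np.log10((energy_left + eps) / (energy_right + eps))
    return ccf, ild_db


# Synthetic frame: the right channel is the left channel delayed by
# 3 samples and attenuated by a factor of 0.5.
rng = np.random.default_rng(0)
left_frame = rng.standard_normal(512)
right_frame = 0.5 * np.roll(left_frame, 3)

ccf, ild_db = binaural_features(left_frame, right_frame, max_lag=16)
# With np.correlate's convention, a negative lag means the right
# channel lags the left channel.
peak_lag = int(np.argmax(ccf)) - 16
```

In this sketch the whole CCF vector, rather than only its peak lag, would be concatenated with the ILD to form the input of a classifier, which mirrors the abstract's description of using the complete cross-correlation function. The classifier itself and any head-movement strategy are outside the scope of the example.
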
Original language | English |
---|---|
Title of host publication | Proceedings of Interspeech 2015 |
Number of pages | 5 |
Publisher | ISCA |
Publication date | 2015 |
Pages | 3302-3306 |
Publication status | Published - 2015 |
Event | INTERSPEECH 2015: Speech beyond Speech, Dresden, Germany. Duration: 6 Sept 2015 → 10 Sept 2015 |
Conference
Conference | INTERSPEECH 2015 |
---|---|
Country/Territory | Germany |
City | Dresden |
Period | 06/09/2015 → 10/09/2015 |
Keywords
- Binaural source localisation
- Deep neural networks
- Head movements
- Machine hearing
- Reverberation