Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in the Presence of Directional Interference

Research output: Chapter in Book/Report/Conference proceeding > Article in proceedings > Research > peer-review


Abstract

This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation. Using a state-of-the-art binaural localization algorithm based on a deep neural network (DNN), we propose adding a source separation stage based on non-negative matrix factorization (NMF) to improve the localization performance in conditions with interfering sources. The separation stage is coupled with the localization stage and is optimized with respect to a broad range of different acoustic conditions, emphasizing a robust and generalizable solution. The machine listening system is shown to greatly benefit from the NMF-based separation stage at low target-to-masker ratios (TMRs) for a variety of noise types, especially for non-stationary noise. It is also demonstrated that training the NMF algorithm on anechoic speech provides better performance than using reverberant speech, and that optimizing the source separation stage using a localization metric rather than a source separation metric substantially increases the system performance.
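The separation stage described in the abstract can be illustrated with a minimal sketch of supervised NMF separation. This is not the authors' implementation: the abstract does not state the cost function, the dictionary sizes, how the two binaural channels are handled, or how the separated signal is passed to the DNN. The sketch assumes magnitude spectrograms as input, a Euclidean cost with standard multiplicative updates, and a soft mask built from pre-trained speech and noise dictionaries. The names `nmf`, `separate`, `W_speech`, and `W_noise` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate V (freq x frames, non-negative) as W @ H.

    Uses multiplicative updates for the Euclidean cost.
    If W is supplied, it is held fixed and only H is estimated
    (rank is then taken from W and the rank argument is ignored).
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, W_speech, W_noise, n_iter=200):
    """Estimate the speech magnitude spectrogram from a mixture.

    Assumption: both dictionaries were trained beforehand, e.g. the
    speech dictionary on anechoic speech as the abstract recommends.
    """
    W = np.hstack([W_speech, W_noise])      # joint, fixed dictionary
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]                    # speech part of the model
    N = W_noise @ H[k:]                     # interference part
    mask = S / (S + N + EPS)                # soft mask in [0, 1]
    return mask * V_mix
```

In a setup like this, `nmf` would first be run on speech-only and noise-only spectrograms to obtain the two dictionaries, and `separate` would then be applied to the mixture before localization features are computed.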
Original language: English
Title of host publication: Proceedings of 2021 IEEE International Conference on Acoustics, Speech and Signal Processing
Publisher: IEEE
Publication date: 2021
Pages: 221-225
ISBN (Electronic): 978-1-7281-7605-5
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing - Metro Toronto Convention Centre, Toronto, Canada
Duration: 6 Jun 2021 to 11 Jun 2021

Conference

Conference: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing
Location: Metro Toronto Convention Centre
Country/Territory: Canada
City: Toronto
Period: 06/06/2021 to 11/06/2021
