Abstract
This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation. Using a state-of-the-art binaural localization algorithm based on a deep neural network (DNN), we propose adding a source separation stage based on non-negative matrix factorization (NMF) to improve the localization performance in conditions with interfering sources. The separation stage is coupled with the localization stage and is optimized with respect to a broad range of different acoustic conditions, emphasizing a robust and generalizable solution. The machine listening system is shown to greatly benefit from the NMF-based separation stage at low target-to-masker ratios (TMRs) for a variety of noise types, especially for non-stationary noise. It is also demonstrated that training the NMF algorithm on anechoic speech provides better performance than using reverberant speech, and that optimizing the source separation stage using a localization metric rather than a source separation metric substantially increases the system performance.
Original language | English |
---|---|
Title of host publication | Proceedings of 2021 IEEE International Conference on Acoustics, Speech and Signal Processing |
Publisher | IEEE |
Publication date | 2021 |
Pages | 221-225 |
ISBN (Electronic) | 978-1-7281-7605-5 |
Publication status | Published - 2021 |
Event | 2021 IEEE International Conference on Acoustics, Speech and Signal Processing - Virtual event, Toronto, Canada; Duration: 6 Jun 2021 → 11 Jun 2021; Conference number: 46; https://www.2021.ieeeicassp.org/2021.ieeeicassp.org/index.html |
Conference
Conference | 2021 IEEE International Conference on Acoustics, Speech and Signal Processing |
---|---|
Number | 46 |
Location | Virtual event |
Country/Territory | Canada |
City | Toronto |
Period | 06/06/2021 → 11/06/2021 |