Abstract
We present a method for acoustic source localization in reverberant environments based on semi-supervised machine learning (ML) with deep generative models. Source localization in the presence of reverberation remains a major challenge, which recent ML techniques have shown promise in addressing. Although data volumes are often large, the number of labels available for supervised learning in reverberant environments is usually small. In semi-supervised learning, ML systems are trained on many examples with only a few labels, with the goal of exploiting the natural structure of the data. We use variational autoencoders (VAEs), generative neural networks (NNs) that rely on explicit probabilistic representations, to model the latent distribution of reverberant acoustic data. A VAE consists of an encoder NN, which maps complex input distributions to simpler parametric distributions (e.g., Gaussian), and a decoder NN, which reconstructs approximations of the training examples. The VAE is trained to generate the phase of relative transfer functions (RTFs) between two microphones in reverberant environments, in parallel with a direction-of-arrival (DOA) classifier, on both labeled and unlabeled RTF samples. The performance of this VAE-based approach is compared with conventional and ML-based localization methods in simulated and real-world scenarios.
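To make the architecture described above concrete, the following is a minimal sketch, not the authors' implementation, of a conditional VAE trained jointly with a DOA classifier on RTF-phase vectors, written in PyTorch. All dimensions, layer widths, and loss terms are illustrative assumptions; the paper's actual architecture and objective may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed sizes (not from the paper): frequency bins, latent dimension, DOA classes.
N_FREQ, N_LATENT, N_DOA = 256, 16, 37

class SemiSupervisedVAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: maps an RTF-phase vector to the parameters of a Gaussian latent.
        self.enc = nn.Sequential(nn.Linear(N_FREQ, 128), nn.ReLU())
        self.mu = nn.Linear(128, N_LATENT)
        self.logvar = nn.Linear(128, N_LATENT)
        # Decoder: generates (reconstructs) the RTF phase from latent code and DOA label.
        self.dec = nn.Sequential(nn.Linear(N_LATENT + N_DOA, 128), nn.ReLU(),
                                 nn.Linear(128, N_FREQ))
        # Classifier: predicts DOA class logits directly from the RTF phase.
        self.clf = nn.Sequential(nn.Linear(N_FREQ, 128), nn.ReLU(),
                                 nn.Linear(128, N_DOA))

    def forward(self, x, y_onehot):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        x_hat = self.dec(torch.cat([z, y_onehot], dim=-1))
        return x_hat, mu, logvar, self.clf(x)

def labeled_loss(model, x, y):
    """ELBO (reconstruction + KL) plus cross-entropy on a known DOA label."""
    y_onehot = F.one_hot(y, N_DOA).float()
    x_hat, mu, logvar, logits = model(x, y_onehot)
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1.0 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl + F.cross_entropy(logits, y, reduction="sum")
```

For unlabeled RTF samples, semi-supervised VAE training in the style of Kingma et al.'s M2 model marginalizes the ELBO over the unknown label using the classifier's predicted distribution; that term is omitted here for brevity.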
Original language | English |
---|---|
Journal | Journal of the Acoustical Society of America |
Volume | 148 |
Issue number | 4 |
Pages (from-to) | 2662 |
Number of pages | 1 |
ISSN | 0001-4966 |
Publication status | Published - 2020 |
Event | 179th Meeting of the Acoustical Society of America (conference number 179), Online, 7 Dec 2020 → 11 Dec 2020, https://acousticalsociety.org/179th-meeting/ |
Conference
Conference | 179th Meeting of the Acoustical Society of America |
---|---|
Number | 179 |
Location | Online |
Period | 07/12/2020 → 11/12/2020 |
Other | Acoustics Virtually Everywhere |
Internet address | https://acousticalsociety.org/179th-meeting/ |