Semi-supervised source localization in reverberant environments using deep generative modeling

Michael J. Bianco, Sharon Gannot, Efren Fernandez Grande, Peter Gerstoft

    Research output: Contribution to journal › Conference abstract in journal › Research › peer-review


    We present a method for acoustic source localization in reverberant environments based on semi-supervised machine learning (ML) with deep generative models. Source localization in the presence of reverberation remains a major challenge, which recent ML techniques have shown promise in addressing. Although data volumes are often large, the number of labels available for supervised learning in reverberant environments is usually small. In semi-supervised learning, ML systems are trained on many examples of which only a few are labeled, with the goal of exploiting the natural structure of the data. We use variational autoencoders (VAEs), which are generative neural networks (NNs) that rely on explicit probabilistic representations, to model the latent distribution of reverberant acoustic data. A VAE consists of an encoder NN, which maps complex input distributions to simpler parametric distributions (e.g., Gaussian), and a decoder NN, which approximates the training examples. The VAE is trained to generate the phase of relative transfer functions (RTFs) between two microphones in reverberant environments, in parallel with a direction-of-arrival (DOA) classifier, on both labeled and unlabeled RTF samples. The performance of this VAE-based approach is compared with conventional and ML-based localization in simulated and real-world scenarios.
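
    The RTF phase that the VAE is trained to generate can be illustrated with a minimal sketch. It is not the authors' pipeline: the estimator (frame-averaged cross-spectrum divided by the reference auto-spectrum), the function name rtf_phase, the frame parameters, and the toy anechoic signal (a pure integer-sample delay between microphones, no reverberation) are all assumptions chosen for illustration.

```python
import numpy as np


def rtf_phase(x_ref, x_other, n_fft=256, hop=128):
    """Estimate the wrapped RTF phase between two microphone signals.

    Hypothetical helper: the RTF is estimated per frequency bin as the
    cross-spectrum of the two channels divided by the auto-spectrum of the
    reference channel, both summed over Hann-windowed frames.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x_ref) - n_fft) // hop
    cross = np.zeros(n_fft // 2 + 1, dtype=complex)
    auto = np.zeros(n_fft // 2 + 1)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + n_fft)
        spec_ref = np.fft.rfft(window * x_ref[seg])
        spec_other = np.fft.rfft(window * x_other[seg])
        cross += spec_other * np.conj(spec_ref)
        auto += np.abs(spec_ref) ** 2
    rtf = cross / (auto + 1e-12)  # small constant avoids division by zero
    return np.angle(rtf)          # phase wrapped to (-pi, pi]


# Toy example (assumed, anechoic): mic2 receives the same white-noise
# source as mic1, delayed by an integer number of samples.
rng = np.random.default_rng(0)
delay = 3
source = rng.standard_normal(8192)
mic1 = source[delay:]
mic2 = source[:-delay]  # mic2[n] == mic1[n - delay]

phase = rtf_phase(mic1, mic2)

# For a pure delay the phase is linear in frequency: -2*pi*k*delay/n_fft.
bins = np.arange(phase.size)
expected = -2 * np.pi * bins * delay / 256
# Compare on the unit circle so that phase wrapping does not matter.
max_error = np.max(np.abs(np.angle(np.exp(1j * (phase - expected)))))
print("largest deviation from the linear-phase model (rad):", max_error)
```

    In a reverberant room the phase departs from this linear pattern in a position-dependent way. A vector of per-bin phases like the one returned here is the kind of input a generative model and a DOA classifier could be trained on, with or without a label attached.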
    Original language: English
    Journal: Journal of the Acoustical Society of America
    Issue number: 4
    Pages (from-to): 2662
    Number of pages: 1
    Publication status: Published - 2020
    Event: 179th Meeting of the Acoustical Society of America - Online
    Duration: 7 Dec 2020 → 11 Dec 2020
    Conference number: 179


    Other: Acoustics Virtually Everywhere

