Multi-modal data generation with a deep metric variational autoencoder

Josefine Vilsbøll Sundgaard, Morten Rieger Hannemose, Søren Laugesen, James Harte, Yosuke Kamide, Chiemi Tanaka, Rasmus Reinhold Paulsen, Anders Nymark Christensen

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

We present a deep metric variational autoencoder for multi-modal data generation. The variational autoencoder employs a triplet loss in the latent space, which enables conditional data generation by sampling new embeddings within each class cluster. The approach is evaluated on a multi-modal dataset consisting of otoscopy images of the tympanic membrane with corresponding wideband tympanometry measurements. The modalities in this dataset are correlated, as they represent different aspects of the state of the middle ear, but they do not exhibit a direct pixel-to-pixel correspondence. The approach shows promising results for the conditional generation of image-tympanogram pairs, and will allow for efficient augmentation of multi-modal data.
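
The core idea of the abstract, a VAE whose latent space is additionally shaped by a triplet loss so that new data can be generated by sampling inside a class cluster, can be sketched in a few lines of PyTorch. The snippet below is a minimal single-modality illustration under stated assumptions, not the authors' implementation: the class name DeepMetricVAE, the loss weights beta and gamma, the toy dimensions, and the helper generate_for_class are all hypothetical, and the paper's paired image/tympanogram decoders are reduced to a single decoder for brevity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepMetricVAE(nn.Module):
    """VAE with an extra metric-learning (triplet) term on the latent means."""
    def __init__(self, input_dim=784, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterisation
        return self.decoder(z), mu, logvar

def loss_fn(model, anchor, positive, negative, beta=1.0, gamma=1.0):
    # Standard ELBO terms (reconstruction + KL divergence) on the anchor batch.
    recon, mu_a, logvar_a = model(anchor)
    rec = F.binary_cross_entropy(recon, anchor, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar_a - mu_a.pow(2) - logvar_a.exp())
    # Triplet margin loss on the latent means: pull same-class embeddings
    # together, push different-class embeddings apart.
    mu_p, _ = model.encode(positive)
    mu_n, _ = model.encode(negative)
    trip = F.triplet_margin_loss(mu_a, mu_p, mu_n, margin=1.0)
    return rec + beta * kld + gamma * trip

# One plausible reading of "sampling new embeddings within each class cluster":
# fit a Gaussian to the latent means of one class and decode draws from it.
def generate_for_class(model, x_class, n_samples=8):
    with torch.no_grad():
        mu_c, _ = model.encode(x_class)
        centre, spread = mu_c.mean(0), mu_c.std(0)
        z = centre + spread * torch.randn(n_samples, mu_c.shape[1])
        return model.decoder(z)

if __name__ == "__main__":
    model = DeepMetricVAE()
    a, p, n = (torch.rand(8, 784) for _ in range(3))  # toy stand-in batches
    print(loss_fn(model, a, p, n).item())
    print(generate_for_class(model, a).shape)  # torch.Size([8, 784])

Because the triplet term acts on the latent means rather than the reconstructions, the clusters it forms survive into the sampling stage, which is what makes the class-conditional generation step above possible.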
Original language: English
Title of host publication: Proceedings of the Northern Lights Deep Learning Workshop 2023
Number of pages: 9
Volume: 4
Publisher: Septentrio Academic Publishing
Publication date: 2023
Publication status: Published - 2023
Event: Northern Lights Deep Learning Workshop 2023, Tromsø, Norway
Duration: 10 Jan 2023 - 12 Jan 2023
