LAVAE: Disentangling Location and Appearance

Andrea Dittadi, Ole Winther

Research output: Chapter in Book/Report/Conference proceeding > Article in proceedings > Research > peer-review


Abstract

We propose a probabilistic generative model for unsupervised learning of structured, interpretable, object-based representations of visual scenes. We use amortized variational inference to train the generative model end-to-end. The learned representations of object location and appearance are fully disentangled, and objects are represented independently of each other in the latent space. Unlike previous approaches that disentangle location and appearance, ours generalizes seamlessly to scenes with many more objects than encountered in the training regime. We evaluate the proposed model on multi-MNIST and multi-dSprites data sets.
Original language: English
Title of host publication: Proceedings of Workshop on Perception as Generative Reasoning
Number of pages: 11
Publication date: 2019
Publication status: Published - 2019
Event: 33rd Conference on Neural Information Processing Systems - Vancouver Convention Centre, Vancouver, Canada
Duration: 8 Dec 2019 to 14 Dec 2019
Conference number: 33
https://nips.cc/Conferences/2019/

Conference

Conference: 33rd Conference on Neural Information Processing Systems
Number: 33
Location: Vancouver Convention Centre
Country/Territory: Canada
City: Vancouver
Period: 08/12/2019 to 14/12/2019