Abstract
We propose a probabilistic generative model for unsupervised learning of
structured, interpretable, object-based representations of visual scenes. We
use amortized variational inference to train the generative model end-to-end.
The learned representations of object location and appearance are fully
disentangled, and objects are represented independently of each other in the
latent space. Unlike previous approaches that disentangle location and
appearance, ours generalizes seamlessly to scenes with many more objects than
encountered in the training regime. We evaluate the proposed model on
multi-MNIST and multi-dSprites data sets.
Original language | English |
---|---|
Title of host publication | Proceedings of Workshop on Perception as Generative Reasoning |
Number of pages | 11 |
Publication date | 2019 |
Publication status | Published - 2019 |
Event | 33rd Conference on Neural Information Processing Systems, Vancouver Convention Centre, Vancouver, Canada. Duration: 8 Dec 2019 → 14 Dec 2019. Conference number: 33. https://nips.cc/Conferences/2019/ |
Conference
Conference | 33rd Conference on Neural Information Processing Systems |
---|---|
Number | 33 |
Location | Vancouver Convention Centre |
Country/Territory | Canada |
City | Vancouver |
Period | 08/12/2019 → 14/12/2019 |