A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning

Marco Fraccaro, Simon Due Kamronn, Ulrich Paquet, Ole Winther

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

Abstract

This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object’s representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks.
Original language: English
Title of host publication: Proceedings of 31st Conference on Neural Information Processing Systems
Number of pages: 13
Publication date: 2017
Publication status: Published - 2017
Event: 31st Conference on Neural Information Processing Systems - Long Beach, United States
Duration: 4 Dec 2017 to 9 Dec 2017

Conference

Conference: 31st Conference on Neural Information Processing Systems
Country/Territory: United States
City: Long Beach
Period: 04/12/2017 to 09/12/2017
