
How to deal with missing data in supervised deep learning?

  • Université Côte d'Azur

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review


Abstract

The issue of missing data in supervised learning has been largely overlooked, especially in the deep learning community. We investigate strategies to adapt neural architectures for handling missing values. Here, we focus on regression and classification problems where the features are assumed to be missing at random. Of particular interest are schemes that allow a neural discriminative architecture to be reused as-is. To address supervised deep learning with missing values, we propose to marginalize over missing values in a joint model of covariates and outcomes. Thereby, we leverage both the flexibility of deep generative models to describe the distribution of the covariates and the power of purely discriminative models to make predictions. More precisely, a deep latent variable model can be learned jointly with the discriminative model, using importance-weighted variational inference, essentially using importance sampling to mimic averaging over multiple imputations. In low-capacity regimes, or when the discriminative model has a strong inductive bias, we find that our hybrid generative/discriminative approach generally outperforms single-imputation methods.
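The idea of using importance sampling to average a discriminative model's predictions over plausible imputations can be illustrated with a toy sketch. This is not the authors' model: it assumes the covariates follow a known bivariate Gaussian with correlation `rho` (instead of a learned deep latent variable model), a hypothetical fixed logistic classifier `predict`, and an arbitrary Gaussian proposal. All names and parameter values are illustrative assumptions.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def predict(x1, x2, w=(1.0, 2.0), b=-0.5):
    """Hypothetical fixed logistic classifier standing in for the discriminative model."""
    return 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))


def marginal_prediction(x1, rho=0.8, n_samples=5000, seed=0):
    """Estimate E[predict(x1, x2) | x1] when x2 is missing.

    Candidate imputations are drawn from a proposal q and reweighted by
    p(x2 | x1) / q(x2) (self-normalized importance sampling).
    """
    rng = random.Random(seed)
    # Assumed covariate model: standard bivariate Gaussian with correlation rho,
    # so x2 | x1 is Gaussian with mean rho * x1 and variance 1 - rho^2.
    cond_mean, cond_std = rho * x1, math.sqrt(1.0 - rho ** 2)
    # Assumed proposal: a broad Gaussian that does not depend on x1.
    prop_mean, prop_std = 0.0, 1.5

    log_weights, preds = [], []
    for _ in range(n_samples):
        x2 = rng.gauss(prop_mean, prop_std)
        log_weights.append(
            normal_logpdf(x2, cond_mean, cond_std) - normal_logpdf(x2, prop_mean, prop_std)
        )
        preds.append(predict(x1, x2))

    # Subtract the max log-weight before exponentiating for numerical stability.
    max_lw = max(log_weights)
    weights = [math.exp(lw - max_lw) for lw in log_weights]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, preds)) / total


def single_imputation_prediction(x1, rho=0.8):
    """Baseline: plug in the conditional mean of x2 and predict once."""
    return predict(x1, rho * x1)


if __name__ == "__main__":
    print("importance-weighted average:", marginal_prediction(1.0))
    print("single imputation:          ", single_imputation_prediction(1.0))
```

In this toy setting the two predictions differ because the classifier is nonlinear: averaging over imputations accounts for the uncertainty in the missing feature, whereas single imputation ignores it.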
Original language: English
Title of host publication: Proceedings of 2022 International Conference on Learning Representations
Number of pages: 30
Publication date: 2022
Publication status: Published - 2022
Event: 10th International Conference on Learning Representations - Virtual event
Duration: 25 Apr 2022 to 29 Apr 2022
Conference number: 10
https://iclr.cc/Conferences/2022

