Abstract
In this paper, we show that an image passed through a neural network can be reconstructed using only the locations of the max pool activations. We demonstrate this with an architecture consisting of an encoder and a decoder. The decoder is a mirrored version of the encoder in which convolutions are replaced with deconvolutions and pooling layers are replaced with unpooling layers. The locations of the max pool switches are transmitted to the corresponding unpooling layers. The reconstruction is computed only from these switches, without the use of feature values, and is nonetheless surprisingly good. We examine this effect in various architectures and study how the number of features affects reconstruction quality. We also compare reconstructions from an encoder with randomly initialized weights against those from an encoder pretrained for classification. Finally, we give recommendations for future architecture decisions.
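To make the notion of a max pool switch concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes non-overlapping 2 x 2 pooling on a single 2D feature map, and the function names `max_pool_with_switches` and `unpool_from_switches` are hypothetical. The unpooling step places a constant 1 at each switch location, which illustrates passing only location information to the decoder; the learned deconvolution layers described in the abstract are omitted.

```python
import numpy as np

def max_pool_with_switches(x, k=2):
    """Non-overlapping k x k max pooling on a 2D array.

    Returns the pooled values and, for each window, the flat index
    (0 .. k*k-1, row-major inside the window) of the maximum: the "switch".
    """
    h, w = x.shape
    assert h % k == 0 and w % k == 0, "input must tile evenly into k x k windows"
    # Rearrange into shape (h//k, w//k, k*k): one row of k*k values per window.
    windows = (
        x.reshape(h // k, k, w // k, k)
        .transpose(0, 2, 1, 3)
        .reshape(h // k, w // k, k * k)
    )
    switches = windows.argmax(axis=-1)
    pooled = windows.max(axis=-1)
    return pooled, switches

def unpool_from_switches(switches, k=2, values=None):
    """Unpool using switch locations only.

    If `values` is None, a constant 1 is written at each switch location,
    so no feature values from the encoder are used (an assumption of this
    sketch, chosen to mirror the switch-only setting).
    """
    ph, pw = switches.shape
    if values is None:
        values = np.ones((ph, pw))
    windows = np.zeros((ph, pw, k * k))
    # Write one value per window at the position recorded by the switch.
    np.put_along_axis(windows, switches[..., None], values[..., None], axis=-1)
    # Undo the window rearrangement to get back a (ph*k, pw*k) map.
    return windows.reshape(ph, pw, k, k).transpose(0, 2, 1, 3).reshape(ph * k, pw * k)

if __name__ == "__main__":
    x = np.array([[1, 2, 0, 0],
                  [3, 4, 0, 5],
                  [9, 0, 1, 1],
                  [0, 0, 1, 7]], dtype=float)
    pooled, switches = max_pool_with_switches(x)
    print(unpool_from_switches(switches))
```

In a full encoder and decoder, one such switch map per pooling layer and per feature channel would be handed to the matching unpooling layer, and the surrounding deconvolutions would be trained to turn these sparse location maps into an image.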
Original language | English |
---|---|
Journal | IEEE Signal Processing Letters |
Volume | 24 |
Issue number | 3 |
Pages (from-to) | 254 - 258 |
ISSN | 1070-9908 |
Publication status | Published - 2016 |
Bibliographical note
(c) 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Keywords
- Image reconstruction
- Convolutional neural networks
- Pooling
- Autoencoder
- Encoding
- Unpooling
- Deconvolution