Structuring Neural Networks for More Explainable Predictions

Laura Rieger*, Pattarawat Chormai, Grégoire Montavon, Lars Kai Hansen, Klaus-Robert Müller

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review


Machine learning algorithms such as neural networks are more useful when their predictions can be explained, e.g., in terms of input variables. Often, simpler models are more interpretable than more complex models with higher performance. In practice, one can choose a readily interpretable (possibly less predictive) model. Another solution is to directly explain the original, highly predictive model. In this chapter, we present a middle-ground approach where the original neural network architecture is modified parsimoniously in order to reduce common biases observed in the explanations. Our approach leads to explanations that better separate classes in feed-forward networks, and that also better identify relevant time steps in recurrent neural networks.
Original language: English
Title of host publication: Explainable and Interpretable Models in Computer Vision and Machine Learning
Publication date: 2019
ISBN (Print): 978-3-319-98130-7
Publication status: Published - 2019


  • Interpretable machine learning
  • Convolutional neural networks
  • Recurrent neural networks
