Deep learning and generative methods in cheminformatics and chemical biology: navigating small molecule space intelligently

Douglas B Kell*, Soumitra Samanta, Neil Swainston

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review


The number of 'small' molecules that may be of interest to chemical biologists - chemical space - is enormous, but the fraction that have ever been made is tiny. Most strategies are discriminative, i.e. have involved 'forward' problems (have molecule, establish properties). However, we normally wish to solve the much harder generative or inverse problem (describe desired properties, find molecule). 'Deep' (machine) learning based on large-scale neural networks underpins technologies such as computer vision, natural language processing, driverless cars, and world-leading performance in games such as Go; it can also be applied to the solution of inverse problems in chemical biology. In particular, recent developments in deep learning admit the in silico generation of candidate molecular structures and the prediction of their properties, thereby allowing one to navigate (bio)chemical space intelligently. These methods are revolutionary but require an understanding of both (bio)chemistry and computer science to be exploited to best advantage. We give a high-level (non-mathematical) background to the deep learning revolution, and set out the crucial issue for chemical biology and informatics as a two-way mapping from the discrete nature of individual molecules to the continuous but high-dimensional latent representation that may best reflect chemical space. A variety of architectures can do this; we focus on a particular type known as variational autoencoders. We then provide some examples of recent successes of these kinds of approach, and a look towards the future.
Original language: English
Journal: Biochemical Journal
Issue number: 23
Pages (from-to): 4559-4580
Publication status: Published - 2020


  • Artificial intelligence
  • Cheminformatics
  • Deep learning

