Screening of molecules and materials is important in a number of scientific fields, e.g. the discovery of new drugs and of better materials for solar cells and batteries. Within the last few years machine learning, and deep learning methods in particular, have shown remarkable results in a wide range of scientific fields. In this thesis we develop new deep learning methods for screening of molecules and materials. How to appropriately represent molecules and materials for machine learning methods is an ongoing field of research. Molecular graphs for molecules and quotient graphs for crystal structures are general representations, but they are difficult for standard machine learning algorithms to handle because they are not vectorial. In the first part of the thesis we develop new methods based on graph neural networks, a class of deep learning models that can be formulated as message passing on graphs. We propose to extend existing models with an edge update network, which improves upon previous state-of-the-art results on three publicly available datasets. In many screening applications we do not know the exact positions of the atoms of a new candidate molecule or material. We therefore propose to use a graph neural network model without access to the exact interatomic distances and show that it is still possible to accurately predict the formation energy of materials using only the connectivity of the atoms. In some cases we know both the connectivity and the symmetries of a material; for these cases we propose a new method for encoding the local symmetry of a material into its graph representation and show the efficacy of the method on publicly available materials datasets. An important part of the screening process is to suggest new candidate materials and molecules. In the second part of the thesis we review the current state of the art in deep learning methods for generating molecules. 
We focus on a specific latent variable model called the variational autoencoder, which is a model that learns a vector space representation of a given dataset. We use the model to accelerate the screening process of molecules for polymer solar cells by designing a grammar representation of the molecules and optimising for the optical properties of interest in the learned latent space. Finally, we reflect on the results of this research and discuss directions for future research on deep learning for molecules and materials.
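
To make the phrase "message passing on graphs" with an edge update concrete, the following is a minimal sketch in plain Python. It is not the architecture developed in the thesis: the function names `linear_tanh` and `message_passing_step`, the single-layer tanh updates, the sum aggregation, the residual node update, and the constant weights are all assumptions chosen for illustration.

```python
import math


def linear_tanh(x, W):
    """Return tanh(x @ W) for a vector x (list) and a matrix W (list of rows)."""
    return [
        math.tanh(sum(xi * W[i][j] for i, xi in enumerate(x)))
        for j in range(len(W[0]))
    ]


def message_passing_step(h, e, edges, W_edge, W_msg):
    """One round of message passing with an edge update (illustrative only).

    h      : list of node state vectors, each of length d
    e      : list of edge state vectors, each of length d
    edges  : list of (sender, receiver) node index pairs, one per edge state
    W_edge : hypothetical 3d x d weight matrix for the edge update
    W_msg  : hypothetical 2d x d weight matrix for the message function
    """
    # Edge update: each edge state is recomputed from the sender state,
    # the receiver state and the current edge state (list concatenation).
    e_new = [
        linear_tanh(h[s] + h[r] + e[k], W_edge)
        for k, (s, r) in enumerate(edges)
    ]

    # Message passing: each edge sends a message built from the sender
    # state and the updated edge state; receivers sum incoming messages.
    agg = [[0.0] * len(h[0]) for _ in h]
    for k, (s, r) in enumerate(edges):
        message = linear_tanh(h[s] + e_new[k], W_msg)
        for j, mj in enumerate(message):
            agg[r][j] += mj

    # Node update: a simple residual update (an assumption of this sketch).
    h_new = [[hi + ai for hi, ai in zip(hv, av)] for hv, av in zip(h, agg)]
    return h_new, e_new


# Toy graph: nodes 0 and 1 are bonded in both directions, node 2 is isolated.
d = 2
h = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
edges = [(0, 1), (1, 0)]
e = [[0.1, 0.2], [0.1, 0.2]]
W_edge = [[0.1] * d for _ in range(3 * d)]
W_msg = [[0.1] * d for _ in range(2 * d)]

h_new, e_new = message_passing_step(h, e, edges, W_edge, W_msg)
```

In this sketch the inputs are only node states, edge states and the list of connected atom pairs; no interatomic distances are involved unless they are encoded in the initial edge states. The isolated node receives no messages, so its state is unchanged after the step.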
| Publisher | Technical University of Denmark |
| Number of pages | 114 |
| Publication status | Published - 2019 |