Band gap prediction for large organic crystal structures with machine learning

Bart Olsthoorn, R. Matthias Geilhufe, Stanislav S. Borysov, Alexander V. Balatsky*

*Corresponding author for this work

Research output: Contribution to journal, Journal article, Research, peer-review


Machine-learning models are capable of capturing the structure-property relationship from a dataset of computationally demanding ab initio calculations. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of calculated electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 82 atoms per unit cell, makes this database a challenging platform for machine learning applications. This paper focuses on predicting the band gap, one of the basic properties of a crystalline material. With this aim, a consistent dataset of 12 500 crystal structures and their corresponding DFT band gaps is released and made freely available for download. An ensemble of two state-of-the-art models reaches a mean absolute error (MAE) of 0.388 eV, which corresponds to a percentage error of 13% for an average band gap of 3.05 eV. Finally, the trained models are employed to predict the band gap for 260 092 materials contained within the Crystallography Open Database (COD) and are made available online so that predictions can be obtained for any arbitrary crystal structure uploaded by a user.
Original language: English
Article number: 1900023
Journal: Advanced Quantum Technologies
Issue number: 7-8
Number of pages: 10
Publication status: Published - 2019
