Cluster expansion (CE) has become increasingly popular in recent years, and its applications now extend far beyond its original roots in binary alloys to complex crystalline systems often used in energy materials research. As with other modern machine learning approaches in materials science, many strategies have been proposed for training and fitting CE models to first-principles calculation results. Here, we propose a new strategy for constructing a training set by selecting configurations based on their relevance in Monte Carlo sampling for statistical analysis and on the reduction of the expected error. The CE model constructed with the proposed approach depends less on the specific details of the training set, which increases the reproducibility of the model. The same method can be applied to other machine learning approaches where it is desirable to sample the relevant configurational space with a small set of training data, which is often the case when the data consist of first-principles calculations.
Bibliographical note
Funding Information:
The authors acknowledge support from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 957189.
© 2021 The Author(s). Published by IOP Publishing Ltd.
Keywords
- Cluster expansion
- Energy materials
- Machine learning
- Monte Carlo
- Phase transition