Machine learning is a subfield of artificial intelligence (AI) in which an algorithm learns from examples to establish a functional mapping from input to output and improves that mapping through training. Machine learning algorithms (MLAs) therefore rely on a ‘training set’ of systems from which the algorithm learns. Each computation in the training set is described by a ‘vector descriptor’, which encodes the computed material in a unique and meaningful way. The training set also contains a number of target properties for each material. If the training set is large enough, the MLA can learn how the vector descriptors and the target properties are correlated. To obtain reliable outputs from the MLA, the data in the training set must be reliable. On the other hand, producing data to train the MLAs can be time-consuming, so the method used to produce the training data must also be affordable. The delicate balance between reliability and affordability depends on the targeted property. In some cases, it is better to have a vast amount of data with moderate fidelity, while in others it is more convenient to use a limited amount of high-fidelity data (here, fidelity is understood as the degree to which a simulation reproduces the state and evolution of a set of given properties of a physically real entity). In this chapter, we present examples of both situations in the context of microscopic modelling of batteries. First, we show how to produce a large set of moderate-fidelity data by means of a computational workflow. That workflow, based on Density Functional Theory (DFT) simulations, predicts open circuit voltages (OCVs) and diffusivities of electrode materials that can later be used as input parameters in macroscopic models based on Finite Difference Elements (FDE). Secondly, we show how to produce high-fidelity data on the formation of solid electrolyte interphases (SEIs) by means of ab initio molecular dynamics.
We conclude by illustrating how these data sets are employed to train the MLAs.
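The idea of learning a mapping from vector descriptors to target properties can be illustrated with a minimal sketch. It is not the chapter's method: the ridge regression model, the function names `fit_ridge` and `predict`, and the synthetic three-component descriptors are all assumptions chosen only to show the shape of a training set (one descriptor vector and one target value per material).

```python
import numpy as np

def fit_ridge(X, y, alpha=1e-3):
    """Fit linear weights w that minimise ||X w - y||^2 + alpha * ||w||^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    # Regularised normal equations: (X^T X + alpha I) w = X^T y
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

def predict(X, w):
    """Predict the target property for each descriptor vector (row of X)."""
    return np.asarray(X, dtype=float) @ w

# Synthetic, purely illustrative training set (NOT real materials data):
# each row of X_train plays the role of a vector descriptor, and each
# entry of y_train plays the role of a target property.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -0.7, 0.3])           # hypothetical hidden relation
X_train = rng.normal(size=(200, 3))
y_train = X_train @ true_w + rng.normal(scale=0.05, size=200)  # noisy labels

w = fit_ridge(X_train, y_train)

# Apply the learned mapping to descriptors of unseen (synthetic) materials.
X_new = rng.normal(size=(5, 3))
y_pred = predict(X_new, w)
```

With a training set of this size and low label noise, the fitted weights land close to the hidden relation. Shrinking the set or increasing the noise degrades the fit, which is the reliability versus affordability trade-off described above.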
This chapter is arranged as follows. Section 11.2 presents the global structure of the workflow that produces a large set of moderate-fidelity data on OCVs, mechanical stability, and cation diffusivity in intercalation electrodes. This workflow relies on several novel computational techniques which help to accelerate data production and enhance its reliability. In subsection 11.2.1 we show how diffusivity is calculated within the workflow, explaining how reflective symmetry can be exploited to speed up Nudged Elastic Band (NEB) calculations (11.2.1.1) and discussing the importance of choosing the right exchange-correlation functional (11.2.1.2). Subsection 11.2.2 deals with the modelling of disorder in battery electrode materials, which is also part of the workflow. Section 11.3 shows one example of the computational production of high-fidelity data, namely the use of ab initio molecular dynamics to understand the reduction reactions that lead to the first stages in the formation of SEIs. We conclude with Section 11.4 on MLAs, explaining how they can help to predict the synthesizability and structure of battery materials and the evolution of interfaces based on high- and moderate-fidelity computational data.
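As a rough illustration of how total energies and migration barriers translate into the two quantities named above, the sketch below uses two standard textbook relations: the average intercalation voltage from a total-energy difference, and an Arrhenius-type estimate of hop diffusivity from a migration barrier such as one obtained with NEB. The function names, argument names, and the omission of any geometric or correlation prefactor are assumptions made for illustration. The chapter's actual workflow may differ.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def average_voltage(e_lithiated, e_delithiated, e_metal,
                    x_lithiated=1.0, x_delithiated=0.0, charge=1.0):
    """Average open circuit voltage (V) between two intercalation states.

    Energies are total energies in eV per formula unit of host;
    e_metal is the energy per atom of the metallic reference (e.g. Li).
    Entropic and volume terms are neglected (a common approximation).
    """
    dx = x_lithiated - x_delithiated
    if dx == 0:
        raise ValueError("the two states must differ in ion content")
    reaction_energy = e_lithiated - e_delithiated - dx * e_metal
    return -reaction_energy / (dx * charge)

def hop_diffusivity(barrier_ev, hop_length_cm, attempt_freq_hz, temperature_k):
    """Arrhenius estimate D = a^2 * nu * exp(-Ea / (kB * T)) in cm^2/s.

    Dimensionality and correlation factors are deliberately omitted,
    so this gives an order-of-magnitude estimate only.
    """
    boltzmann_factor = math.exp(-barrier_ev / (K_B_EV_PER_K * temperature_k))
    return hop_length_cm ** 2 * attempt_freq_hz * boltzmann_factor

# Hypothetical inputs, NOT taken from the chapter.
voltage = average_voltage(e_lithiated=-10.0, e_delithiated=-6.0, e_metal=-1.9)
diffusivity = hop_diffusivity(barrier_ev=0.4, hop_length_cm=3e-8,
                              attempt_freq_hz=1e13, temperature_k=300.0)
```

Because the barrier enters through an exponential, small errors in the computed barrier change the estimated diffusivity by large factors, which is one reason the choice of exchange-correlation functional matters for this kind of data.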
|Title of host publication|Atomic-Scale Modelling of Electrochemical Systems|
|Editors|Marko M. Melander, Tomi T. Laurila, Kari Laasonen|
|Number of pages|26|
|Publication status|Published - 2022|