Actively Learning Materials Science (ALMS 2023)
Location: CECAM-FI
Organisers
Abstracts and registrations are accepted via the workshop website: www.utu.fi/al4ms2023
The influx of machine learning algorithms from computer science into computational materials science (MS) has led to the development of innovative computational methodologies and opened up novel routes to addressing outstanding problems. Of particular interest are active learning (AL) algorithms, such as Bayesian optimisation (BO) [1], where machine learning datasets are collected on-the-fly in the search for optimal solutions. AL methods have been applied with considerable success to the optimal design of experiments [2, 3], efficient traversal of complicated search spaces for electronic structure simulations [4, 5], hyperparameter optimization [6], high-throughput screening [7, 8] and discoveries in automated laboratories [9, 10].
The strength of AL techniques is that the machine learning model itself selects the data to include in the dataset via acquisition strategies [1]. The requested data points can then be evaluated via computation or experiment and added to the model iteratively, until the search converges on the optimal solution. The resulting compact, maximally informative datasets make AL particularly suitable for applications where data is scarce or data acquisition is expensive. In this way, AL has helped accelerate materials discovery [2] without relying on big data and free of human bias. Despite recent successes, the application of AL to experimental data is progressing slowly, given that key data infrastructure is still lacking [11]. Working with multiple objectives or multidimensional data [12] also remains challenging. Novel method development across the research field is needed to advance AL techniques and associated frameworks in materials research.
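The iterative loop described above can be illustrated with a minimal sketch. It is not the method of any particular cited work: it assumes a one-dimensional search space, a Gaussian process surrogate with a fixed squared-exponential kernel, and a lower confidence bound acquisition rule. The function `objective` is a hypothetical stand-in for an expensive simulation or experiment, and all names and parameter values (`length`, `jitter`, `kappa`, the grid of candidates) are illustrative assumptions.

```python
import math

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i, bi in enumerate(b):
        y.append((bi - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_sub(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit(X, y, jitter=1e-4):
    """Factorise the kernel matrix once per iteration (zero-mean GP prior)."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = backward_sub(L, forward_sub(L, y))
    return L, alpha

def predict(X, L, alpha, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    k = [rbf(a, x_new) for a in X]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward_sub(L, k)
    var = max(1.0 - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)

def objective(x):
    """Hypothetical stand-in for an expensive computation or experiment."""
    return (x - 0.6) ** 2

def active_learning(n_iter=15, kappa=2.0):
    candidates = [i / 200 for i in range(201)]   # discretised search space
    X = [0.0, 1.0]                               # small initial dataset
    y = [objective(x) for x in X]
    for _ in range(n_iter):
        L, alpha = fit(X, y)

        def lcb(x):
            # Lower confidence bound: low predicted value or high uncertainty
            mean, std = predict(X, L, alpha, x)
            return mean - kappa * std

        pool = [c for c in candidates if c not in X]
        x_next = min(pool, key=lcb)              # the model requests this point
        X.append(x_next)
        y.append(objective(x_next))              # evaluate and add to the dataset
    best = min(range(len(X)), key=lambda i: y[i])
    return X[best], y[best], X

best_x, best_y, sampled = active_learning()
print(f"best x = {best_x:.3f}, f(x) = {best_y:.4f}, evaluations = {len(sampled)}")
```

The sketch stops after a fixed number of iterations for simplicity; a practical workflow would more likely monitor a convergence criterion and tune the kernel hyperparameters as data arrives.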
In this workshop, we will focus on two key aspects, approaching each from both a pedagogical perspective (first part of the event) and an advanced perspective (second part of the event):
1) How could data infrastructures and AL algorithm development advance experimental materials discovery?
Currently, materials science is experiencing a period of rapid parallel developments in data infrastructure, i.e., electronic laboratory notebooks, open data platforms and interfaces, materials property databases and ontologies [11,13,14]. At the same time, high-throughput laboratories and automated labs enable the efficient and standardized acquisition of data. Through our event we aim to promote the coordination of current and future data infrastructures to facilitate the wider adoption and distribution of FAIR protocols and datasets. Another key objective is to exchange views on optimal strategies for data acquisition in complex situations. These involve additional considerations such as experimental time and cost, heteroscedastic (variable) noise, and the development of efficient and informative batch (parallel) acquisition protocols.
2) How could we combine multiple channels of information in the same AL model?
Predicting and optimizing the key performance indicator of a material is made challenging by the intrinsic multi-scale nature of this problem. Furthermore, data acquisition is generally expensive and, oftentimes, multiple material properties should be optimized simultaneously (e.g., the strength and weight of an alloy or of a thermoplastic polymer, the stability and energy storage of a battery). Multi-modal machine learning schemes that combine different channels of information have emerged as promising tools for complex prediction tasks. In our event, we will review recent advances in multi-modal techniques such as multi-scale data-fusion approaches [15], human-in-the-loop strategies [16], expert knowledge biasing, multi-fidelity statistical methods [17], and multi-objective optimization algorithms [18]. The objective of the workshop is to expose the next generation of scientists to the state of the art in these techniques, and to promote cutting-edge discussion on how to advance them.
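To make the idea of optimizing several properties simultaneously concrete, the following small sketch extracts the Pareto-optimal set from a list of candidates. It shows only the basic notion of non-dominance that multi-objective methods build on, not any specific algorithm from the cited literature. The candidate values are made up for illustration, and strength is negated so that both objectives are minimised.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up illustrative candidates: (negative strength, weight)
candidates = [(-300, 2.7), (-500, 4.5), (-450, 7.8), (-200, 1.8), (-500, 5.0)]
front = pareto_front(candidates)
print(front)
```

Every point on the resulting front represents a different trade-off between the two properties; choosing among them requires additional preferences, for example from a domain expert.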
References
Finland
Matthias Stosiek (Aalto University) - Organiser
Armi Tiihonen (Aalto University) - Organiser
Milica Todorovic (University of Turku) - Organiser
Germany
Patrick Rinke (Technical University Munich) - Organiser
Netherlands
Kevin Rossi (TU Delft) - Organiser