Machine Learning Advances for Molecular and Materials Property Prediction
Location: CECAM-US-CENTRAL, University of Notre Dame, Indiana
Organisers
In recent years, machine learning (ML), and deep learning in particular, has become a powerful tool for molecular and materials property prediction. A variety of representations and model architectures have been designed to extract features and predict properties. Common representations range from graphs and sets of atomic coordinates to precomputed feature vectors and even plain SMILES/SELFIES strings. On the model architecture side, kernel regression and different flavors of neural networks dominate the field. In particular, end-to-end trainable models, such as graph convolutional neural networks, transformers and deep tensor neural networks, have become increasingly popular. The ML models developed so far have been successfully applied to the prediction of reactive, physicochemical and pharmacological properties of molecules and materials. They have also found productive applications in spectroscopy, facilitating automated characterization of IR, UV/vis and NMR spectra, among others. ML-based techniques are additionally being used to speed up physical simulations dramatically with the help of neural network potentials (NNPs). In parallel to these methodological innovations, considerable effort has been devoted to the curation of vast property datasets, which facilitate the training and benchmarking of new model architectures. Collectively, these advances open the door to accelerated chemical and materials characterization and discovery.
Achieving further breakthroughs in data-driven molecular and materials property prediction will require advances in data curation and generation, featurization and descriptor development, and predictive models and algorithms. Specific topics of interest to this workshop include, but are not limited to:
- What are the practical and fundamental obstacles to implementing and using ML-based interatomic potentials in molecular simulations with enhanced sampling techniques?
- How can we identify and interpret collective variables and order parameters obtained from ML analysis of trajectories?
- How can ML facilitate the featurization and efficient exploration of large chemical and materials spaces, as well as the determination of properties of interest?
- How can ML be used to accurately describe data-sparse regimes of chemical and materials spaces?
- How can large language models (LLMs) be leveraged or extended for chemical and materials property prediction?
As this research topic spans multiple disciplines, with expertise distributed globally, this workshop will bring together an interdisciplinary and international group of scientists to discuss and guide the most impactful advances in the use and development of ML for molecular and materials property prediction.
Thijs Stuyver (École Nationale Supérieure de Chimie de Paris) - Organiser & speaker
United States
Rose Cersonsky (University of Wisconsin) - Organiser & speaker
Yamil Colon (University of Notre Dame) - Organiser & speaker
Edward Maginn (University of Notre Dame) - Organiser