AIChemist
Location: CECAM-HQ-EPFL, Lausanne, Switzerland
Organisers
The primary goal of the school is to provide young and established scientists, in both industry and academia, working at the intersection of explainable AI and drug discovery with a comprehensive overview of the most cutting-edge methods for predicting chemical reactivity, as well as a deep understanding of the hurdles that remain for the field to advance further.
In the last few years, Machine Learning (ML) has made it possible to build on large databases of chemical reactions to predict reaction outcomes from given reactants, in a way similar to sentence translation. It is then possible to explore chemical space by following reaction routes and to predict synthesis or retrosynthesis pathways. Another active field of ML for chemistry is the optimization of single-step reaction conditions. Here the data can differ greatly depending on its source. In High Throughput Experiments (HTE), hundreds of thousands of reactions are performed automatically within a small chemical space and a narrow range of reaction conditions. Data may instead come from standard laboratory work, such as published studies, collected in databases such as the Open Reaction Database. In that case, the chemical space can be very large and the reaction conditions very broad and sometimes less well documented, for example when laboratory notebooks are the source of information. Choosing methods that fit both the objective pursued and the data available is therefore of primary importance, and this school will provide students with the tools to do so.
In parallel, ML techniques have also advanced the understanding of chemical reactions. ML has been used to develop force fields or potential energy surfaces, known as Machine Learned Potentials (MLPs), which have been leveraged to perform Molecular Dynamics simulations. This has led to accurate descriptions of reaction mechanisms at the atomic level.
At the same time, new methods in Artificial Intelligence (AI) have been developed to provide explanations alongside predictions. This so-called explainable AI can not only extract new information from the data but also reinforce the reliability of ML, in particular when extrapolating from known data.
These new developments are of interest to a broad range of chemistry practitioners, and we propose to hold a school on this rapidly evolving field. The school will be co-organised with the MSCA Doctoral Network “Explainable AI for Molecules - AiChemist” and as such will be geared towards the 14 doctoral candidates recruited into the network. As reaction prediction and explainable AI, both in the context of drug discovery and in general chemistry, are burgeoning topics, we anticipate that this school will be of great interest to a large number of young scientists as well as academic and industrial researchers.
To start, different topics of ML for chemical reactions will be covered: prediction of chemical reactions and space exploration, single-step models and optimization, planning of synthesis and optimization of reactions, machine learning potentials and description of reaction mechanisms, as well as extensions of these methods to biological systems. The participants will then be introduced to advanced ML methods (e.g. graph neural networks, Transformers) that are currently being applied to chemical reactivity problems, as well as methods that could gain traction in this field in the coming years. Finally, a session will be dedicated to explainable AI and Large Language Models.
The participants will benefit from the know-how of world-renowned experts in reaction prediction and AI (explainable AI in particular) from Europe and the United States, and will gain insight into methodologies currently being pursued in the computing, pharmaceutical and life-science industries (including AstraZeneca, IBM and Reaxys). This will be invaluable both for students who plan on building a career in industry and for established scientists looking to use automated synthesis prediction in their work. Three hands-on sessions will allow the participants to become familiar with common ML libraries.
France
Rodolphe Vuilleumier (Sorbonne Université - ENS-PSL) - Organiser
Germany
Katya Ahmad (Helmholtz Munich) - Organiser
Igor Tetko (Institute of Structural Biology) - Organiser
Switzerland
Philippe Schwaller (EPFL) - Organiser