Machine Learning–Assisted Sampling
Location: CECAM-ES
Organisers
Over the past decades, Artificial Intelligence (AI) and Machine Learning (ML) have taken a huge leap forward along many avenues [1], whether in classification and supervised learning [2] or in generative models and unsupervised learning [3]. During this time, many applications have established themselves in applied fields (for instance, generating images [4], developing computer code [5], or automatically classifying many kinds of objects), and ML has also been developed and widely adopted in scientific research [6]. Among its many uses, one that stands out in recent years is the improvement of simulations of physical systems and, in particular, of sampling from complex distributions [7].
Sampling is ubiquitous in the simulation of physical systems. In molecular simulations, sampling is used to simulate the interaction between the particles and the environment [8]. In complex systems, the behaviour of agent or spin variables is stochastic because of their contact with a thermal bath; sampling is then a concrete way to investigate the equilibrium physics. Finally, quantum systems are stochastic by nature, and their simulation thus requires sampling techniques [9].
Moreover, physical systems often consist of a large number of coupled elements, whose interactions give rise to highly complex probability distributions that are difficult to analyze or sample from. Sampling from such distributions is typically computationally expensive. First, in many cases, particles must be updated sequentially, which prevents efficient parallelization. Second, move proposals are often governed by local rules—typically derived from the detailed balance condition—so in systems with long-range correlations, equilibration can be extremely slow. Third, free-energy barriers between macroscopic states can lead to relaxation times that grow exponentially with the barrier height, which frequently scales with system size. As a result, generating ergodic Monte Carlo or Molecular Dynamics trajectories becomes a formidable task.
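To make the first two difficulties concrete, here is a minimal sketch of a sequential, local Monte Carlo update. It assumes a periodic 1D Ising chain with single-spin-flip Metropolis moves; the model, the function names and the parameter values are illustrative choices, not taken from the proposal.

```python
import math
import random


def ising_energy(spins, coupling=1.0):
    """Energy of a periodic 1D Ising chain: E = -J * sum_i s_i * s_{i+1}."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def metropolis_sweep(spins, beta, rng, coupling=1.0):
    """One sweep of single-spin-flip Metropolis updates, applied in place.

    Sites are visited one after another: each proposal depends on the
    current state of its neighbours, so the updates are sequential.
    Returns the number of accepted flips.
    """
    n = len(spins)
    accepted = 0
    for i in range(n):
        left, right = spins[(i - 1) % n], spins[(i + 1) % n]
        # Energy change if spin i is flipped (only its two bonds change).
        delta_e = 2.0 * coupling * spins[i] * (left + right)
        # Metropolis rule: accept with probability min(1, exp(-beta * dE)).
        if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
            spins[i] = -spins[i]
            accepted += 1
    return accepted


# Illustrative run (chain length, temperature and seed are arbitrary).
rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(64)]
for _ in range(200):
    metropolis_sweep(spins, beta=1.0, rng=rng)
print("energy per spin:", ising_energy(spins) / len(spins))
```

Each move changes a single spin, so large correlated regions can only be rearranged through many successive local updates, which is the slowdown the paragraph above describes.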
In recent years, many research groups have developed strategies to enhance simulations and accelerate the convergence of Monte Carlo methods by incorporating machine learning techniques [10, 11, 12]. For instance, variational approaches aim to approximate target distributions using neural networks that are more tractable than the true distributions themselves [13]. Other efforts focus on designing global or more efficient move proposals for Monte Carlo sampling by leveraging ML-based strategies [14, 15]. Reinforcement learning has also been employed to improve optimization within Monte Carlo frameworks [16]. At the opposite end of the spectrum, recent studies have shown how Bayesian estimators can be used to automatically infer scientific models directly from data [17], or to assign classical probabilistic models to complex many-body quantum wave functions [18].
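As an illustration of a global move proposal, the sketch below uses an independence Metropolis-Hastings sampler in which a tractable model proposes configurations anywhere in the state space. The double-well target and the two-component Gaussian mixture are hypothetical stand-ins: in the methods cited above the proposal would be a trained generative model, which this sketch does not attempt to reproduce.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a double well with modes at x = -2 and x = +2."""
    return -((x * x - 4.0) ** 2)


def sample_proposal(rng, centre=2.0, width=0.25):
    """Draw from a symmetric two-Gaussian mixture (stand-in for a learned model)."""
    sign = rng.choice((-1.0, 1.0))
    return rng.gauss(sign * centre, width)


def log_proposal(x, centre=2.0, width=0.25):
    """Log-density of the mixture, up to a constant that cancels in the MH ratio."""
    a = -0.5 * ((x - centre) / width) ** 2
    b = -0.5 * ((x + centre) / width) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def independence_mh(n_steps, rng, x0=2.0):
    """Metropolis-Hastings with proposals drawn independently of the current state."""
    x = x0
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = sample_proposal(rng)
        # Acceptance ratio p(y) q(x) / (p(x) q(y)), evaluated in log space.
        log_ratio = (log_target(y) - log_target(x)) + (log_proposal(x) - log_proposal(y))
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


# Illustrative run (step count and seed are arbitrary).
samples, rate = independence_mh(20000, random.Random(0))
left = sum(1 for x in samples if x < 0.0) / len(samples)
print("acceptance rate:", rate, "fraction in left well:", left)
```

Because each proposal can land in either well, the chain crosses the barrier in a single accepted step, whereas a local random-walk move would have to climb it. The acceptance step keeps the target distribution exact even when the proposal is only an approximation.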
These methods are now widely used across a broad range of contexts. Our proposal offers the opportunity to bring together a strong community of researchers, physicists from potentially very different fields (condensed matter, particle physics, complex systems, etc.), who can not only leverage these novel techniques to perform faster and more accurate simulations, but also contribute to the development of new ML-based methods and rigorous benchmarks for testing their efficacy on challenging scientific problems. We firmly believe that this school is particularly timely, as the number of machine learning and neural network-based techniques is growing rapidly, and they consistently yield robust results that demonstrate their potential as powerful alternatives to traditional methods.
Our proposal thus aims to offer lectures at the forefront of ML-assisted sampling methods, with the primary goal of preparing the next generation of PhD students and postdoctoral researchers to harness modern techniques that are poised to fundamentally transform numerical scientific practice. The proposed list of topics includes:
- Auto-Regressive Models and Tensor Networks: AR models are a class of generative models that generate new samples variable by variable, using a tractable expression for each conditional probability distribution [19,20]. A great advantage of this approach is that it provides a tractable candidate model for the data of the physical system under consideration. Tensor networks provide a related variational approach, with applications typically in quantum physics [21,22]. On these topics, we have invited Pan Zhang, a professor at the Department of Theoretical Physics at the Chinese Academy of Sciences. Prof. Zhang is an expert on various sampling techniques: he has turned toward AR models in the last decade and also works on tensor network states applied to quantum systems.
- Flow-Aided Methods for Sampling: In the last decade, the spectacular development of generative models has attracted the attention of many researchers in the field of Monte Carlo simulation. These approaches can produce better models to sample from, provide new ways of proposing stochastic moves for the simulation algorithm, or approximate the empirical distribution of interest in order to reduce the mixing time of Monte Carlo simulations [11,23,24]. On this topic, we have invited Marylou Gabrié, a young professor at the École Normale Supérieure in Paris. Prof. Gabrié is an expert on using ML methods to boost Monte Carlo simulations and on flow-matching models in ML.
- Neural Networks for Quantum-System Simulation: The simulation of quantum systems is notoriously challenging. Classical systems can already be difficult, since the configuration space grows exponentially with the number of particles; quantum systems are in general more complex and therefore even harder to simulate at large system sizes. Approaches based on neural networks have been developed in recent years and now perform very well [18,25,26]. On this topic, we have invited Zakari Denis, a postdoctoral researcher who works on these methods at EPFL in Switzerland, in the group of Giuseppe Carleo, a renowned expert in the field.
- Learning Mathematical Models for Inference: When performing simulations, scientists often have to abandon analytical and explainable formulations of a problem in favour of empirical methods that often cannot be related to mathematical formulas and are therefore of little help for further analytical treatment [17,27,28,29]. Important new efforts are being made to buck this trend, so we have invited Roger Guimerà, a senior scientist who has made significant contributions to methods that infer mathematical expressions from data, based on a Bayesian formulation of the problem and therefore amenable to adjustable parametrization.
- Learning Dynamical Laws from Trajectories Using GNNs: A fundamental challenge in the simulation of physical systems is understanding the forces that govern their stochastic dynamics, especially when explicit analytical models are out of reach. Recent advances in ML have opened new avenues to learn such models directly from data [30]. On this topic, we have invited Miguel Ruiz-Garcia, a young Ramón y Cajal fellow in Madrid (the Spanish tenure track), whose work exemplifies the use of ML to accelerate and deepen our understanding of complex systems. He developed ActiveNet, a Graph Neural Network framework capable of inferring both the deterministic and the stochastic components of the forces driving interacting active particles, directly from observed trajectories [30]. This includes not only active and inter-particle forces but also torques and diffusion coefficients. His approach allows the full dynamical equations of motion to be reconstructed from experimental data, providing an innovative sampling tool for systems out of equilibrium. His work offers a prime example of how ML can be used not only to sample more efficiently but also to learn the very rules that govern physical systems.
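The variable-by-variable generation described under Auto-Regressive Models above can be sketched as follows. This is a toy example under stated assumptions: the conditionals are a hand-set logistic function of the earlier spins, and the names `weights` and `bias` and their values are hypothetical placeholders for what a trained network would supply.

```python
import math
import random


def conditional_up(prefix, weights, bias):
    """P(s_i = +1 | s_1..s_{i-1}) from a logistic function of the earlier spins."""
    i = len(prefix)
    field = bias[i] + sum(weights[i][j] * prefix[j] for j in range(i))
    return 1.0 / (1.0 + math.exp(-2.0 * field))


def sample(weights, bias, rng):
    """Ancestral sampling: draw spins one at a time and accumulate log p(s)."""
    spins, log_prob = [], 0.0
    for _ in range(len(bias)):
        p_up = conditional_up(spins, weights, bias)
        s = 1 if rng.random() < p_up else -1
        log_prob += math.log(p_up if s == 1 else 1.0 - p_up)
        spins.append(s)
    return spins, log_prob


def log_probability(spins, weights, bias):
    """Exact log p(s) = sum_i log p(s_i | s_<i) for any configuration."""
    total = 0.0
    for i, s in enumerate(spins):
        p_up = conditional_up(spins[:i], weights, bias)
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total


n = 4
# Hypothetical parameters: each spin couples only to its predecessor.
weights = [[0.5 if j == i - 1 else 0.0 for j in range(i)] for i in range(n)]
bias = [0.0] * n
spins, log_p = sample(weights, bias, random.Random(0))
print(spins, log_p)
```

Every sample comes with its exact, normalised probability, and samples are independent of one another. This is the sense in which such a model is tractable.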
References
Aurélien Decelle (Universidad Politécnica de Madrid) - Organiser
Daniel Matoz (Complutense University of Madrid) - Organiser
Beatriz Seoane (Universidad Complutense de Madrid) - Organiser
David Yllanes (Universidad de Zaragoza) - Organiser
Switzerland
Elisabeth Agoritsas (University of Geneva) - Organiser
