CECAM - Machine Learning–Assisted Sampling

Registration deadline for abstracts: Monday March 9, 2026 [please submit the abstract for a poster presentation in the "motivation section" upon registration].

Notification of accepted participants: Friday March 27, 2026

Fees: The registration fee is €100, which covers lunch and coffee breaks throughout the five-day school.

Note: This is an onsite-only school.

Lecturers:

Federico Ricci-Tersenghi (Auto-Regressive Models)
Miguel Ruiz-Garcia (Learning Dynamical Laws from Trajectories Using GNNs)
Zakari Denis (Neural Networks for Quantum-System Simulation)
Roger Guimerá (Learning Mathematical Models for Inference)
Xavier Waintal (Tensor Networks)

Over the past decades, Artificial Intelligence (AI) and Machine Learning (ML) have taken a huge leap forward along many avenues [1], whether in classification and supervised learning [2] or in generative models and unsupervised learning [3]. During this time, many applications have established themselves in applied fields (for instance, generating images [4], developing computer code [5], or automatically classifying many kinds of objects), and ML has also been developed and widely adopted in scientific research [6]. Among these tasks, one that has stood out recently is the use of ML methods to improve the simulation of physical systems and, in particular, to help in sampling from complex distributions [7,8].

Sampling is ubiquitous in the simulation of physical systems. In molecular simulations, sampling is used to simulate the interaction between the particles and the environment [30]. In complex systems, the behaviour of agent or spin variables is stochastic because of their contact with a thermal bath; sampling is then a concrete way to investigate the equilibrium physics. Finally, quantum systems are stochastic by nature, and their simulation thus requires sampling techniques [31].

Moreover, physical systems often consist of a large number of coupled elements, whose interactions give rise to highly complex probability distributions that are difficult to analyze or sample from. Sampling from such distributions is typically computationally expensive. First, in many cases, particles must be updated sequentially, which prevents efficient parallelization. Second, move proposals are often governed by local rules—typically derived from the detailed balance condition—so in systems with long-range correlations, equilibration can be extremely slow. Third, free-energy barriers between macroscopic states can lead to relaxation times that grow exponentially with the barrier height, which frequently scales with system size. As a result, generating ergodic Monte Carlo or Molecular Dynamics trajectories becomes a formidable task.
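To make the cost of local, sequential updates concrete, here is a minimal sketch of single-spin-flip Metropolis sampling. It assumes a periodic 1D Ising chain with energy E = -J Σ s_i s_{i+1}; the model, the function names, and the parameter values are illustrative choices, not taken from the school material.

```python
import math
import random


def metropolis_sweep(spins, beta, coupling=1.0, rng=random):
    """One sweep of sequential single-spin-flip updates on a periodic chain.

    Assumed toy model: E = -coupling * sum_i s_i * s_{i+1}, with s_i = +/-1.
    Returns the number of accepted flips.
    """
    n = len(spins)
    accepted = 0
    for i in range(n):  # spins are visited one after another (sequential)
        left = spins[(i - 1) % n]
        right = spins[(i + 1) % n]
        # Energy change if spin i is flipped; it only involves the neighbours.
        delta_e = 2.0 * coupling * spins[i] * (left + right)
        # Metropolis rule: accept with probability min(1, exp(-beta * delta_e)).
        # Each single-spin update satisfies detailed balance with respect to
        # the Boltzmann distribution.
        if delta_e <= 0.0 or rng.random() < math.exp(-beta * delta_e):
            spins[i] = -spins[i]
            accepted += 1
    return accepted


def magnetization(spins):
    """Average spin value of a configuration."""
    return sum(spins) / len(spins)


# Illustrative run: 64 spins, 200 sweeps at inverse temperature beta = 0.5.
rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(64)]
for _ in range(200):
    metropolis_sweep(spins, beta=0.5, rng=rng)
```

Each proposal changes a single variable, so moving between macroscopically different states takes many sweeps. This is the bottleneck that global or learned move proposals aim to reduce.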

In recent years, many research groups have developed strategies to enhance simulations and accelerate the convergence of Monte Carlo methods by incorporating machine learning techniques [9-12]. For instance, variational approaches aim to approximate target distributions using neural networks that are more tractable than the true distributions themselves [13]. Other efforts focus on designing global or more efficient move proposals for Monte Carlo sampling by leveraging ML-based strategies [14,15]. Reinforcement learning has also been employed to improve optimization within Monte Carlo frameworks [16]. At the opposite end of the spectrum, recent studies have shown how Bayesian estimators can be used to automatically infer scientific models directly from data [17], or to assign classical probabilistic models to complex many-body quantum wave functions [18].

These methods are now widely used across a broad range of contexts. Our proposal offers the opportunity to bring together a strong community of researchers, physicists from potentially diverse fields (condensed matter, particle physics, complex systems, etc.), who can not only leverage these novel techniques to perform faster and more accurate simulations, but also contribute to the development of new ML-based methods and rigorous benchmarks for testing their efficacy on challenging scientific problems. We firmly believe that this school is particularly timely, as the number of machine learning and neural network-based techniques is rapidly growing, consistently yielding robust results that demonstrate their potential as powerful alternatives to traditional methods.

Our proposal thus aims to offer lectures at the forefront of ML-assisted sampling methods, with the primary goal of preparing the next generation of PhD students and postdoctoral researchers to harness modern techniques that are poised to fundamentally transform numerical scientific practice. The proposed list of topics includes:

Auto-Regressive Models: AR models are a class of generative models that use conditional distributions to generate new samples variable by variable, with a tractable expression for each conditional probability distribution [19,20]. A great advantage of this approach is that it provides a tractable candidate model for the data of the physical system under consideration. On this topic, we have invited Federico Ricci-Tersenghi, a professor at La Sapienza, University of Rome. Prof. Ricci-Tersenghi is an expert on various Monte Carlo-based sampling techniques [21] and has recently studied the performance of ML-assisted MC sampling based on auto-regressive models [22,23].
Learning Dynamical Laws from Trajectories Using GNNs: A fundamental challenge in the simulation of physical systems is understanding the forces that govern their stochastic dynamics, especially when explicit analytical models are out of reach. Recent advances in ML have opened new avenues to learn such models directly from data [24]. On this topic, we have invited Miguel Ruiz-Garcia, a young Ramón y Cajal fellow in Madrid (the Spanish tenure track), whose work exemplifies the use of ML to accelerate and deepen our understanding of complex systems. He developed ActiveNet, a Graph Neural Network framework capable of inferring both deterministic and stochastic components of the forces driving interacting active particles, directly from observed trajectories [24]. This includes not only active and inter-particle forces but also torques and diffusion coefficients. Miguel’s approach allows the reconstruction of full dynamical equations of motion from experimental data—providing an innovative sampling tool for systems out of equilibrium. His work offers a prime example of how ML can be used not only to sample more efficiently but also to learn the very rules that govern physical systems.
Neural Networks for Quantum-System Simulation: The simulation of quantum systems is notoriously challenging. Classical systems can already be difficult, since the configuration space grows exponentially with the number of particles; quantum systems are in general more complex and therefore harder to simulate at large system sizes. Approaches based on neural networks have been developed in recent years and are now performing very well [18,25,26]. On this topic, we have invited Zakari Denis, a postdoctoral researcher who works on these methods at EPFL in Switzerland, in the group of Giuseppe Carleo, a renowned expert in the field.
Learning Mathematical Models for Inference: When performing simulations, scientists often have to abandon analytical and explainable formulations of the problems in favour of empirical methods that often cannot be related to mathematical formulas and, therefore, are not helpful for further analytical treatment [17,27,28,29]. Important novel efforts are being made to buck this trend, so we have invited Roger Guimerá, a senior scientist who has made significant contributions to methods for inferring mathematical expressions from data. These methods rely on a Bayesian formulation of the problem and are therefore amenable to adjustable parametrization.
Tensor Networks: Tensor networks provide a unifying framework to efficiently represent and manipulate high-dimensional quantum states and classical probability distributions by exploiting their underlying entanglement or correlation structure. Beyond their original development in strongly correlated quantum many-body physics, tensor networks have become powerful tools in machine learning, where they enable compact representations of complex datasets and tractable learning algorithms. They also offer controlled and efficient sampling strategies for high-dimensional distributions, bridging ideas from statistical physics, quantum information, and modern data science. This course will introduce the core concepts and practical methods of tensor networks, and will be delivered by Xavier Waintal, a leading expert in the field[32, 33].
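As an illustration of the auto-regressive idea in the topic list above, here is a minimal sketch of variable-by-variable sampling with an exactly tractable probability. The linear-logistic form of the conditionals, the random parameters, and all function names are assumptions made for this example; they are not the models used in the lectures or in the cited works.

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conditional_up(weights, biases, spins, i):
    """p(s_i = +1 | s_0, ..., s_{i-1}) for a toy linear-logistic model.

    Only the variables with index j < i enter the conditional.
    """
    field = biases[i] + sum(weights[i][j] * spins[j] for j in range(i))
    return sigmoid(field)


def sample(weights, biases, rng):
    """Draw one configuration variable by variable; return it with log p."""
    spins, log_prob = [], 0.0
    for i in range(len(biases)):
        p_up = conditional_up(weights, biases, spins, i)
        s = 1 if rng.random() < p_up else -1
        log_prob += math.log(p_up if s == 1 else 1.0 - p_up)
        spins.append(s)
    return spins, log_prob


def log_probability(weights, biases, spins):
    """Exact log p(s) as the sum of the log conditionals (chain rule)."""
    total = 0.0
    for i, s in enumerate(spins):
        p_up = conditional_up(weights, biases, spins, i)
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total


# Illustrative parameters (random, untrained) for n = 4 binary variables.
rng = random.Random(0)
n = 4
weights = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
biases = [rng.uniform(-1, 1) for _ in range(n)]

spins, log_prob = sample(weights, biases, rng)

# Sanity check: the probabilities of all 2**n configurations sum to one.
total = sum(
    math.exp(log_probability(weights, biases, list(config)))
    for config in itertools.product((-1, 1), repeat=n)
)
```

Because the probability of every sample is available in closed form, a model of this kind can in principle be used as a global proposal inside a Metropolis-Hastings step, with the acceptance ratio computed exactly.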


Location: CECAM-ES

Organisers

References