Decoding Allostery in Protein-Nucleic Acid Complexes: The AI Revolution
Location: CECAM-FR-MOSER
Background. Biomolecules such as proteins and nucleic acids drive virtually all cellular processes by assembling into macromolecular complexes whose functions are intrinsically linked to their structure and dynamics. These complexes undergo conformational changes in response to perturbations such as mutations, post-translational modifications (PTMs), temperature shifts, electrostatic variations, or binding events. The effects of such perturbations can propagate over long distances within a complex via allostery, a fundamental mechanism of molecular regulation. Allosteric signaling underlies critical biological processes and, when dysregulated, contributes to disease. Despite its importance, the molecular principles of allostery remain poorly understood [1], in part because dynamic, ensemble-based behavior is difficult to capture at high resolution.
AI revolution. The advent of deep learning-based structure prediction tools, such as AlphaFold2 [2], AlphaFold3 [3], and RoseTTAFoldNA [4], has dramatically advanced our ability to predict static protein and nucleic acid structures. However, these models offer limited insight into conformational dynamics. The next frontier is to develop dynamics-aware AI methods [5] that can account for the structural heterogeneity underpinning functionally relevant phenomena such as allostery. Progress in this area is hindered by the limited availability of annotated dynamic data. To bridge this gap, community initiatives such as MDRepo and MDDB are building repositories for molecular dynamics (MD) simulation data. At the same time, novel strategies are being explored to extract interpretable features from MD data for use in deep learning frameworks [6].
State of the art. Graph-theoretical models have proven useful for mapping allosteric communication pathways. Community network analysis, for example, represents biomolecules as graphs with residues or nucleotides as nodes and their correlated motions as edges [7]. These models have been used to identify communication networks and shortest paths, and to assess the effects of mutations [8]. While machine learning has been employed to study allostery, the application of graph neural networks (GNNs) remains limited [9], largely due to data scarcity. One recent study combined GNNs with a relational inference model to predict allosteric pathways, but the approach relies on a variational autoencoder architecture that restricts it to proteins of up to ~300 residues [10].
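As an illustration of the graph representation described above, the following minimal sketch builds a correlation network from toy per-residue time series and extracts a shortest communication path. It is not the method of any cited study: the function names, the toy data, the cutoff values, and the edge weight -log|C| (a common choice, assumed here) are all illustrative assumptions. Real analyses would derive correlations from 3D atomic displacements in MD trajectories rather than from scalar series.

```python
import heapq
import math


def pearson(x, y):
    """Pearson correlation between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def build_graph(series, cutoff):
    """Nodes are residues; an edge links two residues whose motions
    correlate with |C| >= cutoff (hypothetical threshold).
    Edge weight is -log|C|, so strong correlation means short distance."""
    nodes = list(series)
    graph = {n: {} for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            c = abs(pearson(series[a], series[b]))
            if c >= cutoff:
                w = -math.log(min(c, 1.0))
                graph[a][b] = w
                graph[b][a] = w
    return graph


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node list or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy "trajectories": B moves with both A and C; D is uncorrelated with all.
toy = {
    "A": [1, 1, -1, -1],
    "B": [2, 0, 0, -2],
    "C": [1, -1, 1, -1],
    "D": [1, -1, -1, 1],
}

print(shortest_path(build_graph(toy, cutoff=0.5), "A", "C"))  # ['A', 'B', 'C']
print(shortest_path(build_graph(toy, cutoff=0.8), "A", "C"))  # None
```

In this toy case the A-B and B-C correlations are about 0.71, so raising the cutoff from 0.5 to 0.8 removes every edge and the path disappears, which shows how sensitive such networks can be to the chosen threshold.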
Objectives. Most graph-theory-based methods for studying allosteric mechanisms rely on arbitrary cutoffs and are tailored to specific case studies, which limits their generalizability. Deep learning models have become adept at extracting structural and sequence-based patterns from proteins, but they largely overlook conformational heterogeneity because dynamic data are scarce and training datasets are limited. In addition, current methods for predicting allosteric pathways rarely address allosteric signaling within protein-nucleic acid complexes, leaving a significant gap. In this workshop, we aim to explore how AI-based approaches can be integrated into a universal framework for describing allosteric mechanisms in diverse macromolecular complexes. Another important topic will be the design of intelligent proteins [11] with specific functions. We also aim to discuss strategies for predicting allosteric binding sites and designing corresponding allosteric modulators. Although ongoing efforts are addressing these challenges, major improvements are still needed. The workshop is organised into four sessions covering these topics: allosteric regulation, AI-based method development, protein-nucleic acid interactions, and force field development.
Marc Baaden (Institut de Biologie Physico-Chimique (IBPC)) - Organiser
Emmanuelle Bignon (LPCT, CNRS, Université de Lorraine) - Organiser
Yasaman Karami (Inria, Université de Lorraine) - Organiser
