Co-evolutionary methods for the prediction and design of protein structure and interactions
- Alessandro Barducci (Centre de Biochimie Structurale, France)
- Paolo De Los Rios (Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland)
Recent advances in genomic sequencing technologies have led to an astonishing increase in the size of protein sequence databases. This progress has greatly renewed interest in computational methods that exploit sequence datasets to explore the structural and functional properties of proteins. In particular, the analysis of correlated mutations has steadily gained popularity over the last decade and is currently one of the most promising tools in structural and computational biology. In a nutshell, co-evolutionary analysis relies on the observation that pairs of residues that display highly correlated mutations tend to be located in close proximity in the three-dimensional structure of the protein. Notably, Direct Coupling Analysis (DCA) has emerged as the most promising method to detect the truly interacting pairs of residues by efficiently disentangling direct correlations from less informative, mediated ones [2,3]. Since its inception, DCA has seen different methodological incarnations, ranging from analytical mean-field strategies to more computationally demanding pseudo-likelihood approximations, up to full-fledged Boltzmann learning. The combination of co-evolutionary information with molecular modeling techniques has already allowed the accurate determination of high-resolution protein structures [3,4], even for experimentally challenging targets [5,6], and it can be extended to other biomolecules, such as RNA. Beyond purely structural studies, it was recognized early on that correlated mutations encode multiple conformations, thus paving the way to investigating evolutionarily conserved conformational dynamics [8-10]. Furthermore, protein-protein interactions represent an exciting application of DCA, where the simultaneous problem of predicting the interaction partners alongside the structure of the complexes poses a fascinating challenge [11,12].
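As a rough illustration of the mean-field flavour of DCA mentioned above, the sketch below estimates couplings by inverting a regularised covariance matrix computed from a toy alignment, then scores each residue pair by the Frobenius norm of its coupling block. It is a minimal sketch under stated assumptions, not the implementation used in the cited works: the function name `mean_field_dca`, the integer encoding of the alignment, the pseudocount value, and the omission of sequence reweighting and of any score correction are all choices made here for brevity.

```python
import numpy as np

def mean_field_dca(msa, q, pseudocount=0.5):
    """Toy mean-field DCA (hypothetical helper, for illustration only).

    msa: (n_sequences, length) integer array with entries in [0, q).
    Returns a (length, length) matrix of pair scores.
    """
    n_seq, length = msa.shape
    onehot = np.eye(q)[msa]                                # (n_seq, length, q)

    # Single-site and pairwise frequencies, regularised with a pseudocount.
    fi = (1 - pseudocount) * onehot.mean(axis=0) + pseudocount / q
    fij = np.einsum("nia,njb->ijab", onehot, onehot) / n_seq
    fij = (1 - pseudocount) * fij + pseudocount / q**2
    for i in range(length):
        # Same-site blocks must agree with the single-site frequencies.
        fij[i, i] = np.diag(fi[i])

    # Connected correlations; the last state is dropped to fix the gauge.
    c = fij - fi[:, None, :, None] * fi[None, :, None, :]
    c = c[:, :, : q - 1, : q - 1]
    c = c.transpose(0, 2, 1, 3).reshape(length * (q - 1), length * (q - 1))

    # Mean-field approximation: couplings are minus the inverse correlations.
    couplings = -np.linalg.inv(c)
    j = couplings.reshape(length, q - 1, length, q - 1).transpose(0, 2, 1, 3)

    scores = np.sqrt((j ** 2).sum(axis=(2, 3)))            # Frobenius norm
    np.fill_diagonal(scores, 0.0)
    return scores

# Toy alignment (assumed data): columns 0 and 1 co-vary, columns 2 and 3 are random.
rng = np.random.default_rng(0)
col0 = rng.integers(0, 3, size=500)
toy_msa = np.stack(
    [col0, (col0 + 1) % 3, rng.integers(0, 3, size=500), rng.integers(0, 3, size=500)],
    axis=1,
)
scores = mean_field_dca(toy_msa, q=3)
top_pair = np.unravel_index(scores.argmax(), scores.shape)
```

In this toy case the co-varying pair of columns should receive the highest score; on real alignments the ranking is far noisier, and published pipelines add sequence reweighting and further corrections.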
Last but not least, co-evolutionary techniques offer an unprecedented opportunity to use the inferred statistical models for bioengineering purposes, by generating sequences that are optimized for some design requirements while respecting all the pairwise statistical couplings imposed by evolution.
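To make the idea of generating sequences from an inferred model concrete, the sketch below draws a sequence from a Potts-like model with Metropolis Monte Carlo. It assumes that fields `h` and couplings `J` have already been inferred; the function names, the array layout, and the inverse-temperature parameter `beta` are hypothetical choices for illustration, and any design requirement would have to be added as an extra term or constraint that is not shown here.

```python
import numpy as np

def potts_energy(seq, h, J):
    """E(s) = -sum_i h[i, s_i] - sum_{i<j} J[i, j, s_i, s_j]."""
    length = len(seq)
    energy = -h[np.arange(length), seq].sum()
    for i in range(length):
        for j in range(i + 1, length):
            energy -= J[i, j, seq[i], seq[j]]
    return energy

def sample_sequence(h, J, n_steps, rng, beta=1.0):
    """Metropolis sampling from P(s) proportional to exp(-beta * E(s)).

    h: (length, q) fields; J: (length, length, q, q) couplings (only i < j is read).
    """
    length, q = h.shape
    seq = rng.integers(0, q, size=length)
    energy = potts_energy(seq, h, J)
    for _ in range(n_steps):
        # Propose a single-site mutation.
        proposal = seq.copy()
        proposal[rng.integers(length)] = rng.integers(q)
        new_energy = potts_energy(proposal, h, J)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        delta = new_energy - energy
        if delta <= 0 or rng.random() < np.exp(-beta * delta):
            seq, energy = proposal, new_energy
    return seq

# Toy model (assumed parameters): a strong field favours state 0 at every site.
h_toy = np.zeros((5, 4))
h_toy[:, 0] = 20.0
J_toy = np.zeros((5, 5, 4, 4))
designed = sample_sequence(h_toy, J_toy, n_steps=2000, rng=np.random.default_rng(0))
```

Recomputing the full energy at every step keeps the sketch short; a practical sampler would update only the terms that involve the mutated site.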
Despite their early successes and impressive potential, co-evolutionary approaches are still in their infancy and face several challenges that require the combined efforts of a broad community, encompassing statistical physics, molecular modeling and bioinformatics. Outstanding open questions include:
- How are DCA results affected by the phylogenetic breadth and depth of the multiple sequence alignments?
- Can we use deep learning approaches to further improve contact prediction?
- How can we estimate the likelihood that a predicted contact is correct?
- What is the correlation between the co-evolutionary statistical coupling and the physical interaction energy?
- How can we efficiently disentangle structural heterogeneity when decoding co-evolutionary information?
- How can we combine DCA with enhanced sampling strategies for determining conformational ensembles?
- Can we elucidate complex protein-protein interaction networks on the basis of sequence covariation data?
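The first question, on the breadth and depth of alignments, is often approached in practice by down-weighting near-duplicate sequences and reporting an effective number of sequences. The sketch below shows one common form of this reweighting; the function name `sequence_weights`, the 80% identity threshold, and the toy alignment are assumptions chosen for illustration rather than a prescription.

```python
import numpy as np

def sequence_weights(msa, identity_threshold=0.8):
    """Weight each sequence by 1 / (number of sequences at or above the identity threshold).

    msa: (n_sequences, length) integer array. Each sequence counts itself as a neighbour.
    """
    msa = np.asarray(msa)
    # Fraction of identical positions for every pair of sequences.
    identity = (msa[:, None, :] == msa[None, :, :]).mean(axis=2)
    neighbours = (identity >= identity_threshold).sum(axis=1)
    return 1.0 / neighbours

# Toy alignment (assumed data): three near-identical sequences and one distant one.
toy_alignment = np.array([
    [0, 1, 2, 3, 4, 5],
    [0, 1, 2, 3, 4, 5],
    [0, 1, 2, 3, 4, 0],
    [5, 4, 3, 2, 1, 1],
])
weights = sequence_weights(toy_alignment)
m_eff = weights.sum()  # effective number of sequences
```

Here the three similar sequences share a single unit of weight, so the alignment of four sequences carries an effective depth of about two. The pairwise comparison scales quadratically with the number of sequences, which is acceptable for a sketch but not for large alignments.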
The workshop will consist of six half-day sessions spread over four days. We plan to have 22 talks (18 invited and 4 contributed) and a poster session open to all participants. Following the CECAM guidelines, we have structured the workshop to maximize the time devoted to scientific discussion and to foster exchange between researchers belonging to diverse scientific communities.
[1] Göbel et al. Proteins 18(4):309-17 (1994).
[2] Weigt et al. PNAS 106(1):67-72 (2009).
[3] Marks et al. PLoS One 6(12):e28766 (2011).
[4] Ovchinnikov et al. eLife 4:e09248 (2015).
[5] Hopf et al. Cell 149(7):1607-21 (2012).
[6] Malinverni et al. eLife 6:e23471 (2017).
[7] De Leonardis et al. Nucleic Acids Res. 43(21):10444-55 (2015).
[8] Morcos et al. PNAS 110(51):20533-8 (2013).
[9] Sutto et al. PNAS 112(44):13567-72 (2015).
[10] Malinverni et al. PLoS Comput Biol. 11(6):e1004262 (2015).
[11] Bitbol et al. PNAS 113(43):12180-12185 (2016).
[12] Gueudré et al. PNAS 113(43):12186-12191 (2016).