Factors of two in biomolecular simulation: celebrating and commemorating
Biomolecular simulations were honored in 2013 by a Nobel Prize to Karplus, Levitt and Warshel, celebrating work going back to the 1970s. The year 1976 saw the pioneering QM/MM work on lysozyme, as well as the first MD simulation of a protein by Karplus and his coworkers. 2020 and 2021 mark Martin Karplus's 90th birthday and the 45th anniversary of that MD simulation. It is a good time to honor Martin and commemorate the 1976 CECAM workshop once more. Thus, the workshop we propose will look back. But it will mainly look forward, and discuss the latest advances and directions for the future. Indeed, in just the short time since 2013, the power and scope of biomolecular simulations have increased notably. We will address six important, inter-related areas. Three are method areas: force fields, machine learning, landscape sampling. Three are application areas: enzymes, biomachines, drug design. They represent frontiers that face in different directions but encompass shared problems and goals. We discuss them below, and list the people we plan to invite for each topic.
Force field advances and advanced force fields
Biomolecular force fields have always developed in two directions: increasing generality and increasing accuracy. Thus, additive force fields were developed first for small molecules, then proteins, then nucleic acids, lipids, sugars. For drug design, they must be extended to very large sets of drug-like molecules. This was accomplished by the CHARMM General Force Field (CGenFF), for example, which was published in 2010. For a long time, biomolecular simulations were based on additive, fixed point-charge force fields. They were highly successful in predicting important properties, such as protein folding and ligand binding. However, more accurate treatment of electrostatic interactions has long been recognised as key to modelling many properties and situations. This includes ionic groups that interact in heterogeneous environments such as macromolecular interfaces, where electronic polarizability can play an important role. Therefore, more sophisticated functional forms of the electrostatic interactions have been proposed. These include the use of distributed multipoles to provide a better representation of the molecular electrostatic potential, and models that explicitly represent electronic polarization through induced dipoles, Drude pseudo-particles, or fluctuating atomic charges. Note that both the Drude and AMOEBA polarizable force fields for proteins were published in 2013.
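To make the contrast between additive and polarizable electrostatics concrete, here is a minimal sketch in Python. It is not taken from any of the force fields named above. The function names, the single isolated polarizable site, and the choice of units (kcal/mol, Angstrom, elementary charge, polarizability in cubic Angstrom) are all assumptions made for illustration. The first function sums pairwise Coulomb terms for fixed point charges; the second gives the textbook energy, -(1/2) alpha E^2, of one induced dipole in the field of a single point charge.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2); a commonly used value, assumed here.
COULOMB_KCAL = 332.0637


def coulomb_energy(charges, coords):
    """Additive model: sum of q_i*q_j/r_ij over all pairs of fixed point charges."""
    energy = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            energy += COULOMB_KCAL * charges[i] * charges[j] / r
    return energy


def induced_dipole_energy(alpha, q, r):
    """Polarization energy -(1/2)*alpha*E^2 of one isolated polarizable site
    (polarizability alpha, in Angstrom^3) at distance r from a point charge q.
    Hypothetical helper: real polarizable force fields solve for many mutually
    interacting dipoles, which this sketch does not attempt."""
    field = q / r**2  # field magnitude in e/Angstrom^2
    return -0.5 * COULOMB_KCAL * alpha * field**2


# Usage: an ion pair 3 Angstrom apart, and a polarizable site 3 Angstrom from a unit charge.
pair = coulomb_energy([1.0, -1.0], [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
polarization = induced_dipole_energy(1.0, 1.0, 3.0)
```

The polarization term is always stabilizing and falls off as 1/r^4, so it matters most at short range, for example for ionic groups at macromolecular interfaces.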
Beyond polarizable force fields, QM/MM methods with explicit coupling of QM and MM regions have also been developed, allowing the study of enzyme reactions.
People: B Brooks (NIH), T Head-Gordon (Berkeley), A MacKerell (Maryland), JP Piquemal (Sorbonne), J Ponder (St Louis).
Enzymes: mechanisms and design
Enzymes are an essential area of application, since mechanisms are hard to elucidate by experiments alone and engineered enzymes are a major tool in biotechnology and green chemistry. The main routes are QM/MM on the one hand and semi-classical Empirical Valence Bond (EVB) methods on the other. The QM/MM field has been transitioning in recent years to DFT-based methods combined with MD, thanks to rapidly increasing computer power in supercomputer centers. EVB is much less expensive and has been used for decades to perform alchemical free energy calculations of substrate and transition state binding. EVB is also of current interest, as increased computer power allows thorough sampling and consideration of many relevant substates such as protonation, redox, or conformational states involved in allostery and induced fit.
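As a reminder of why EVB is inexpensive, the sketch below evaluates the adiabatic ground-state energy of a two-state EVB Hamiltonian, which is the lower eigenvalue of a 2x2 matrix with diabatic energies H11 and H22 and coupling H12. This is the generic textbook expression, not the implementation of any particular EVB code; the harmonic diabatic states, their parameters, and all function names are hypothetical choices for illustration.

```python
import math


def evb_ground_state(h11, h22, h12):
    """Lower eigenvalue of the symmetric 2x2 matrix [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    gap = h11 - h22
    return mean - 0.5 * math.sqrt(gap * gap + 4.0 * h12 * h12)


def harmonic(x, x0, k, offset=0.0):
    """Hypothetical diabatic state: a parabola centred at x0 with force constant k."""
    return 0.5 * k * (x - x0) ** 2 + offset


# Usage: ground-state profile along a model reaction coordinate from -1.5 to 1.5.
xs = [-1.5 + 0.1 * i for i in range(31)]
profile = [
    evb_ground_state(harmonic(x, -1.0, 10.0), harmonic(x, 1.0, 10.0, offset=-2.0), 3.0)
    for x in xs
]
```

Each energy evaluation costs a force field call per diabatic state plus a square root, which is what makes extensive sampling of many substates affordable.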
Applications to large molecular machines such as the ribosome are another direction.
People: J Aqvist (Uppsala), J Gao (Minnesota), A Mulholland (Bristol), S Kamerlin (Uppsala), U Rothlisberger (Zurich).
Machine learning
Machine learning developed dramatically in biology throughout the genomic revolution. Today, it is spreading into more and more fields, including chemistry and biophysics. Deep learning has been applied to force field tuning, to predict protein structures, to choose collective variables for metadynamics or sampling, and to make sense of large simulation data sets by extracting interesting features. Other classification methods are being used to extract information from sequence alignments and identify key amino acids for function. As we simulate larger and more complex systems, artificial intelligence will have a greater role to play in setting up and interpreting simulations.
People: F Noe (Berlin), M Parrinello (Zurich), L Delemotte (Sweden)
Biomachines
Experimental structural biology has made enormous strides in the last few years, with cryo-electron microscopy structures of large cellular machines being obtained at atomic resolution. This opens important new possibilities for simulations, and creates new needs, including needs for coarse-grained models, multi-scale models, and powerful sampling methods. Relevant systems include the ribosome, cellular motors, and entire cellular compartments ("whole cell" simulations).
People: M Feig (Michigan State), H Grubmuller (Gottingen), G Hummer (Frankfurt), S Marrink (Groningen), G Voth (Chicago)
Landscape sampling
MD directly probes the conformational landscape. However, the timescale available with plain MD is a few microseconds at most. For many or most functional motions, longer timescales are involved. To sample these, specific methods are used, such as coarse-graining, importance sampling, or kinetic models like Markov state models. A selection of these topics will be covered by our speakers.
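As an illustration of the kinetic models mentioned above, here is a minimal sketch of the basic Markov state model estimate: count transitions in a discretized trajectory at a chosen lag time, row-normalize to get a transition matrix, and convert the slowest relaxation eigenvalue into an implied timescale. It assumes the trajectory has already been assigned to discrete states, uses a simple non-reversible estimator, and handles only the two-state case analytically. The function names are hypothetical, and production work would rely on a dedicated package.

```python
import math


def count_matrix(traj, n_states, lag):
    """Count observed transitions traj[t] -> traj[t + lag]."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(traj) - lag):
        counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize counts; a state never visited is given a self-transition of 1."""
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            matrix.append([1.0 if j == i else 0.0 for j in range(len(row))])
        else:
            matrix.append([c / total for c in row])
    return matrix


def two_state_timescale(matrix, lag):
    """Implied timescale -lag/ln(lambda_2), where lambda_2 = T00 + T11 - 1
    is the second eigenvalue of a 2x2 row-stochastic matrix."""
    lam = matrix[0][0] + matrix[1][1] - 1.0
    if lam <= 0.0:
        return None  # no positive relaxation eigenvalue at this lag time
    return -lag / math.log(lam)


# Usage on a short toy trajectory (state labels 0 and 1, lag of one frame).
toy = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]
T = transition_matrix(count_matrix(toy, 2, 1))
timescale = two_state_timescale(T, 1)
```

The timescale is returned in units of trajectory frames. Many short trajectories can contribute to the same count matrix, which is how such models reach timescales beyond those of any single plain MD run.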
People: R Best (NIH), C Post (Purdue), E Rosta (London), B Roux (Chicago), M Vendruscolo (Cambridge)
Drug design
Drug design is a key target application. While fast and cheap methods have long been preferred, detailed and advanced simulations are increasingly relevant, as force fields expand and computer power increases. Alchemical free energy simulations are being actively developed for drug design in small and large companies, such as Novartis. They have been shown to assist lead selection and optimization in a way that is becoming competitive with the expertise of human chemists. Their application to ionic ligands is a rather recent development that has benefited from theoretical advances. Smart sampling methods such as adaptive landscape flattening and orthogonal path sampling are under active development. In addition to free energy simulations, high-throughput methods like protein and miniprotein design are relevant for therapeutic peptides. Recently designed miniproteins have effectively targeted SARS-CoV-2 proteins in vitro.
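The alchemical free energy simulations mentioned above rest on simple estimators. The sketch below shows the oldest one, one-sided exponential averaging (the Zwanzig formula), dF = -kT ln <exp(-dU/kT)>, where dU is the energy difference between the end state and the reference state, sampled in the reference state. It is a generic illustration, not the protocol of any company or program named here. The function name, the temperature default, and the kcal/mol units are assumptions, and practical workflows typically use several intermediate states and more robust estimators.

```python
import math

# Boltzmann constant in kcal/(mol*K).
KB_KCAL = 0.0019872041


def zwanzig_free_energy(delta_u, temperature=300.0):
    """Free energy difference from energy differences delta_u (kcal/mol),
    each evaluated on a configuration sampled from the reference state."""
    kt = KB_KCAL * temperature
    exponents = [-du / kt for du in delta_u]
    shift = max(exponents)  # log-sum-exp shift to avoid overflow
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


# Usage with made-up energy differences (illustrative numbers only).
samples = [0.0, 1.0, 2.0]
estimate = zwanzig_free_energy(samples)
```

The exponential average is dominated by the lowest energy differences, so the estimate always lies at or below the mean of dU and converges slowly when the two states overlap poorly. This is one reason the smart sampling methods listed above are needed.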
People: M Karplus (Harvard), G Bowman (St Louis), A Caflisch (Zurich), W Jorgensen (Yale), R Wade (Heidelberg)
Thomas Simonson (Ecole Polytechnique, Palaiseau) - Organiser
Charles Brooks (University of Michigan) - Organiser