Towards in silico biological cell: Bridging experiments and simulations
CECAM-HQ-EPFL, Lausanne, Switzerland
Accurately modeling the spatial heterogeneity and biochemical processes within a bacterial cell over the time frame of a cell cycle requires integration of vast amounts of data from different sources. Knowledge of protein abundances, structures, and spatial distributions of some of the largest biomolecules, like ribosomes and DNA, comes from proteomics, X-ray crystallographic studies, single-molecule experiments and 3D cryoelectron tomography reconstructions. This information allows one to build realistic models of the molecular crowding within the cytoplasm of bacterial cells for a snapshot in time. To predict how cellular processes respond to environmental conditions, we need to know how the macromolecular distributions change with time. As cellular processes are inherently stochastic, the computational study of bacterial cells demands the development of novel methods that can treat not only cellular spatial and temporal scales, but also cell-to-cell variations in the distributions. This workshop provides the opportunity to bring researchers from the fields of molecular simulation and systems biology together with experimentalists who are labeling and imaging cellular components. We envision that this cross-fertilization will lead to significant scientific advances in the computational modeling of an entire cell, its components, cell division and the coupling of the universal processes of transcription and translation to the response of the cell’s metabolic network. The novelty of this workshop lies in its interdisciplinary character.
Molecular simulations allow for direct observation of the diffusion of cellular components on relatively short time scales and provide insight into the role of electrostatic and non-specific interactions between the molecules of the cell [1,2,3]. Knowledge of these interactions is already contributing to recent investigations of in vivo protein folding. Simulations that account for the noise in biochemical reactions, such as those of the lac genetic switch that produces different phenotypic responses in E. coli, can be carried out with reaction-diffusion master equations [4,5]. To investigate the reactions throughout a whole cell for an entire cell cycle, such stochastic simulations require the use of new GPU computational methodology. Calculation of the protein and mRNA distributions in a cell population currently relies on an oversimplification of the motion of the large cellular components, which must be addressed in order to treat cell division. Systems biology studies have provided an in silico model of the core metabolic network for E. coli [6,7]. Using flux balance analysis, the steady-state fluxes through the approximately 2000 reactions can be determined for cells grown in a defined medium, but at the moment there is no computational approach that accounts for the cell-to-cell variation in the protein and mRNA distributions when determining the fluxes.
On the experimental side, the count distributions of over 1000 proteins have been determined with single-molecule sensitivity in single E. coli cells using genetic labeling with various reporter proteins [8,9]. In the case of the lac genetic switch, the spatial distribution of the lac permease membrane protein was also obtained. Sampling from the protein distributions obtained from either single-molecule experiments or stochastic cell simulations may provide the constraints needed for flux balance analyses to generate growth rate distributions for a population of cells.
Ribosomes, the largest molecules in bacterial cells, occupy between 6% and 15% of the cellular volume and are a critical component of the molecular crowding. Cryoelectron microscopy is able to assess the structure of ribosomes in vitro and in situ and to classify subpopulations of ribosomes with different functional states [3,10]. These subpopulations can be related to cellular environments such as proximity to the cell membrane. Such exact spatial distributions of ribosomes allow more realistic stochastic simulations of cellular processes. In addition, there are numerous examples of how atomic-level MD simulations can be used to fit atomic structures of the ribosome into low-resolution electron microscopy maps of various states of protein synthesis. Indeed, simulations have been able to capture the atomic details of a nascent protein exiting the ribosomal large subunit into the SecYE complex in the membrane environment [11,12]. This combination of experimental and computational techniques now paves the way for characterization of the complex cellular process of protein synthesis.
Moving beyond the static picture of the cellular components also requires time-dependent modeling of DNA packing and duplication in cell division [13-15] as well as of the synthesis of proteins and RNA. Again, a combination of single-molecule experiments and computational models is beginning to provide more details regarding the kinetics of cell division. There are challenges to both the simulation methods and the experimental techniques, but this workshop promises to establish the collaborations necessary to make serious advances towards providing in silico models of biological cells at various stages of the cell cycle.
The proposed workshop will bring together researchers from two basic and complementary approaches for simulating the dynamics of cellular processes within the cytoplasm of bacterial cells: particle-based Brownian dynamics (BD) algorithms [S1-S4], which have been used to describe diffusive transport in crowded environments, and reaction-diffusion master equations (RDME) [S5-S8], which have been used to describe the kinetics of genetic switches and biochemical pathways functioning in modeled biological cells. Both approaches have been able to reproduce experimentally observed behavior under modeled in vivo conditions, either of certain macromolecules or of the phenotypic response of the cell. There are, however, limitations in both approaches that call for more efficient algorithms, better implementation on appropriate computer hardware, and interfacing of different modeling approaches, which we hope will be addressed through the workshop. We briefly describe below the recent developments in both approaches and the synergistic interactions we anticipate.
The applications of BD to diffusion in the modeled environment of the bacterial cytoplasm, and the further developments needed, have recently been reviewed by Trylska and co-workers [S1] and by McCammon and Wade [S2]. The BD model of McGuffee and Elcock [S3], published in 2010, contained atomic models of the 50 most abundant proteins and nucleic acids in the E. coli cytoplasm along with the reporter protein GFP. The electrostatic and hydrophobic interactions were based on previous models developed by Wade, but the model neglected the hydrodynamic interactions that had been included in an earlier study on protein-protein association rates [S9]. Nevertheless, this state-of-the-art treatment was able to approximately reproduce experimentally measured diffusion coefficients and the reduction in in vivo protein diffusion. Another important contribution was published in 2010 by Ando and Skolnick [S4], who investigated the hydrodynamic interactions using a system of spheres to model the cytoplasm, with the radii assigned on the basis of computed translational diffusion coefficients. In this study the authors concluded that steric interactions and hydrodynamics dominate the in vivo motion. Even though the treatments of the electrostatic interactions were different, both studies showed the importance of non-specific interactions in describing the in vivo dynamics. The review concludes that while BD simulations should be able to “predict in vivo dynamics there is a high computational cost [due to large sizes and long timescales of dynamical processes in biological cells] that needs to be addressed by the development of more efficient algorithms interfacing different modeling approaches – from atomistic to coarse-grained models, mesoscopic models of biological environments, and appropriate boundary conditions for different cell compartments”.
Advances towards these long-range goals have already begun with the recent publication of new BD packages for coarse-grained many-particle simulations: one from Geyer [S10] that includes fast hydrodynamics, and one from Trylska [S11] that handles thousands of particles and makes use of new GPU technology.
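As a rough illustration of the particle-based BD approach discussed above, the following is a minimal sketch of a single Ermak-McCammon-style update without hydrodynamic interactions, written in plain Python. It is not the model of any of the cited studies: the function names, the reduced units, and all parameter values are assumptions chosen for illustration, and the forces are set to zero so that the result can be checked against the free-diffusion relation MSD = 6Dt.

```python
import math
import random


def bd_step(positions, forces, diffusion, dt, kT, rng):
    """Advance all particles by one BD step (no hydrodynamic interactions).

    Each coordinate moves by a deterministic drift (D / kT) * F * dt plus a
    Gaussian random displacement with variance 2 * D * dt.
    """
    sigma = math.sqrt(2.0 * diffusion * dt)  # std dev of the random kick per axis
    new_positions = []
    for pos, force in zip(positions, forces):
        new_positions.append(tuple(
            x + (diffusion / kT) * f * dt + rng.gauss(0.0, sigma)
            for x, f in zip(pos, force)
        ))
    return new_positions


def mean_squared_displacement(start, end):
    """Average squared distance travelled by the particles."""
    total = 0.0
    for a, b in zip(start, end):
        total += sum((xb - xa) ** 2 for xa, xb in zip(a, b))
    return total / len(start)


def simulate_free_diffusion(n_particles=500, n_steps=100, diffusion=1.0,
                            dt=0.01, kT=1.0, seed=0):
    """Run force-free BD in 3D and return the final MSD (reduced units)."""
    rng = random.Random(seed)
    start = [(0.0, 0.0, 0.0)] * n_particles
    zero_forces = [(0.0, 0.0, 0.0)] * n_particles  # assumption: no interactions
    positions = list(start)
    for _ in range(n_steps):
        positions = bd_step(positions, zero_forces, diffusion, dt, kT, rng)
    return mean_squared_displacement(start, positions)


if __name__ == "__main__":
    msd = simulate_free_diffusion()
    print("simulated MSD:", msd, "expected about:", 6.0 * 1.0 * 100 * 0.01)
```

In a crowded-cytoplasm model the zero forces would be replaced by steric, electrostatic and other non-specific interaction forces evaluated at every step, which is where most of the computational cost mentioned above arises.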
A widely used framework for modeling biochemical reactions and processes involving many reactants within biological cells is the stochastic reaction-diffusion master equation (RDME) [S5]. Spatial heterogeneity arising from the cellular architecture and stochastic effects due to the small number of some of the reactants play important roles in the quantitative analysis of in vivo biochemical networks. Fluctuations in the cellular species can give rise to cell-to-cell variations in a population and to different cell fates. Given the large size and long time scales of cellular processes, the RDME is commonly solved using simulations that are discretized in both space and time. In this description of the lattice microbe, the physical space is divided into subvolumes and the state of the system is defined by the number of molecules of each reactant species in each subvolume. The reactions within each subvolume are typically solved using Gillespie-like stochastic simulators [S12]. Major challenges to the RDME include keeping the fitted microscopic and macroscopic kinetic reaction rates consistent when the discretization scheme changes [S6], large variations in the diffusion coefficients of the reacting species, irregular boundaries of cellular components, and inclusion of molecular crowding in the RDME scheme [S7-S8]. There have been several different approaches to these challenges. In the recent studies from Luthey-Schulten, the RDME simulations of the lac genetic switch in E. coli were implemented on GPUs so that the diffusion and reaction of the active particles involved in the system of reactions could be followed over a cell cycle of one hour. All other particles contributing to the molecular crowding were treated as obstacles affecting the probabilities for reactions in the subvolumes.
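The subvolume picture described above can be sketched with a small direct-method Gillespie simulation. This is an illustrative toy under stated assumptions, not the GPU implementation used in the cited studies: it uses a one-dimensional lattice with reflecting ends, a single hypothetical reaction A -> B, no crowding obstacles, and made-up rate constants. The function name and all parameters are invented for the example.

```python
import random


def rdme_gillespie(n_sub=10, n_a=200, d_hop=1.0, k_conv=0.1,
                   t_end=5.0, seed=1):
    """Direct-method Gillespie simulation of A -> B with diffusion on a 1D lattice.

    The state is the number of A and B molecules in each subvolume.
    d_hop is the per-molecule hop rate to one neighbouring subvolume
    (D / h**2 for subvolume width h); k_conv is the first-order rate of A -> B.
    """
    rng = random.Random(seed)
    a = [0] * n_sub
    b = [0] * n_sub
    a[0] = n_a  # assumption: all A molecules start in the first subvolume
    t = 0.0
    while True:
        # Enumerate every possible event with its propensity.
        events = []
        for i in range(n_sub):
            for counts in (a, b):
                if counts[i] == 0:
                    continue
                if i > 0:
                    events.append((d_hop * counts[i], "hop", counts, i, i - 1))
                if i < n_sub - 1:
                    events.append((d_hop * counts[i], "hop", counts, i, i + 1))
            if a[i] > 0:
                events.append((k_conv * a[i], "react", a, i, i))
        total = sum(ev[0] for ev in events)
        if total == 0.0:
            break  # nothing left that can happen
        # Waiting time to the next event is exponential with rate `total`.
        dt = rng.expovariate(total)
        if t + dt > t_end:
            break
        t += dt
        # Choose one event with probability proportional to its propensity.
        threshold = rng.random() * total
        running = 0.0
        for ev in events:
            running += ev[0]
            if threshold < running:
                break
        _, kind, counts, src, dst = ev
        if kind == "hop":
            counts[src] -= 1
            counts[dst] += 1
        else:  # A -> B inside subvolume src
            a[src] -= 1
            b[src] += 1
    return a, b


if __name__ == "__main__":
    a_counts, b_counts = rdme_gillespie()
    print("A per subvolume:", a_counts)
    print("B per subvolume:", b_counts)
```

Rebuilding the full event list at every step keeps the sketch short but scales poorly. Whole-cell simulations need far more efficient sampling schemes and parallel hardware, which is one of the algorithmic issues raised above.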
Petzold and co-workers [S13] have suggested new hybrid schemes that handle diffusion and reactions with different algorithms and are more efficient for larger biological systems involving modules that evolve on different time scales. Finally, there is a natural connection between the RDME simulations and the larger cellular networks treated in systems-level studies. This has been done anecdotally for the metabolic/regulatory networks of E. coli and for a kinetic model of the chromatophore in a bacterial photosynthetic system by Geyer and Helms [S14].
By having colleagues from both approaches present at the workshop, we anticipate that a number of important new problems can be addressed. Variation in cell size through division is a challenge to both the RDME and BD approaches, and it is one of the areas where they may be combined through an iterative process in which the arrangements of reactive species and obstacles are updated. Information about non-specific interactions from the BD simulations could be introduced probabilistically into the RDME treatment, but the details and algorithms need to be discussed. How would both approaches treat a mixture of intracellular active and diffusive transport phenomena most efficiently? Parameter estimation of the diffusion coefficients is critical to both approaches. Packing of supercoiled DNA into the nucleoid region of bacterial cells, which may account for 10-15% of the cell volume, is an important component of molecular crowding for both approaches, and there have been a number of important experimental observations that will help guide the choice of models and algorithms. As any new approaches or models should be verified experimentally to establish their quality and predictive power, it will also be important to have a number of leading experimentalists participate in the meeting.
The exact choice of problems to be addressed at the workshop will ultimately depend on the participants, but there are substantial overlapping interests in developing in silico models of biological processes in which a large number of reactants participate and the cell architecture and molecular crowding are more realistically described.
Adrian Elcock (University of Iowa, Iowa City) - Organiser & speaker
Zaida Luthey-Schulten (University of Illinois Urbana Champaign) - Organiser & speaker