E-CAM Scoping Workshop. Solubility prediction
- Carine Michel (CNRS, Ecole Normale Supérieure de Lyon, France)
- Daan Frenkel (University of Cambridge, United Kingdom)
- Eduardo Sanz (Physical Chemistry Department, Chemistry Faculty, University Complutense of Madrid, Spain)
- Robert Docherty (Pfizer Limited, United Kingdom)
It has been reported that over 75% of drug development candidates have low solubility according to the Biopharmaceutics Classification System (BCS). The increasing trend towards low solubility is a major issue for drug development, because formulating low-solubility compounds can be problematic. Despite tremendous efforts, a definitive, accurate and comprehensive approach to predicting solubility has proven elusive. Consequently, there have been a number of attempts to probe changes in solubility as a function of structural changes within specific classes of molecules, as well as systematic approaches that use matched molecular pairs to relate improved solubility to inferred disruption of crystal packing. The focus of this workshop could be on the tools that allow an unprecedented deconstruction of the relative importance of molecular solvation and crystal packing for solubility. Recent work includes a systematic experimental approach that examines key thermodynamic functions, such as sublimation and hydration properties, as a function of structural modifications, and a comprehensive computational approach to lattice energy estimation from molecular descriptors. A recent review has analysed simple predictive methods for estimating aqueous solubility and the specific use of chemical informatics and theory to predict the solubility of drug-like molecules. A recent paper highlights the potential of these approaches and the attempts to build scientific bridges across the two communities. The paper uses co-crystals to optimise the dissolution rate of a psychotropic drug with known dissolution challenges.
Solubility calculations have been carried out using two general approaches:
(1) the thermodynamic approach, which seeks the concentration at which the chemical potential of the electrolyte in solution equals that of the pure solid;
(2) the direct coexistence approach, in which the solution is equilibrated with a solid configuration (typically either a slab or a selected crystal environment) and the electrolyte concentration in the solution phase, sufficiently far from the crystal surface, is taken to be the solubility.
The algorithms for the calculation of solubility will be examined in detail at the workshop. Essentially, the chemical potential of a salt in the solid phase is given by the Gibbs free energy per molecule, which in turn is obtained from the Helmholtz free energy of the solid, estimated using the Einstein model, and the molar volume of the solid at a fixed pressure, which can be determined by performing constant-NpT simulations of the solid at room temperature. The chemical potential of the salt in solution can be calculated from the derivative of the Gibbs free energy of the solution with respect to the number of solute molecules. The Gibbs free energy can be estimated using a coupling parameter method combined with a technique such as MBAR or WHAM. The derivative is calculated numerically by performing a number of simulations at different solute concentrations. The solubility limit is the concentration at which the chemical potentials of the solution and the solid are equal.
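The last two steps of the thermodynamic route (numerical differentiation of the Gibbs free energy and matching against the solid) can be illustrated with a minimal sketch. It assumes that the Gibbs free energy of the solution has already been computed for several solute counts at fixed temperature, pressure and number of water molecules, and that the solid chemical potential is known. All function names, the choice of water as solvent and the numbers are hypothetical and chosen only for illustration; they are not taken from the cited work.

```python
WATER_MOLAR_MASS = 0.018015  # kg/mol; assumes the solvent is water


def solution_chemical_potential(n_solute, gibbs):
    """Finite-difference estimate of mu = dG/dN_solute at fixed T, p and N_water.

    Returns the midpoints between successive solute numbers and the slope of
    G across each interval.
    """
    n_mid, mu = [], []
    for i in range(len(n_solute) - 1):
        dn = n_solute[i + 1] - n_solute[i]
        n_mid.append(0.5 * (n_solute[i] + n_solute[i + 1]))
        mu.append((gibbs[i + 1] - gibbs[i]) / dn)
    return n_mid, mu


def find_solubility(n_mid, mu_solution, mu_solid):
    """Solute number at which mu_solution equals mu_solid (linear interpolation)."""
    diff = [m - mu_solid for m in mu_solution]
    for i in range(len(diff) - 1):
        if diff[i] == 0.0:
            return n_mid[i]
        if diff[i] * diff[i + 1] < 0.0:
            frac = -diff[i] / (diff[i + 1] - diff[i])
            return n_mid[i] + frac * (n_mid[i + 1] - n_mid[i])
    if diff and diff[-1] == 0.0:
        return n_mid[-1]
    raise ValueError("mu_solid is not bracketed by the sampled concentrations")


def molality(n_solute, n_water):
    """Convert a solute count in a box of n_water molecules to mol per kg of water."""
    return n_solute / (n_water * WATER_MOLAR_MASS)


# Made-up numbers in arbitrary energy units, for illustration only.
n_solute = [0, 2, 4, 6, 8, 10]
gibbs = [0.0, -19.0, -36.0, -51.0, -64.0, -75.0]
mu_solid = -7.0

n_mid, mu = solution_chemical_potential(n_solute, gibbs)
n_sat = find_solubility(n_mid, mu, mu_solid)
print(n_sat, molality(n_sat, n_water=270))
```

In practice the free energies carry statistical uncertainty, so a smooth fit of G against solute number before differentiation may be preferable to raw finite differences.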
In the complementary area of structure-activity relationships, we will discuss an automatic model generation process for building QSAR models using Gaussian Processes, a powerful machine learning modeling method. We will examine the stages of the process that ensure models are built and validated within a rigorous framework: descriptor calculation, splitting data into training, validation and test sets, descriptor filtering, application of modeling techniques and selection of the best model. We will explore the effectiveness of the automatic model generation process for two types of data sets commonly encountered in building ADME QSAR models: a small set of in vivo data and a large set of physico-chemical data.
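The stages listed above can be sketched in a few dozen lines of plain Python. This is only a toy illustration under stated assumptions, not the workflow to be presented at the workshop: descriptors are assumed to be precomputed, descriptor filtering is reduced to dropping near-constant columns, the Gaussian Process uses an RBF kernel with a fixed noise term, and "selection of the best model" is a simple choice among a few length scales by validation error. All function names and the synthetic data are hypothetical.

```python
import math
import random


def split_indices(n, fractions=(0.6, 0.2), seed=0):
    """Shuffle indices and split them into training, validation and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train, n_val = round(fractions[0] * n), round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def keep_varying_columns(rows, min_variance=1e-8):
    """Descriptor filtering: indices of columns whose variance exceeds a threshold."""
    keep = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / len(col)
        if sum((v - mean) ** 2 for v in col) / len(col) > min_variance:
            keep.append(j)
    return keep


def rbf(a, b, length):
    """Squared-exponential kernel between two descriptor vectors."""
    return math.exp(-0.5 * sum((x - y) ** 2 for x, y in zip(a, b)) / length ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def gp_fit(X, y, length, noise=1e-2):
    """GP posterior-mean weights: alpha = (K + noise * I)^-1 (y - mean(y))."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b, length) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    return {"X": X, "alpha": solve(K, [v - mean for v in y]), "mean": mean, "length": length}


def gp_predict(model, x):
    """Posterior mean of the GP at a new descriptor vector x."""
    return model["mean"] + sum(
        w * rbf(x, xi, model["length"]) for w, xi in zip(model["alpha"], model["X"]))


def rmse(model, X, y):
    return math.sqrt(sum((gp_predict(model, x) - t) ** 2 for x, t in zip(X, y)) / len(y))


def build_model(rows, targets, lengths=(0.3, 1.0, 3.0), seed=0):
    """Split, filter descriptors on the training set, pick the best length scale."""
    train, val, test = split_indices(len(rows), seed=seed)
    cols = keep_varying_columns([rows[i] for i in train])

    def pick(ids):
        return [[rows[i][j] for j in cols] for i in ids], [targets[i] for i in ids]

    (X_tr, y_tr), (X_va, y_va), (X_te, y_te) = pick(train), pick(val), pick(test)
    best = min((gp_fit(X_tr, y_tr, s) for s in lengths), key=lambda m: rmse(m, X_va, y_va))
    return best, cols, rmse(best, X_te, y_te)


# Synthetic data for illustration only; descriptor 1 is constant and should be dropped.
rng = random.Random(1)
rows = [[rng.uniform(-2, 2), 1.0, rng.uniform(-1, 1)] for _ in range(60)]
targets = [math.sin(r[0]) + 0.5 * r[2] for r in rows]
model, kept, test_rmse = build_model(rows, targets)
print(kept, model["length"], round(test_rmse, 3))
```

The test set is touched only once, after the model has been chosen on the validation set, which is the point of the three-way split. A production workflow would more likely optimise the kernel hyperparameters through the marginal likelihood and use an established library rather than hand-written linear algebra.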
D. Elder and R. Holm, Int. J. Pharm., 453, 3-11 (2013).
D. Elder, R. Holm, R. de Diego, H. Lopez, Int. J. Pharm., 453, 88-100 (2013).
I. Nezbeda, F. Moucka and W. R. Smith, Mol. Phys., 1665-1690 (2016).
J. L. Aragones, E. Sanz and C. Vega, J. Chem. Phys., 136, 244508 (2012).
R. Gozalbes and A. Pineda-Lucena, Bioorg. Med. Chem., 18, 7078-7084 (2010).
J. Shaoxin Feng and Tonglei Li, J. Chem. Theory Comput., 2, 149-156 (2006).