# Workshops

## Thinking outside the box - beyond machine learning for quantum chemistry

October 7, 2019 to October 11, 2019
Location : CECAM-DE-MM1P

### Organisers

• Thomas Frauenheim (University of Bremen, Germany)
• Marcus Elstner (Karlsruhe Institute of Technology, Germany)
• Anatole von Lilienfeld (University of Basel, Switzerland)
• Benjamin Hourahine (University of Strathclyde, United Kingdom)
• Benjamin Nebgen (Los Alamos National Laboratory, USA)

### Description

Machine learning (ML) is already having a rapid and substantial impact at the interfaces between the traditional disciplines of chemistry, physics, biology and materials science. Its ability to use existing examples to make meaningful predictions for new cases offers a new way to screen wide ranges of structures and to estimate the results of highly accurate methods at much reduced cost. However, several issues require careful thought when deploying these tools. Firstly, the reproducibility of model training is a topic of active debate, and within the last year calls for more physically based approaches have begun to appear. Secondly, the explainability and interpretability of the predictions matter, particularly for some of the more powerful ML methods. Finally, there are problems with extending models: learning new cases tends to overwrite existing expertise, and predicting properties and responses outside the scope of the original model is not usually possible.
A counterpoint to these methods is the experience of the past 20 years with approximate quantum mechanical methods [1], which are now an essential computational tool for a solid atomistic understanding of a broad range of physical, chemical and biological problems for both large and challenging systems. These methods are parameterized, but can provide a clear physical understanding of complex structures and processes. Additionally, they can readily be extended to calculate properties and systems outside their original parameterization and fitting sets. However, this commonly comes at the cost of substantial human effort to parameterize and test the models, which offers substantial opportunities for ML.

#### DFTB

The DFTB approach is available as a modular component within other academic and commercial software products, including DFTB+ [19], ADF [20], ATK [21], DeMon [22], Gaussian [23] and Materials Studio [24], and in several MM force-field tools, e.g. CHARMM [25]. This considerably widens the method's reach to potential users in both academic settings and industrial R&D. Overviews of the range of DFTB developments and extensions can be found in the special issues of the Journal of Physical Chemistry A 111, Number 26 (2007) and Physica Status Solidi b 249, Issue 2 (2012).
The most recent DFTB developers' meeting was held in November 2016 to report on and discuss the status of DFTB developments in the different software products, and to join forces for further improvements in accuracy, the parameterization of new systems and extensions of functionality.
#### Trends in Machine Learning

The Journal of Chemical Physics recently devoted a special issue to "Data-enabled theoretical chemistry", which provides a comprehensive contemporary view of the field with over 40 contributions from leading scientists actively working on the integration of modern machine learning techniques into quantum chemistry [26]. The issue was motivated by preceding successes in the field, such as the systematic fitting of potential energies for molecular dynamics simulations or vibrational spectroscopy [27,28]. As also reviewed recently [29], laws of physics have been rediscovered with ML [30], atomization energies and other electronic ground-state properties of organic molecules can now be predicted with hybrid DFT accuracy [31], and clusters can be identified [32] and compounds mapped [33]. ML can also be used to discover new molecules [34], crystals [35], and even new reactions [36]. Various properties and systems have been studied with ML, including electrons [37], chemical potentials [38], ionic forces [39], and NMR shifts [40]. By now, neural networks and Gaussian processes have demonstrably surpassed DFT accuracy in the prediction of electronic ground-state properties of organic materials [41]. Efforts to further improve and assess ML models for application throughout compositional space are ongoing [42]. When it comes to improving well-established QM methods, however, ML-based investigations, such as those in Refs. [43], are sparse.

### References

[1] H. M. Senn and W. Thiel, Top. Curr. Chem. 268 (2007) 173.
[2] M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, Phys. Rev. B, 58 (1998) 7260.
[3] Q. Cui, M. Elstner, T. Frauenheim, M. Karplus et al., J. Phys. Chem. B 105 (2001) 569.
[4] A. Dominguez, B. Aradi, T. Frauenheim, V. Lutsker, T. A. Niehaus, J. Chem. Theory Comput. 9 (2013) 4901.
[5] C. Koehler, G. Seifert, T. Frauenheim, Chem. Phys. 309 (2005) 23.
[6] C. Köhler, Th. Frauenheim, B. Hourahine et al., J. Phys. Chem. A 111 (2007) 5622.
[7] T. A. Niehaus, S. Suhai, F. Della Sala, P. Lugli et al., Phys. Rev. B, 63 (2001) 085108.
[8] T. A. Niehaus, J. Mol. Str. THEOCHEM, 914 (2009) 38.
[9] M. Elstner, P. Hobza, T. Frauenheim et al., J. Chem. Phys., 114 (2001) 5149.
[10] B. Hourahine, S. Sanna, B. Aradi, C. Koehler, T.A. Niehaus, T. Frauenheim, J. Phys. Chem. A 111 (2007) 5671.
[11] G. Hou, X. Zhu and Q. Cui, J. Chem. Theory Comput. 6 (2010) 2303.
[12] J. Reimers, G. Solomon, A. Gagliardi et al., J. Phys. Chem. A 111 (2007) 5692.
[13] A. Dominguez, B. Aradi, T. Frauenheim, V. Lutsker, T. A. Niehaus, J. Chem. Theory Comput. 9 (2013) 4901.
[14] M. Wahiduzzaman, A. F. Oliveira, P. Philipsen, L. Zhechkov, E. van Lenthe, H. A. Witek, T. Heine. J. Chem. Theory Comput. 9 (2013) 4006.
[15] J. M. Knaup, B. Hourahine and Th. Frauenheim J. Phys. Chem. A 111 (2007) 5637; M. Gaus, C.-P. Chou, H. Witek, M. Elstner J. Phys. Chem. A 113 (2009) 11866; Z. Bodrog, B. Aradi and T. Frauenheim J. Chem. Theory Comput. 7 (2011) 2654; M. Doemer, E. Liberatore, J. M. Knaup, I. Tavernelli, U. Rothlisberger Molecular Physics 111 (2013) 3595; M. P. Lourenço, M. C. da Silva, A. F. Oliveira, M. C. Quintão, H. A. Duarte Theoretical Chem. Accounts 135 (2016) 11; C.-P. Chou, Y. Nishimura, C.-C. Fan, G. Mazur, S. Irle, H. A. Witek J. Chem. Theory Comput. 12 (2016) 53.
[16] J. J. Kranz, M. Kubillus, R. Ramakrishnan, O. A. von Lilienfeld, and M. Elstner J. Chem. Theory Comput. 14 (2018) 2341.
[17] A. W. Huran, C. Steigemann, T. Frauenheim, B. Aradi, and M. A. L. Marques J. Chem. Theory Comput. 14 (2018) 2947.
[18] L. Shen and W. Yang J. Chem. Theory Comput. 14 (2018) 1442.
[19] https://www.dftb.org
[21] http://www.quantumwise.com/documents/tutorials/ATK-11.8/DFTB/index.html/
[22] http://demon-nano.ups-tlse.fr/
[23] http://www.gaussian.com/g_tech/g_ur/k_dftb.htm
[24] http://accelrys.com/products/materials-studio/quantum-and-catalysis-software.html
[25] http://www.charmm.org/documentation/c37b1/sccdftb.html
[26] J. Chem. Phys., volume 148, issue 24 (2018).
[27] J. Behler and M. Parrinello, Phys. Rev. Lett. 98 (2007) 146401.
[28] A. P. Bartok, M. C. Payne, R. Kondor, and G. Csanyi, Phys. Rev. Lett. 104 (2010) 136403.
[29] O. A. von Lilienfeld, Angew. Chem. Int. Ed. 57 (2018) 4164.
[30] M. Schmidt and H. Lipson, Science 324 (2009) 81.
[31] M. Rupp, A. Tkatchenko, K.-R. Mueller, and O. A. von Lilienfeld, Phys. Rev. Lett. 108 (2012) 058301; G. Montavon, M. Rupp, V. Gobre, A. Vazquez-Mayagoitia, K. Hansen, A. Tkatchenko, K-R. Mueller, O. A. von Lilienfeld, New J. Phys. 15 (2013) 095003.
[32] A. Rodriguez and A. Laio, Science 344 (2014) 1492.
[33] S. De, F. Musil, T. Ingram, C. Baldauf, and M. Ceriotti, J. Cheminf. 9 (2017) 6.
[34] E. O. Pyzer-Knapp, K. Li, and A. Aspuru-Guzik, Adv. Fun. Mat. 25 (2015) 6495.
[35] F. A. Faber, A. Lindmaa, O. A. von Lilienfeld, and R. Armiento, Phys. Rev. Lett. 117 (2016) 135502.
[36] P. Raccuglia, K. C. Elbert, P. D. F. Adler, C. Falk, M. B. Wenny, A. Mollo, M. Zeller, S. A. Friedler, J. Schrier, and A. J. Norquist, Nature 533 (2016) 73.
[37] G. Carleo and M. Troyer, Science 355 (2017) 602.
[38] K. T. Schutt, F. Arbabzadah, S. Chmiela, K. R. Muller, and A. Tkatchenko, Nat. Commun. 8 (2017) 13890.
[39] S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schutt, and K.-R. Muller,