Open Databases Integration for Materials Design
CECAM-HQ EPFL, Lausanne, Switzerland
Designing bespoke materials for specific applications is a long, complex, and costly process. Researchers propose materials based on intuition and experience. Their synthesis and evaluation require a tremendous amount of trial and error. It can take months to test a single new material, and most often the outcome is negative. In a speech on June 24, 2011, President Obama announced the Materials Genome Initiative as a major US research priority to foster innovation more effectively.
In the last few years, there has been a major change in materials design. Thanks to the exponential growth of computer power and the development of robust first-principles electronic structure codes, it has become possible to perform large sets of calculations automatically. This is the flourishing field of high-throughput (HT) ab initio computations. The concept, though simple, is very powerful. HT calculations are used to create large databases (DBs) containing the calculated properties of existing and hypothetical materials. These DBs can then be intelligently interrogated, searching for materials with desired properties and so removing the guesswork from materials design. In this framework, various open-domain DBs are available online:
the AFLOW distributed materials property repository: http://aflowlib.org
the ChemDataExtractor in Cambridge: http://chemdataextractor.org
the Harvard Clean Energy Project Database: http://molecularspace.org
the Materials Cloud: http://materialscloud.org
the Materials Project: http://materialsproject.org
the NOMAD (Novel Materials Discovery) Archive: https://metainfo.nomad-coe.eu/
the Open Quantum Materials Database: http://oqmd.org
the Computational Materials Repository: http://cmr.fysik.dtu.dk
the Data Catalyst Genome: http://suncat.stanford.edu
the Open Materials Database: http://openmaterialsdb.se
the Theoretical Crystallography Open Database: http://www.crystallography.net/tcod
Databases are also being built in the experimental community. Some of these DBs are open (e.g. at the National Institute of Standards and Technology) while others are sold by private companies.
The current materials data landscape is quite fragmented. For some of these DBs, a Representational State Transfer (REST) Application Programming Interface (API) is available to interrogate the DB through scripts (though it is not always documented). So far, however, it is only possible to interrogate one DB at a time, and the APIs vary from one DB to another. Furthermore, the lack of data standards for materials makes it harder to gain insights from large-scale materials data. Flexible, uniform, computer-readable data standards should be established to enable data to be shared and systematically mined.
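To make the fragmentation problem concrete, the following minimal Python sketch shows what a script has to do today when two DBs expose the same property under different field names and units. Everything in it is an assumption made for illustration: the two response layouts (`DB_A_RESPONSE`, `DB_B_RESPONSE`), the field names, the placeholder formulas and values, and the common `MaterialRecord` schema are all hypothetical and do not describe the API of any DB listed above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MaterialRecord:
    """Hypothetical common schema shared by all sources."""
    source: str
    formula: str
    band_gap_ev: float


# Hypothetical payloads, as two different REST APIs might return them.
# Formulas and numbers are placeholders, not real materials data.
DB_A_RESPONSE = [
    {"compound": "X2Y", "Egap": 1.2},   # band gap already in eV
    {"compound": "XZ", "Egap": 0.0},
]
DB_B_RESPONSE = [
    {"formula": "Q3R", "properties": {"band_gap": {"value": 1500.0, "unit": "meV"}}},
    {"formula": "QR2", "properties": {"band_gap": {"value": 3.4, "unit": "eV"}}},
]

# Conversion factors to eV for the units used in the hypothetical payloads.
TO_EV = {"eV": 1.0, "meV": 1.0e-3}


def from_db_a(entry: dict) -> MaterialRecord:
    """Adapter for the flat layout of hypothetical DB A."""
    return MaterialRecord("db_a", entry["compound"], float(entry["Egap"]))


def from_db_b(entry: dict) -> MaterialRecord:
    """Adapter for the nested layout of hypothetical DB B, with unit conversion."""
    gap = entry["properties"]["band_gap"]
    return MaterialRecord("db_b", entry["formula"], gap["value"] * TO_EV[gap["unit"]])


def search(records: Iterable[MaterialRecord], min_gap: float, max_gap: float) -> List[MaterialRecord]:
    """Return records whose band gap lies in [min_gap, max_gap], sorted by gap."""
    hits = [r for r in records if min_gap <= r.band_gap_ev <= max_gap]
    return sorted(hits, key=lambda r: r.band_gap_ev)


# One adapter per DB is needed before a single query can span both sources.
merged = [from_db_a(e) for e in DB_A_RESPONSE] + [from_db_b(e) for e in DB_B_RESPONSE]

for record in search(merged, min_gap=1.0, max_gap=2.0):
    print(record.source, record.formula, record.band_gap_ev)
```

Each additional DB requires another hand-written adapter of this kind. With a shared, computer-readable data standard, the per-DB adapters would no longer be needed and only the `search` step would remain.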
Gian-Marco Rignanese (Université catholique de Louvain) - Organiser
Markus Scheidgen (Humboldt-Universität zu Berlin) - Organiser
Saulius Gražulis (Vilnius University Life Science Center Institute of Biotechnology) - Organiser
Rickard Armiento (Linköping University) - Organiser
Giovanni Pizzi (EPFL) - Organiser
Gareth Conduit (University of Cambridge) - Organiser
Cormac Toher (Duke University) - Organiser