Basic techniques and tools for development and maintenance of atomic-scale software
- Francesco Sottile (Ecole Polytechnique, Palaiseau, France)
- Yann Pouillon (Universidad de Cantabria, Spain)
- Micael Oliveira (Max Planck Institute for the Structure and Dynamics of Matter, Hamburg, Germany)
- Damien Caliste (Alternative Energies and Atomic Energy Commission (CEA), France)
- Matthieu Verstraete (Universite de Liege, Belgium)
Software illiteracy is currently widespread among scientists, including young researchers, and has become a serious problem for the future of research, even if its effects are not yet fully visible. Scaling well enough to exploit the full power of the most recent supercomputers requires a programming style very different from the one practised so far. In turn, this new programming style requires much higher skills than those commonly available in scientific research groups at present. Large-scale dissemination on this matter has thus become urgent. Without several rapid, global and well-coordinated actions, the risk is that many software packages, despite their high scientific quality, will simply no longer be able to run on cutting-edge computational infrastructures within 3 to 5 years, which would damage research at many levels. Furthermore, all newly developed software should take this situation into account from the earliest implementation stages, which is still far from being the case. Most scientists do not even have the technical basis to meet the challenges the near future will bring in this area, and do not know where to start. The first step is thus to provide them with such a basis.
The main objective of this tutorial is to improve the participants' ability to produce software that:
- is suitable for a long life cycle;
- is adapted to team development;
- features an acceptable execution speed;
- is portable;
- is re-usable;
- interoperates well with other atomic-scale software;
- performs correctly from a scientific point of view.
Despite constant progress over the last decade, many students, post-docs and researchers with a scientific background find it very difficult to start developing software. Most of the time, they can only ask the people around them for guidance, with very limited success and efficiency. To address this issue, this tutorial provides an overview of the basic concepts and tools governing scientific software development, as well as a clear step-by-step procedure for implementing increasingly complex aspects. Our approach is generic and suits the various programming languages developers may use (mainly Fortran, C/C++ and Python). This tutorial is applicable to any kind of atomic-scale software and even beyond. All the example software is available under a free software licence, so that the participants can immediately use what they have learned once back in their research institutions.
Each day of the tutorial is dedicated to a specific topic, with complexity increasing as the week goes by. Each topic is addressed through a mix of:
- overviews of basic concepts and general information;
- presentations of the implementation of these concepts in different atomic-scale software packages;
- introductions to related specific tools, with examples;
- detailed presentations of the structure and organisation of selected libraries and tools;
- prepared hands-on exercises;
- questions & answers (Q&A) sessions;
- working sessions, during which the students work on self-defined projects.
Each day starts with a Q&A session about the topic of the previous day, giving the students at least one night to absorb and think about what they have learned. The morning is then dedicated to presentations, going from general and theoretical information to specific concepts and practical aspects. We make sure that the students have sufficient time during the coffee breaks and at lunch time to discuss informally what they are learning. The afternoons are dedicated to hands-on sessions, where the students can practise and face the practical and concrete aspects of what they are learning. The day finishes at a reasonable time, so that the students can attend to other business, rest and further absorb the concepts. The very last slot of the tutorial is dedicated to general and cross-topic questions, in order to give the students the maximum opportunity to clarify all the important concepts.
This is intended to be a five-day tutorial, with 9 half-day sessions, starting Monday morning and finishing Friday at noon. It is intended for an audience of around 30 scientists involved in software development for atomic-scale simulations. The following topics will be covered:
(1) Basic concepts of software maintenance
(concepts [1-4]; coding rules, ROBODOC; Autotools framework);
(2) Version management tools
(concepts; bzr and svn);
(3) Code re-use. Libraries.
(concepts; NetCDF, ETSF_IO, LibXC);
(4) File formats. Conversion tools.
(concepts; NetCDF, ETSF FileFormat, ETSF_IO, XML);
(5) Scripting. Introduction to Python;
(6) Debugging, profiling, optimizing.
(concepts; idb and gdb);
(concepts; buildbot).
P. Grubb & A.A. Takang, Software maintenance. Concepts and practice, 2nd ed., World Scientific, London (2003)
Frederick P. Brooks, Jr., The mythical man-month. Essays on software engineering, Anniversary edition, Addison-Wesley (1995)
 http://www.intel.com/software/products/compilers/docs/linux/idb_manual_l.html and http://sourceware.org/gdb/download/onlinedocs/gdb.html