Technical Debt in Scientific Research Software

I take a phrase from Technical Debt: From Metaphor to Theory and Practice, “hinders future development”, as my primary focus for technical debt []. Under this definition, the failure to deliver a promised feature constitutes technical debt only if that failure puts the future of the project in jeopardy. The definition does allow undiscovered problems to contribute to the total debt. Other issues, such as bad coding practices, likewise constitute technical debt only if the effort to repair them later threatens the future viability of the project.
The scope of my interest includes the entire ecosystem in which a project is situated. Most projects not only produce technical debt; they are also influenced in some way by the technical debt associated with the assets they consume. A socio-technical ecosystem includes a comprehensive supply network, which may be the source of significant technical debt; the consumers of the products, who may be significantly impacted by technical debt; and even competitors, who may force a product release earlier than anticipated. In the present case, the competitors are research groups attacking the same problems and competing for research funds from the same sources.
We have spent the last two years exploring the software ecosystems created around collaborative scientific research projects [ ] []. Most of these projects are based in a research group at one university, which manages a set of collaborators at other organizations. These projects accumulate a significant amount of technical debt due to ineffective planning, limited knowledge of software development, and a lack of quality assurance. Even the very largest scientific projects, such as the Large Hadron Collider at CERN and projects at national laboratories that hire professional programmers, still accumulate technical debt [].
Some technical debt is to be expected in a research project, since there is a large amount of uncertainty associated with planning and carrying out research activities. Time-boxed schedules are widely used in industry, and most agile sprints are time-boxed. Function-boxed schedules offer less control over timing but are widely used in research projects, since it is easier to identify the functions that need to be computed than to estimate how long they will take to build. Neither approach reduces the uncertainty in how long it will take to produce a product, but time-boxed schedules do provide a more constrained environment in which progress is assessed regularly and work in progress is easily identified.
Briefly, here are several factors we have observed that appear to influence the amount of technical debt in an ecosystem:
• Much scientific software is developed by people trained in scientific disciplines but not in software development. These developers are often also focused on a very small piece of the total system.
• Architecture debt results from the perspective that an experiment must be well designed, whereas software that is cobbled together and works for one-time use is good enough.
• Work is managed as projects, with project managers (graduate students) who have short-term responsibility, rather than by product managers responsible for the long-term viability of the product.
• These collaborative scientific research projects usually have informal governance structures among the collaborating groups. Decisions are reached informally by consensus of the entire group or the affected sub-group, and decisions often change rapidly or are ignored.
• Mistakes in scientific software often go undiscovered until the computational results and the real world become sufficiently inconsistent to demand attention.
To be continued…