Community-Based Repository of Tools to Support Empirical Research

As with much of the rest of software engineering, the current state of the art in research on architectural technical debt, and on software architecture more broadly, is impeded by a myriad of disjoint research and development environments. The resulting “one off” solutions inhibit further advances, making it difficult to systematically synthesize novel research techniques on top of existing ones and to cross-validate those techniques. As a result, researchers and practitioners needing to build cutting-edge architecture-based tools must often create their building blocks (e.g., software components and frameworks) from scratch. In doing so, they tend to unnecessarily repeat each other’s efforts and even to revert to solutions that have already been tried and established as ineffective.

We have identified the following five key challenges faced by the software engineering community when conducting research in the areas of technical debt and software architecture:

1 – MTD Tool Accessibility and Reusability of Research Techniques. Implementations of research techniques and tools are often not easily accessible, unavailable, defective, or no longer supported by their original creators. Even tools that do work frequently do not operate as advertised, requiring major effort to adapt them for further MTD research.

2 – Lack of Benchmarks and Datasets. Access to and construction of public artifacts, case studies, and benchmark datasets are challenges shared across the field of software engineering. For research in the area of technical debt, these challenges are particularly pronounced because many of the factors contributing to technical debt (e.g., design decisions) tend to be undocumented. For instance, software architecture artifacts embody significant amounts of expert knowledge. In practice, developing such artifacts is expensive, organizations are reluctant to share them, and monopolizing architectural knowledge may create a perception of job security, further disincentivizing the construction and maintenance of such artifacts. As a result, the research community often relies on small datasets that lack the domain- or application-specific knowledge needed to accurately reconstruct most factors contributing to or impacting technical debt. It is thus necessary to develop an instrument that can aid in the generation, storage, and sharing of such artifacts by reverse engineering existing large-scale open-source systems. In earlier work, we examined the practicality of using various reverse-engineering approaches to obtain ground-truth architectures to partly address this challenge.

3 – Interoperability of Tools. Technical Debt research is hampered by distributed research environments and stove-piped solutions emerging from different research groups. This, in turn, inhibits research advances, makes it difficult to synthesize techniques and tools in new and exciting ways, and complicates comparisons of research solutions. Researchers and practitioners in need of cutting-edge technical debt analysis must often recreate tools or their major elements, including basic code analysis, reverse-engineering functions, and frameworks. Furthermore, different assumptions that these tools make (e.g., about the execution environments, formats used, implementation languages, etc.) prevent their combined use, further inhibiting breakthroughs.

4 – Reproducibility of Experiments and Analyses. Due to inaccessible, non-reusable, or defective tools, datasets, and case studies, and incompatible underlying tool assumptions, it is difficult to reproduce the results of many previous software architecture-oriented and technical debt research studies. In software engineering research, reproducibility is often rendered too difficult or impossible, even for studies designed to be repeatable.

5 – Technology Transfer. Although practitioners understand, appreciate, and emphasize the criticality of software architecture and of managing technical debt to the success of software systems, technology transfer in this area is hindered by the fact that most prototype tools are not sufficiently mature to support production-level or industrial usage. The overwhelming majority of software-engineering research groups lack the resources (e.g., personnel and hardware) needed to build tools that are robust and scalable enough to be easily and effectively used by other researchers, let alone by industry-grade software projects.

I would like to acknowledge my collaborators on this project, Nenad Medvidovic, Sam Malek, and Josh Garcia, for their contributions in summarizing these community-based challenges and formulating solutions.

Questions that are quite systematically raised about TD

During my consulting engagements within large organizations, I meet senior managers and discuss the topic of technical debt with them. I would like to share with you:
• The questions that are quite systematically raised by my contacts
• My current answers to their questions
• My suggestions about what needs to be done to make progress

Question 1:
Is technical debt just a new fad that will pass in a few years?

My Current answer:
No, the concept of technical debt is a true paradigm shift. Once you have adopted this measurement concept, you won’t go back to traditional measurement systems for code quality. This is due to at least two reasons that are purely mathematical.
The first reason is that technical debt is measured on a ratio scale (principal and interest are unbounded numbers: a file with X×10 days of debt has X times more debt than a file with 10 days of debt).
Almost all code quality measurement systems that the community has proposed during the last 30 years produced measures or indexes on bounded intervals such as [1 to 5], [0 to 10], or [0 to 100] (I remember the MI3 and MI4 maintainability indices). Such measurement systems have representation issues and should be replaced by ratio-scale measures. In fact, if we leave the software world and look at the measures we have used every day for centuries (weight, distance, area, …), they are all ratio-scale measures.
The second reason is that technical debt is aggregated by addition. This is the only aggregation rule compatible with a representational measurement system (meaning that, if the analysis tools are accurate, the system will not produce false positives when aggregating file-level measures to build a module- or application-level measure).
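To make the aggregation point concrete, here is a minimal sketch (with invented file names and debt values, not taken from this article) contrasting additive aggregation of a ratio-scale debt measure with the averaging that a bounded quality index forces on you:

```python
# Hypothetical file-level debt measures, in person-days (all numbers invented).
files = {"parser.py": 2.0, "core.py": 10.0, "util.py": 0.5}

# Ratio scale: module debt is simply the sum of file debts, and ratios are
# meaningful (core.py carries 5x the debt of parser.py).
module_debt = sum(files.values())                    # 12.5 person-days
debt_ratio = files["core.py"] / files["parser.py"]   # 5.0

# Bounded index (e.g., a 0-100 "quality score"): summing is meaningless, so
# tools average instead, which lets one very bad file hide behind good ones.
scores = {"parser.py": 90.0, "core.py": 20.0, "util.py": 95.0}
module_score = sum(scores.values()) / len(scores)    # ~68.3, masking core.py
```

The averaged score suggests a moderately healthy module even though one file is in very poor shape, whereas the additive debt total faithfully carries every file’s contribution up to the module level.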
These are admittedly brief explanations of the two reasons. If you want more details, I invite you to read an article I published in 2010, which covers measurement theory applied to code measures in more depth: Valid-2010 conference paper, downloadable here.

What needs to be done to make progress about the topic raised by this question?
I personally think that this paradigm shift is a huge step forward, a major advance toward the systematic measurement of code quality. I also think that the TD expert community does not communicate enough about this major breakthrough achieved by the concept of technical debt.

Technical Debt Conceptual Model

We, the technical debt research community, agree that a common conceptual model of technical debt, one that we collectively improve and validate, would increase the pace of technical debt research. Therefore, as organizers we felt it was important to tease this apart together during the workshop. Early conceptual models offered by Martin Fowler (the debt quadrants) and Steve McConnell (intentional versus unintentional debt) provided useful starting points, but they do not suffice to guide answering the hard questions of eliciting, quantifying, and reducing debt, and of transitioning validated, easy-to-adopt practices to developers.

Different technical debt enthusiasts refer to this semantic model in different ways: “technical debt framework”, “technical debt landscape”, “conceptual model”, “empirical model”, “financial model”, “quality model”, “measurement model”. The concepts discussed in these models are not consistent either. Is design debt the same as architectural debt? If defects are not technical debt, what are postponed defects? Does the principal of the debt map to all code quality violations? Does the principal change? What are the attributes of interest?

The underlying goal of all these models is common: to guide the definition of technical debt concepts and the creation of methods to control the inputs and outputs for managing it. Several blog posts here already refer to the conceptual model. In addition, several papers have already been published that can help shape a strawman conceptual model of technical debt. We compiled a reading list to help us all prepare for the sessions during the workshop in which we will discuss the conceptual model.

We believe that a baseline model will help the technical debt community make collective progress rather than coming up with yet another model variation. The reading list is meant to be representative rather than all-inclusive. If we have skipped a fundamental work that should be included, comment and we will add it.

All the papers referred to are here: Ipek TD papers (in a zip file).

Systematic literature reviews and technical debt landscape

Chen Yang, Peng Liang, Paris Avgeriou:
A systematic mapping study on the combination of software architecture and agile development. Journal of Systems and Software 111: 157-184 (2016)

Areti Ampatzoglou, Apostolos Ampatzoglou, Alexander Chatzigeorgiou, Paris Avgeriou:
The financial aspect of managing technical debt: A systematic literature review. Information & Software Technology 64: 52-73 (2015)

Zengyang Li, Paris Avgeriou, Peng Liang:
A systematic mapping study on technical debt and its management. Journal of Systems and Software 101: 193-220 (2015)

Edith Tom, Aybüke Aurum, Richard Vidgen:
An exploration of technical debt. Journal of Systems and Software 86(6): 1498-1516 (2013)

Nicolli S. R. Alves, Thiago Souto Mendes, Manoel Gomes de Mendonça Neto, Rodrigo O. Spínola, Forrest Shull, Carolyn B. Seaman:
Identification and management of technical debt: A systematic mapping study. Information & Software Technology 70: 100-121 (2016)

Clemente Izurieta, Antonio Vetro, Nico Zazworka, Yuanfang Cai, Carolyn B. Seaman, Forrest Shull:
Organizing the technical debt landscape. MTD@ICSE 2012: 23-26

Philippe Kruchten, Robert L. Nord, Ipek Ozkaya:
Technical Debt: From Metaphor to Theory and Practice. IEEE Software 29(6): 18-21 (2012)

Comparative studies on debt identification:

Nico Zazworka, Antonio Vetro, Clemente Izurieta, Sunny Wong, Yuanfang Cai, Carolyn B. Seaman, Forrest Shull: Comparing four approaches for technical debt identification. Software Quality Journal 22(3): 403-426 (2014)

Griffith I., Reimanis D., Izurieta C., Codabux Z., Deo A., Williams B., “The Correspondence between Software Quality Models and Technical Debt Estimation Approaches,” 6th International Workshop on Managing Technical Debt (MTD 2014), in association with the 30th International Conference on Software Maintenance and Evolution (ICSME), Victoria, British Columbia, Canada, September 30, 2014.

Case Studies:

Griffith I., Izurieta C., Taffahi H., Claudio D., “A Simulation Study of Practical Methods for Technical Debt Management in Agile Software Development,” Winter Simulation Conference WSC 2014, Savannah, GA, December 7-10, 2014.

Antonio Martini, Lars Pareto, Jan Bosch:
A multiple case study on the inter-group interaction speed in large, embedded software companies employing agile. Journal of Software: Evolution and Process 28(1): 4-26 (2016)

Ariadi Nugroho, Joost Visser, and Tobias Kuipers. 2011. An empirical model of technical debt and interest. In Proceedings of the 2nd Workshop on Managing Technical Debt (MTD ’11). ACM, New York, NY, USA, 1-8.

On the Interplay of Technical Debt and Legacy

For each instance of technical debt, the identification, assessment, and optimal route of governance between short- and long-term yields is unique [1]. Commonalities do, however, exist. One of these is the way in which technical debt is accumulated for a project. McConnell [2] identified intended (i.e., strategic) and unintended (i.e., accidental) accumulation, which describe two variations of the immediate situation in which technical debt is accumulated. Arguably, however, there is also a third way: delayed accumulation.

All software products are static. That is, after they have been developed, and prior to being developed again, they remain in the exact same state (formalized for technical debt by Schmid in [3]). The environment around them, however, is dynamic. Technologies, people, organizational structures, and processes change, and all of these (and many others) can be seen to have a link back to the static software product. As an explicit example, consider continued updates to a technology used to implement a software product: the product, for which development has stopped, no longer conforms to the latest version of the technology, and it becomes detached from the environment’s assumptions, as they are no longer delivered via the technology’s updates. Hence, when development of this software product is resumed, we note that it has accumulated technical debt in a delayed fashion, as current assumptions no longer apply to it.

From the management perspective, there is a considerable difference between immediate and delayed accumulation. Immediate accumulation is affected mainly by matters that reside within the producing organization and its project. We may look into altering strategies and implementing new processes to affect the management of immediate, intended technical debt. Management of immediate, unintended debt is often more indirect. For example, implementation and design quality issues often arise from practitioners having communication issues or being unaware of all applicable best practices. In these scenarios, exercises to enhance social cohesion and focused training, respectively, can be utilized.

As per the previous description of delayed technical debt accumulation, its management is not limited to the producing organization. Rather, the whole environment affects it; something that is impossible to subject to management and must be accepted as a future source of problems. Within the fault management domain, efforts in this area are categorized under fault tolerance. Within the software development and maintenance domain, this is arguably very close to legacy software management.

Legacy software has a variety of definitions, but it generally captures software artifacts that cannot be subjected to the same maintenance and management efforts as newly created artifacts [4]. In practice, these are often implementation artifacts that are old, undocumented, and/or untested, and for which the original developer is no longer available: either a new team has taken over in the organization, or the implementation has been acquired from somewhere else. If we consider delayed technical debt accumulation to capture the suboptimalities that emerge as the environment progresses around a static software product, we can argue that legacy software is a very close match to it.

In an effort to shed light on the accumulation and composition of technical debt that software organizations face today, and especially to further probe the close relation between delayed technical debt accumulation and legacy software, we conducted a practitioner survey. The survey was administered as a web-based questionnaire in Brazil, Finland, and New Zealand. We captured a total of 184 responses from a diverse set of respondents using both agile and traditional development methods, in which the practitioners assumed several roles ranging from developers to managers and client representatives. We have discussed the results of the Finnish survey in more detail before [5], while a forthcoming article reviews the multi-national results. The multi-national set captured 69 descriptions of concrete technical debt instances. Let us look at the distribution captured for the instances’ origins.


Figure 1: Origins of technical debt instances (N=78 as multiple origins were indicated for some instances)

We see from Figure 1 that over 75% of the captured technical debt instances have indicated origins in software legacy. While acknowledging that a number of limitations affect, for example, the generalizability of the results, we would like to use this distribution as a basis for discussing the interplay of technical debt and legacy further.

It is evident that there is a strong connection between technical debt and legacy, as most technical debt instances are affected by it. Hence, technical debt could benefit from the integration of legacy software management procedures, as this field has a very established status. However, there are matters that should be explored when legacy software methods are integrated into technical debt management. Firstly, is legacy software, given the close similarity, only a component of delayed technical debt accumulation? Arguably not, as the overall current state of the software product, to which legacy can be counted, has an effect on technical debt accumulation and management [6].

Secondly, legacy software is generally a negative term for “derelict code”, while technical debt implies pursuing asset management functions for suboptimalities in varying software artifacts. Reviewing Figure 1, one identifies a potential danger in legacy being re-branded under the more favorable technical debt concept. Here, the asset management possibilities are left unexplored if legacy is not diligently converted into technical debt instances that enable full management. Given the unobtrusive nature of legacy software, this is not an easy task, and having technical debt instances with varying levels of accuracy is bound to deteriorate technical debt management efforts overall.

Either way, as per the limited view provided by our survey, legacy is a very close companion of technical debt, and we should pursue narrowing the gap between these fields. While total control over technical debt is extremely challenging to attain, the insight that technical debt management is key to sustainable and efficient software development should still motivate us to pursue it.

[1] N. Brown, Y. Cai, Y. Guo, R. Kazman, M. Kim, P. Kruchten, E. Lim, A. MacCormack, R. Nord, I. Ozkaya et al., “Managing technical debt in software-reliant systems,” in Proceedings of the FSE/SDP Workshop on Future of Software Engineering Research. ACM, 2010, pp. 47–52.
[2] S. McConnell, “Technical debt,” 10x Software Development Blog, Construx Conversations, Nov. 2007. URL: http://blogs.construx.com/blogs/stevemcc/archive/2007/11/01/technical-debt-2.aspx
[3] K. Schmid, “A formal approach to technical debt decision making,” in Proceedings of the 9th International ACM SIGSOFT Conference on Quality of Software Architectures. ACM, 2013, pp. 153–162.
[4] M. Feathers, Working effectively with legacy code. Prentice Hall, 2004.
[5] J. Holvitie, V. Leppänen, and S. Hyrynsalmi, “Technical debt and the effect of agile software development practices on it: an industry practitioner survey,” in Sixth International Workshop on Managing Technical Debt. IEEE, 2014, pp. 35–42.
[6] A. Nugroho, J. Visser, and T. Kuipers, “An empirical model of technical debt and interest,” in Proceedings of the 2nd Workshop on Managing Technical Debt. ACM, 2011, pp. 1–8.

Does Principal Grow?

Understanding the technical debt metaphor (1) means accepting that, when we deliver software, there is inherently some amount of technical debt in the code that hinders ongoing evolution and maintenance activity. At a basic level of understanding, one is tempted to consider the instances of technical debt in a software product as static entities that do not change as the software evolves. However, as software undergoes maintenance and development activity, prior decisions to take on technical debt can lead to further accumulation of debt.

Consider an example of a subject application that has a dependency on a specific version of third party software. Developers code the application using the API library of that software. After the subject application is released, the third party software releases a new version. The developers decide to stay with the old version of that software rather than upgrade because upgrading changes the API. This is the first instance of incurring technical debt. Each release of the subject application builds further capability using the old API. The third party software also has further new releases which the team decides to ignore each time. Thus, more technical debt principal is accumulated whenever the decision to defer an upgrade is made. This combined with the interest cost of working with the old libraries creates a multiplicative effect on technical debt making the cost to pay the debt climb higher than the original decision ever planned for.

A similar example is provided by Nugroho et al. (2), who estimate that debt increases as the size of the software increases during maintenance efforts. They estimate software growth in their model as a percentage of the effort applied to maintenance. The growth is modeled in an equation that results in linear growth of the debt during maintenance. While this may be the case, we should also consider other causes of debt growth, such as the dependency on antiquated software proposed here. Zazworka et al. (3) discuss technical debt interest payments in terms of additional defects and changes for code with increased technical debt. These are factors that contribute to interest payments, but we should also consider factors that increase the principal.

Considering the idea of increasing principal raises the question of what the growth rate of the principal might be and what the contributors to principal growth are. Decisions to put off upgrading, to put off paying technical debt, and to delay fixing defects all result in an increased cost of maintenance. In the first case outlined here, each decision not to upgrade the library results in more future work to migrate to the current version. Decisions to put off paying technical debt and to delay fixing defects are related in that developers will take extra time (interest) to work around the debt or defect when making future changes, and they will accumulate more debt in the workaround code that must be modified when the original debt is paid down at a future date. Could the growth function become exponential due to decisions constantly made to work around existing instances of technical debt? Determining the possible growth rates in this type of example and in others is a proposed topic of future study.
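The deferred-upgrade scenario can be sketched as a toy simulation that contrasts purely linear accumulation with the compounding effect of workarounds. All costs and rates below are invented assumptions for illustration, not values from the cited models:

```python
def debt_after(releases, upgrade_cost=5.0, workaround_rate=0.2):
    """Accumulated principal (person-days) after a number of releases.

    Each release that defers the library upgrade adds a fixed migration
    cost, and new code built on workarounds adds a fraction of the
    existing principal on top (the compounding effect).
    """
    principal = 0.0
    for _ in range(releases):
        principal += upgrade_cost                  # another deferred upgrade
        principal += workaround_rate * principal   # workaround code piled on
    return principal

# Linear accumulation (no workarounds) vs. compounding growth.
linear = [debt_after(n, workaround_rate=0.0) for n in range(1, 6)]
compounding = [debt_after(n) for n in range(1, 6)]
# linear:       5.0, 10.0, 15.0, 20.0, 25.0
# compounding:  6.0, 13.2, 21.84, ... — already outpacing the linear plan
```

Even with a modest workaround rate, the principal grows faster than linearly, which is exactly the situation where the repayment cost climbs higher than the original decision planned for.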

1. P. Kruchten, R. L. Nord, and I. Ozkaya, “Technical debt: From metaphor to theory and practice,” Software, IEEE, vol. 29, no. 6, pp. 18-21, Nov. 2012.
2. A. Nugroho, J. Visser, and T. Kuipers, “An empirical model of technical debt and interest,” in Proceedings of the 2nd Workshop on Managing Technical Debt, ser. MTD ’11. New York, NY, USA: ACM, 2011, pp. 1-8.
3. N. Zazworka, A. Vetro’, C. Izurieta, S. Wong, Y. Cai, C. Seaman, and F. Shull, “Comparing four approaches for technical debt identification,” Software Quality Journal, vol. 22, no. 3, pp. 403-426, 2014.

Beware of overdoing the TD metaphor

Working in the area of software quality consulting and analysis tool development at CQSE, I agree that the technical debt metaphor captures some relevant aspects of software quality management: the option to decide for reduced quality (taking on debt) in exchange for more features or faster time-to-market, the fact that some quality issues will hit you hard later on (interest), or the problem that the accumulation of too many quality issues might make further development of a software system impossible (just as too much debt might break a company). Consequently, our company closely follows and actively participates in the research on recognizing and managing technical debt.

Still, we try to avoid the actual term technical debt, both in our own tools and when dealing with our customers (unless, of course, the customer already uses the metaphor). Our main reason is that the metaphor is often overdone and its users tend to see too many parallels to its financial counterpart.

Technical debt can never be measured precisely

Financial debt is easily expressed in exact numbers; just check your bank account. Technical debt has a lot of fuzziness associated with it, both in the amount of debt (how “expensive” is a code clone, a god class, or an architectural flaw?) and the cost of repaying it (developer efficiency varies a lot, the complexity of a refactoring not only depends on the type of issue, but also on the context). Using the term debt, practitioners often assume an amount of precision that just does not exist. This is further promoted by some analysis tools that report a single exact number for the technical debt.
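One way to avoid such false precision is to report technical debt as a range rather than a single figure. The sketch below uses the issue types from the text, but all of the per-issue estimates are invented for illustration:

```python
# Per-issue remediation estimates in person-hours: (optimistic, pessimistic).
# The issue types come from the text; the numbers are illustrative guesses.
issues = {
    "code clone":         (0.5, 4.0),
    "god class":          (8.0, 40.0),
    "architectural flaw": (40.0, 200.0),
}

low = sum(lo for lo, hi in issues.values())
high = sum(hi for lo, hi in issues.values())
report = f"Estimated technical debt: {low:.1f} to {high:.1f} person-hours"
# Reporting only a single number (say, the midpoint) would suggest a
# precision that the underlying estimates simply do not have.
```

The width of the resulting interval is itself useful information: a very wide range signals that the estimates, and any decisions based on them, carry substantial uncertainty.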

Technical debt should not be expressed in terms of money

The term technical debt is often interpreted as a way to describe quality issues in terms of money. Some approaches and tools even calculate a dollar amount for technical debt. However, it is very dangerous to compare this amount with others, such as the estimated redevelopment cost of a system. While this might seem like an easy decision — just compare two dollar values — the actual rebuild vs. refactor decision has a lot more facets.

Technical debt does not always have to be repaid

With financial debt, we usually expect to pay off the debt eventually. As such, the debt is something to plan for in the future. With technical debt, you might get away without paying anything at all. In fact, sometimes (e.g., by removing features that are no longer needed) parts of the technical debt just disappear without any real effort.

Interest in technical debt is not uniform

With financial debt, you agree on an interest rate and can plan, quite precisely, the amount of money you will have to pay in a couple of years. This is not true for quality issues. Many are local and can be fixed or ignored easily, while a few will hit you very hard. But most of the time, you do not know which are the bad ones. While it might work on average to assume a fixed interest rate, the variability is pretty high.
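This variability can be made concrete with a small sketch: two sets of quality issues with the same average interest but very different risk profiles (all numbers invented):

```python
# Extra hours paid per change, for ten quality issues each (invented data).
uniform = [2.0] * 10             # every issue costs a little, predictably
skewed = [0.5] * 9 + [15.5]      # most issues are cheap, one hits very hard

avg_uniform = sum(uniform) / len(uniform)
avg_skewed = sum(skewed) / len(skewed)
# Both portfolios have the same average interest of 2.0 hours per change,
# so a fixed-rate plan treats them identically...
worst_share = max(skewed) / sum(skewed)
# ...yet in the skewed set a single issue accounts for about 78% of all
# interest paid, and you rarely know in advance which issue it is.
```

The average alone, in other words, hides exactly the information that matters for deciding which issues to fix first.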

You never get a complete view on technical debt

As stated before, as long as your accounting is in order, you can easily get an overview of your financial debt. With technical debt, many practitioners limit their view to a single aspect, such as code smells or “issues that are detected by a certain analysis tool”. Even combining several tools and expert knowledge, it is very likely that you will miss some issues. The risk here is assuming that this view captures all of your debt, and hence focusing too much on these issues while ignoring other problems completely.

Should we drop the metaphor?

While I see many risks in using the metaphor too freely, it is also a good tool for discussing certain characteristics of software quality. However, I strongly advise using the term technical debt carefully and always keeping in mind that it is only a metaphor. Drawing parallels to financial debt can be misleading and should always be checked carefully.


Technical Debt in Scientific Research Software

I am going to take a phrase out of Technical Debt: From Metaphor to Theory and Practice, “hinders future development”, as my primary focus for technical debt []. Using this definition, the failure to deliver a promised feature constitutes technical debt only if the failure puts the future of the project in jeopardy. The definition does allow undiscovered problems to contribute to the total debt. Other issues, such as bad coding practices, also constitute technical debt only if the effort to repair them later threatens the future viability of the project.
The scope of my interest includes the entire ecosystem in which a project is situated. Most projects not only produce technical debt; they are also influenced in some way by the technical debt associated with the assets they consume. A socio-technical ecosystem includes a comprehensive supply network, which may be the source of significant technical debt; the consumers of the products, who may be significantly impacted by technical debt; and even their competitors, who may force a product release earlier than anticipated. In the current case, the competitors are research groups attacking the same problems and competing for research funds from the same sources.
We have spent the last two years exploring the software ecosystems created around collaborative scientific research projects [ ] []. Most of these projects are based in a research group at one university, which manages a set of collaborators at other organizations. These projects accumulate a significant amount of technical debt due to ineffective planning, lack of knowledge of software development, and lack of quality assurance. Even the very largest scientific projects, like the CERN Super Collider and projects at national laboratories that hire professional programmers, still accumulate technical debt [].
Some technical debt is to be expected in a research project, since a large amount of uncertainty is associated with planning and carrying out research activities. Time-boxed schedules are widely used in industry, and most agile sprints are time-boxed. Function-boxed schedules are less strictly timed but widely used in research projects, since it is easier to identify the functions that need to be computed than to estimate how long it will take to build them. Neither approach reduces the uncertainty in how long it will take to produce a product, but time-boxed schedules do provide a more constrained environment in which progress is assessed regularly and work-in-progress is easily identified.
Briefly, here are several observations we have made that appear to influence the amount of technical debt in an ecosystem:
• Much scientific software is developed by persons trained in scientific disciplines but not trained in software development. Often these people are also focused on a very small piece of the total system.
• Architecture debt results from the perspective that an experiment must be well designed, but that software which is cobbled together and works for one-time use is good enough.
• Work is managed as projects, with project managers (graduate students) having short-term responsibility, as opposed to having product managers responsible for the long-term viability of the product.
• These collaborative scientific research projects usually have informal governance structures among the collaborating research projects. Decisions are reached informally by consensus of the entire group or the affected sub-group and decisions often change rapidly or are ignored.
• Often, mistakes in scientific software are not discovered until the computational results and the real world become sufficiently inconsistent to demand attention.
To be continued…