Measuring and Managing Technical and Social Debt at once

Ipek is probably going to kill me 🙂 since she may still be skeptical about this social debt thingie, but in my qualitative research experience in industry I have observed at least four times (i.e., in four different companies) an intense correlation (and, more often than not, causality) between social and technical debt. The force that Philippe, Hans and I called social debt a while ago is the result of accumulated sub-optimal socio-technical decisions (for example, choosing an interaction protocol based on wikis instead of emails) and often results in circumstances that force technical debt onto would-be perfect code-bases. A trivial example I observed lately was the addition of a new outsourced partner, which forced architecture changes. In this sea of madness, a key research question I’m struggling to address now is:

What are the factors at play around the relation between social and technical debt? How can these be managed?

In my previous work, I stumbled more than once on this intense and often unpredictable relation, which resulted, for example, in a series of organizational and socio-technical patterns. One example is the social debt conceptual model we developed as a result of an industrial case study:

[Figure: social debt conceptual model]

Also, there were several patterns we observed focusing on software architecture… These seemed to suggest ways in which architectures are themselves a pivot for driving the discovery and often the management of both social and technical debt… most notably, the “Architecture by Osmosis” social debt pattern from [2]:

[Figure: the “Architecture by Osmosis” social debt pattern]

We witnessed this very pattern several times in a large industrial player extremely active in the aerospace software market… Essentially, disgruntled clients would call up operators, who would accommodate part of the required changes and alert developers, who would accommodate another part of the changes and alert architects, who would change the architecture and connected decisions with little or none of the information, which inevitably gets lost along the communication chain, i.e., the osmosis chain. In the same paper, we tried to offer a tentative metrics framework that takes several organisational and social measurements to understand what the impact of social debt connected to miscommunication could be… This leads to a key challenge that I’ll be trying to address in the near and not-so-near future:

A selected subset of different metrics and patterns for technical debt should be used or correlated to address and further investigate the interactions between technical debt and its social / organizational counterparts.

[1] Damian A Tamburri, Philippe Kruchten, Patricia Lago and Hans van Vliet “Social debt in software engineering: insights from industry” Journal of Internet Services and Applications – (2015) 6:10 DOI 10.1186/s13174-015-0024-6

[2] D.A. Tamburri, E. Di Nitto, “When Software Architecture Leads to Social Debt” WICSA 2015 Montreal, QC, Canada – May 4, 2015 to May 8, 2015 ISBN: 978-1-4799-1922-2

A Tale of Three Stories on Technical Debt Estimation

1. TD is omnipresent but industry needs estimates (Industrial Report)
The report has been compiled based on feedback from a Senior Software Architect
Domain: Online marketing solutions (providing manufacturers insights for their products on retailer site).
Company profile: Startup company with offices in 3 different EE countries. Serving large enterprises in three continents. 50 employees, development team of 30 people spread in two countries.
Brief software description: Single platform comprised of numerous components. Written in C# and .NET, deployed on a cloud computing platform. Central DB where all components have access. The platform is mainly comprised of backend services used to import products from a manufacturer into the database, and then match these products to supported retailers. Crawlers deployed in Virtual Machines are used to find retailers.
Symptoms of design problems – classified according to mapping study by Li et al. (2015):
1. Deployment (Infrastructure TD) of crawlers in virtual machines makes it almost impossible to integrate them with other subsystems. They do not expose an API so other components can invoke methods.
2. Direct access to the database (Architectural TD). Each component communicates directly with the database. This makes the data fragile as invalid data can be introduced from any point. Moreover, any change in the database schema may break the dependent components.
3. Lack of tests (Test TD). There are no automated tests (unit/integration tests) for individual components. As a result, any change in the imported data affects the crawlers and there is no way to validate that new implementations will not alter existing functionality.

Do the managers and developers perceive the problems as a form of TD?
Yes, they do. There is no commonly agreed definition of TD for the platform; however, everyone realizes how difficult it is to make changes to the code and how fragile it is. All developers understand that the platform needs some kind of refactoring.

Are managers and others willing to deal with technical debt?
Not surprisingly, the main concern of managers is how fast a novel feature will be in production. Nevertheless, management clearly sees that by ‘quick and dirty’ implementation without proper design and planning, bugs show up at increasing levels. Clients’ disappointment, which is often clearly communicated to the company, is also a direct consequence. When symptoms of technical debt are discussed, management seems to perfectly understand it; however they are somehow unwilling to make extensive changes to the platform, claiming that they do not have an accurate estimate for the required effort.

Is TD getting bigger over time?
Definitely yes. Supporting new retailers requires the addition of new crawlers. In the existing design, the higher the number of retailers (i.e. the bigger the business is) the harder it is to make changes to the product matching services (all crawlers must change).

Is it difficult to determine the effort to deal with technical debt?
It’s a kind of a vicious circle: The lack of boundaries between different components of the platform renders any effort to identify and measure technical debt almost impossible. However, without convincingly accurate estimates any plans for drastic changes, such as proper modularization, are abandoned.

What is the main reason for the accumulation of TD?
The platform started as a trial product with a poor initial design. Continuous pressure for the incorporation of new features and a marketing approach claiming an “instant-development time” model led to the accumulation of TD in all components. Moreover, a lack of processes, especially during the initial stages, resulted in incompletely specified requirements, poor architectural documentation and inefficient testing.

2. Are TD estimates accurate? (Academic Experience)
Context: Two CRUD web applications with state-of-the-art technology have been developed (Java-based enterprise applications, one with the Spring Web MVC and one with the Apache Struts 2 framework). The systems evolved over successive ‘versions’ by gradually adding features. Each system was relatively small (~36 classes). Development time was approximately 25 man-days for each application.
Goal: a) To measure the TD that results from following the strict programming practices imposed by the employed frameworks on typical Web applications. b) To investigate whether TD increases with the passage of versions.
Means: TD measured in both applications by SonarQube platform according to the SQALE methodology. Type of suggested refactorings and actual time to resolve reported issues have been recorded.
Results:
Spring-based system – TD: 1d 4h (only Major, Minor, Info issues) -> Actual time to resolve: less than 2 hours
Struts-based system – TD: 1d 6h (only Major, Minor, Info issues) -> Actual time to resolve: less than 2 hours

Observations
• Framework-based development does not lead to blocker/critical issues (limited TD)
• No tremendous improvement achieved by repaying TD
• Required effort was significantly lower than estimates
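
To make the gap between estimate and actual effort concrete, the figures above can be turned into a rough overestimation factor. This is only a sketch: it assumes one SQALE "day" equals 8 working hours (believed to be the SonarQube default, but not stated in the report), and it takes the reported "less than 2 hours" as an upper bound of 2 hours, so the resulting factors are lower bounds.

```python
# Assumption: one SQALE "day" is 8 working hours (not stated in the report).
HOURS_PER_DAY = 8

# "Less than 2 hours" is treated as an upper bound on the actual effort.
ACTUAL_UPPER_BOUND_HOURS = 2


def sqale_to_hours(days: int, hours: int) -> int:
    """Convert an estimate given as days and hours into hours."""
    return days * HOURS_PER_DAY + hours


# TD estimates reported for the two systems.
estimates = {
    "Spring": sqale_to_hours(1, 4),
    "Struts": sqale_to_hours(1, 6),
}

# Lower bound on how many times the estimate exceeds the actual effort.
overestimation = {
    name: est / ACTUAL_UPPER_BOUND_HOURS for name, est in estimates.items()
}

for name, factor in overestimation.items():
    print(f"{name}: estimated {estimates[name]} h, "
          f"at least {factor:.0f}x the actual effort")
```

Under these assumptions the estimates exceed the actual effort by a factor of at least six for both systems.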

3. Estimating Technical Debt (our vision)
We believe that assessing software quality by universal thresholds is risky and flawed: for example, prohibitively high complexity in one domain (e.g. Information Management Systems) might be completely reasonable in another domain (e.g. image processing).
TD should be assessed realistically, that is against the potentially achievable levels of TD for each system under study. Let us assume any model where TD is assessed by a function that considers various parameters of interest. Such a quantifiable measure can serve as a fitness function to drive an optimization approach (Harman & Clark, 2004), that is, a process of obtaining the design/system which optimizes the selected fitness function.
Having the fitness function value for the ‘actual’ and the ‘optimum’ system one can determine the distance between the two. This is a single figure which can quantify the principal of TD. Under certain conditions, one can determine the required actions (refactoring effort) to move the actual system to the optimum position, i.e. to cover the entire or part of the distance.
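
The idea of TD as a distance can be sketched on a toy example. Everything below is an assumption made for illustration: the five classes, their dependencies, the fitness function (number of dependencies crossing module boundaries) and the module size limit are hypothetical, and the exhaustive search only works for tiny systems; a real setting would need a search-based optimizer and a richer TD function.

```python
from itertools import product

# Hypothetical toy system: classes and the dependencies between them.
CLASSES = ["A", "B", "C", "D", "E"]
DEPENDENCIES = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "E"), ("C", "D")]
MODULES = ["m1", "m2"]
MAX_MODULE_SIZE = 3  # assumed constraint; rules out the trivial one-module design


def fitness(assignment: dict) -> int:
    """Toy TD function: dependencies crossing module boundaries (lower is better)."""
    return sum(1 for a, b in DEPENDENCIES if assignment[a] != assignment[b])


def feasible(assignment: dict) -> bool:
    """Check that no module exceeds the assumed size limit."""
    placed = list(assignment.values())
    return all(placed.count(m) <= MAX_MODULE_SIZE for m in MODULES)


def optimum_design() -> dict:
    """Exhaustively search all class-to-module assignments for the best fitness."""
    candidates = (
        dict(zip(CLASSES, combo))
        for combo in product(MODULES, repeat=len(CLASSES))
    )
    return min((c for c in candidates if feasible(c)), key=fitness)


# The 'actual' design of the hypothetical system.
actual = {"A": "m1", "B": "m2", "C": "m1", "D": "m2", "E": "m1"}
optimum = optimum_design()

# TD principal as the distance between actual and optimum fitness values.
td_principal = fitness(actual) - fitness(optimum)

# Refactoring actions (Move Class) that would cover the whole distance.
moves = [c for c in CLASSES if actual[c] != optimum[c]]

print(f"actual={fitness(actual)}, optimum={fitness(optimum)}, "
      f"TD principal={td_principal}, classes to move={moves}")
```

The list of classes to move is what maps the distance to concrete refactoring effort; covering only part of the distance corresponds to applying a subset of those moves.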

Vision

Benefits from this perspective (measuring TD as distance):
• TD is realistic: it is assessed against the potentially achievable quality for any system depending on its characteristics
• TD can be quantified by automated and thus consistent means
• TD principal can be mapped to actual refactoring activities required to cover the distance
• Based on historical data, one can reason about the benefit of addressing TD and find out how much maintenance effort is saved
• It becomes possible to assess not only negative but also positive contribution to TD, i.e. actions that reduce TD

References
Harman, M., & Clark, J. (2004). Metrics are fitness functions too. In 10th International Symposium on Software Metrics. Proceedings, pp. 58–69. http://doi.org/10.1109/METRIC.2004.1357891
Li, Z., Avgeriou, P., & Liang, P. (2015). A systematic mapping study on technical debt and its management. Journal of Systems and Software, vol. 101, March 2015, pp. 193-220.

How to measure architectural technical debt?

Context
My company (ABB) conducted a survey among hundreds of developers, architects, and product managers, asking about the sources of technical debt in their products. Many participants pointed to “poor architecture choices” as an important source of technical debt, claiming that they do not have a repeatable process to deal with such issues. The presence of poor architecture choices first needs to be established in an objective manner. Then the costs and potential benefits of addressing each debt item need to be determined. Afterwards, the architectural debt items can be prioritized for repayment.

Research Question
How to measure architectural technical debt reliably, objectively, and in a repeatable manner?

Own Work
As architectural documentation is often not in a good condition to be analyzed, the source code and the architects and developers themselves are usually the best information source for measuring architectural debt. Existing source code analysis tools (e.g., SonarQube, NDepend, etc.) only include a few architectural metrics (Survey of Metrics), and mainly focus on design-level issues. We have extended and applied such tools on large-scale ABB software systems with varying results (IEEE Software 2013).
We have also worked on formal structural architecture models (IEEE Software 2011), as well as on formal architecture decision models, as a prerequisite for automated technical debt analysis (WICSA 2014, WICSA 2015, available as Open Source at CodePlex and GitHub).
To interview architects about technical debt, ATAM workshops have proven to be a useful tool to overcome a lack of communication that often leads to technical debt.

Related work
Architecture metrics for source code which are proposed in literature (e.g., Sakar2007, Bouwers2009) often require some subjective parameters as input; therefore, the metric values may not be comparable across products. Architecture decisions that are not captured in source code, or only indirectly, e.g., the choice of a third-party library or the architectural constraints to assure conceptual integrity, are difficult to capture and analyze for architectural debt. Some case studies on architectural debt have been carried out (Bouwers2013, Martini2015).

Future work
More and better architecture metrics need to be established and validated for source code, design documents and requirements. Community benchmarks based on Open Source Systems could help improve existing metrics.
To analyze the technical debt coming out of poor architecture decisions, there is a need to capture decisions more formally, without overwhelming the users. Technical debt analysis tools for such models incorporating software economic models would be the next step.

Hi all, first of all have a great 2016!

Key challenge

There is a need for a Technical Debt prioritization mechanism to compare items among themselves for decision making.
I’ve been working in contact with several practitioners who struggle to decide what to refactor and when. The challenge is having fixed resources and hours, or having to subtract such hours (and motivate such subtraction) from feature development. Usually companies have more TD than it’s possible to fix, and deciding becomes extremely difficult, especially for big TD items such as the Architectural ones.

Our recent work

We have written a paper that has been recently accepted at ICSE, SEIP track 2016 (A. Martini, J. Bosch, “An Empirically Developed Method to Aid Decisions on Architectural Technical Debt Refactoring: AnaConDebt”).
We have analyzed 12 cases of Architectural TD in 6 companies and we have developed and evaluated a decision-making approach based on the ratio principal/interest. This ratio allows the comparison of TD items among themselves and with respect to different points in time (which tells the practitioners whether it’s convenient to refactor now or later). We found that this approach, after several improvement steps, was much appreciated by the practitioners, who are actually introducing it in practice. However, we found that both factors (principal and interest) need to be broken down into several components (and usually there are many involved). First, it’s important to find all the components constituting principal and interest (and a checklist is very much appreciated by the practitioners). We also found that it’s difficult to anticipate all the interest (especially the long-term one) in the beginning, and therefore it’s better to implement an iterative process that takes into consideration different evolutions of TD using a roadmap (which is subject to variation). Also, the study shows that, if present, the interest on the principal makes a TD item less convenient to pay off over time, and this is crucial information. We have developed some indicators that can help the practitioners make the decision. However, we also found that metrics need to be mixed with qualitative information in order to provide an acceptable overview for decision-making, and that this step is not obvious.
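
As a rough illustration of how a principal/interest ratio can compare items and points in time, here is a minimal sketch. It is not the method from the paper: the `TDItem` fields, the compound growth of the principal, the constant yearly interest and all numbers are hypothetical assumptions, and a lower ratio is simply read as "more convenient to refactor".

```python
from dataclasses import dataclass


@dataclass
class TDItem:
    """Hypothetical TD item; all fields are illustrative assumptions."""
    name: str
    principal_now: float      # refactoring cost today, e.g. in person-days
    principal_growth: float   # assumed yearly growth of the principal
    interest_per_year: float  # assumed extra effort paid per year if not fixed

    def principal_at(self, years: float) -> float:
        """Refactoring cost if the refactoring is postponed by `years`."""
        return self.principal_now * (1 + self.principal_growth) ** years

    def ratio(self, refactor_at: float, horizon: float) -> float:
        """Principal divided by the interest avoided until `horizon`."""
        if refactor_at >= horizon:
            raise ValueError("refactoring must happen before the horizon")
        interest_avoided = self.interest_per_year * (horizon - refactor_at)
        return self.principal_at(refactor_at) / interest_avoided


items = [
    TDItem("item-a", principal_now=40, principal_growth=0.25, interest_per_year=30),
    TDItem("item-b", principal_now=10, principal_growth=0.0, interest_per_year=4),
]

HORIZON_YEARS = 3  # hypothetical roadmap length
for item in items:
    now = item.ratio(refactor_at=0, horizon=HORIZON_YEARS)
    later = item.ratio(refactor_at=1, horizon=HORIZON_YEARS)
    print(f"{item.name}: ratio now={now:.2f}, in one year={later:.2f}")
```

In this toy setup postponing always raises the ratio, and it does so faster for the item whose principal grows over time, which mirrors the observation that interest on the principal makes an item less convenient to pay off later.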

Other work

The main idea of prioritizing technical debt using a cost/benefit indicator has been introduced in a short paper published at the Technical Debt workshop by Guo and Seaman, but I haven’t found much follow-up work providing a comprehensive indicator to continuously evaluate a large TD item. It seems that we researchers have been focusing on single measures but not on an overall decision-making process, which is much needed.

Future work

With the recent study we have taken a first step. In the cases we used some measurements, but the availability and units of the metrics depend on the specific company. However, the high-level components constituting principal and interest seem to be quite generic (apart from a few specific ones). Going from basic measurements to an indicator-for-decision is not obvious and needs to be better studied, especially when a number of measures and pieces of qualitative information related to many factors are involved in calculating principal and interest.

Sustainability Debts

The recent manifesto for sustainable design has identified a number of dimensions that can directly or indirectly influence the sustainability and longevity of software. Roughly speaking, these dimensions relate to individual, social, economic, environmental and technical concerns.

Claim:
It can be argued that sustainable software should deliver and sustain its value across these dimensions, both while the system is in operation and as it evolves. The presence of debt on any of these dimensions is a threat to sustainability and may call for phasing out the software.

Research Questions:
How does the concept of debt and interest on the debt relate to the sustainability dimensions?
How can we predict, quantify and visualize the debt across these dimensions?
How can we manage the debt across these dimensions: negotiate and reconcile the conflicting objectives?

Closely Related Effort:
B. Ojameruaye and R. Bahsoon (2015). Sustainability Debt: An Economics driven approach for Using Technical Debt Analysis in Decision Making for Sustainable Requirements. CSR-15-03 Technical Report. School of Computer Science, University of Birmingham, UK (under submission – ICSE Software Engineering for Society).

Technical Debt Through Analogy: Call for “Twin Assets”, Benchmarks, Artifacts and Repositories

Research Questions:

How can we predict the likely technical debt before we build the software? Can estimation through analogy, which is respectable in science, help here? How can the concept of “twin assets” in finance serve the above objectives? How can we identify these twins? The twins can relate to various concerns – how can we decide on these concerns, manage and reconcile the relative debts?

Need:
As a community, we may need to define benchmarks and call for artifacts and repositories which can assist the estimation of debts through analogies. Such infrastructure can be particularly useful in estimating debts that relate to non-functional properties, which are difficult to predict and visualize even in existing software.

An Example – Closely related work:

Though the work does not mention “technical debt” in its exposition, the treatment of debt is implicit. We used spec.org (a performance benchmark repository) to estimate the value of architecture decisions under uncertainty, relative to scalability scenarios of interest:

R. Bahsoon and W. Emmerich: An Economics-Driven Approach for Valuing Scalability in Distributed Architectures, WICSA 2008

Drawing on a case study that adequately represents a medium-size component-based distributed architecture, we show how existing performance repositories could be mined to value the ranges in which a given software architecture can scale to support likely changes in load. The mining is based on a financial analogy, where we utilize the concept of twin asset in financial engineering to justify mining relevant repositories. The mining process is then complemented with real options analysis for predicting the values resulting from the ranges in which an architecture can scale under uncertainty, where uncertainty is attributed to the unpredicted change in load. As the exact method for analyzing scalability is subject to debate, we focus the analysis on throughput as a way for measuring scalability. Using options analysis, we report on how ranges in which an architecture can scale, can inform the selection of distributed components technology and subsequently the selection of application server products.

I have recently tailored the above analysis to the benefit of cloud service selection. Work published in MTD 2013, UCC 2013 and Australian Software Engineering 2014.

A process for managing Architecture Technical Debt

In our previous work we proposed a process for Architecture Technical Debt Management (ATDM).
Figure 2

The details of each ATDM activity, as well as their inputs and outputs in the architecting process, are described below.
1. ATD identification detects ATD items during or after the architecting process. An ATD item is incurred by an architecture decision, thus, one can investigate an architecture decision and its rationale to identify an ATD item by considering whether the maintainability or evolvability of the software architecture is compromised.
2. ATD measurement analyzes and estimates the cost and benefit associated with an ATD item, including the prediction of change scenarios influencing this ATD item for interest measurement. For interest measurement, three types of change scenarios are considered: (1) the planned new features according to the version plan of the software project; (2) the already-known maintenance tasks that enhance specific QAs (except maintainability and evolvability) of the implemented software architecture; and (3) the emerging requirements. The first two types of change scenarios can be predicted, while the third is unforeseeable. For some complex software systems (e.g., operating systems), the time interval between two releases can be very long. For instance, Microsoft Windows 7 Service Pack 1 was released 16 months after the first release of Microsoft Windows 7. For such software systems, it is inevitable that new requirements emerge during the development of a new release. Some of these new requirements need to be implemented in the release. Thus, in such situations, to ensure a reasonable accuracy of interest measurement, the interest of related ATD items should be re-measured at different times during the development of the release.
3. ATD prioritization sorts all the identified ATD items in a software system using a number of criteria. The aim of this activity is to identify which ATD items should be resolved first and which ones can be resolved later, depending on the system’s business goals and preferences. A software system contains a number of ATD items, and not all of them can be resolved at once due to their cost or technical issues. The ATD items have different financial and technical impacts on the system. Consequently, it is wise to choose the items with higher priorities to be resolved first. Software projects have different contexts, and there are no standard criteria to decide the priority of an ATD item in a project. However, the following factors need to be taken into account in ATD prioritization: (1) the total cost of resolving an ATD item; (2) the cost/benefit ratio of the ATD item; (3) the interest rate of the ATD item; (4) how long ago the ATD item was incurred; (5) the complexity (e.g., the number of involved components of an ATD item) of resolving an ATD item. Since not all types of benefits can be measured in a unified metric, it is hard to automatically prioritize the ATD items by tooling. However, an appropriate tool, which reasonably deals with the factors described above, can facilitate ATD prioritization.
4. ATD repayment concerns making new or changing existing architecture decisions in order to eliminate or mitigate the negative influences of an ATD item. An ATD item is not necessarily resolved at once. In certain situations, only part of an ATD item is resolved, because it could be too expensive to resolve the entire ATD item, and resolving part of it can bring the ATD item under control at an acceptable cost. When an ATD item is partially resolved, the ATD item will be revised and split into two parts: the part that is resolved and the part that is not.
5. ATD monitoring watches the changes of the cost and benefit of unresolved ATD items over time. When an architectural change happens in the part of architecture design containing an unresolved ATD item or when one ATD item is partially resolved, the affected ATD item will be recognized as a changed ATD item. All the changed ATD items will be measured in the next ATDM iteration. This ATDM activity makes ATD changes explicit and consequently keeps all the ATD items of the system under control.
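
The prioritization, repayment and monitoring activities above can be sketched as a small data model. This is only an illustration under stated assumptions: the `ATDItem` fields mirror the five prioritization factors, but the ordering rule (highest interest rate first, ties broken by the lower cost/benefit ratio), the item names and all numbers are hypothetical, since the text notes there are no standard criteria.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ATDItem:
    """Hypothetical record of one ATD item and its prioritization factors."""
    name: str
    cost: float            # (1) total cost of resolving the item
    benefit: float         # benefit of the item, used for (2)
    interest_rate: float   # (3) interest rate of the item
    age_months: int        # (4) how long ago the item was incurred
    components: int        # (5) number of involved components
    changed: bool = False  # set by monitoring; triggers re-measurement

    @property
    def cost_benefit_ratio(self) -> float:
        return self.cost / self.benefit


def prioritize(items):
    """Assumed ordering: highest interest rate first, then lower cost/benefit ratio."""
    return sorted(items, key=lambda i: (-i.interest_rate, i.cost_benefit_ratio))


def repay_partially(item, fraction):
    """Split an item into a resolved part and a remaining part flagged as changed."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    resolved = replace(item, name=f"{item.name} (resolved)",
                       cost=item.cost * fraction)
    remaining = replace(item, name=f"{item.name} (remaining)",
                        cost=item.cost * (1 - fraction), changed=True)
    return resolved, remaining


def to_remeasure(items):
    """Monitoring: changed items are measured again in the next ATDM iteration."""
    return [i for i in items if i.changed]


backlog = [
    ATDItem("shared-database", cost=60, benefit=20, interest_rate=0.30,
            age_months=24, components=8),
    ATDItem("missing-service-api", cost=25, benefit=10, interest_rate=0.30,
            age_months=18, components=3),
    ATDItem("legacy-logging", cost=10, benefit=8, interest_rate=0.05,
            age_months=36, components=2),
]

ordered = prioritize(backlog)
resolved, remaining = repay_partially(backlog[0], fraction=0.25)
print([i.name for i in ordered])
print([i.name for i in to_remeasure([resolved, remaining])])
```

A different project could swap in any other sort key over the same factors; the point of the sketch is that the factors are recorded per item and that a partial repayment leaves behind a changed item for the next iteration.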

References

  • Z. Li, P. Liang, P. Avgeriou, Architectural Debt Management in Value-oriented Architecting, in I. Mistrik, R. Bahsoon, Y. Zhang, K. Sullivan, R. Kazman (eds.), Economics-Driven Software Architecture, Elsevier, 2014, Pages 183-204.