A Tale of Three Stories on Technical Debt Estimation

1. TD is omnipresent but industry needs estimates (Industrial Report)
The report has been compiled based on feedback from a Senior Software Architect.
Domain: Online marketing solutions (providing manufacturers with insights into their products on retailer sites).
Company profile: Startup company with offices in 3 different EE countries, serving large enterprises on three continents. 50 employees; development team of 30 people spread across two countries.
Brief software description: Single platform composed of numerous components, written in C# and .NET and deployed on a cloud computing platform. A central DB is accessed by all components. The platform consists mainly of backend services that import products from a manufacturer into the database and then match these products to supported retailers. Crawlers deployed in virtual machines are used to find retailers.
Symptoms of design problems – classified according to the mapping study by Li et al. (2015):
1. Deployment of crawlers in virtual machines (Infrastructure TD) makes it almost impossible to integrate them with other subsystems: they expose no API through which other components could invoke their methods.
2. Direct access to the database (Architectural TD). Each component communicates directly with the database. This makes the data fragile, since invalid data can be introduced at any point. Moreover, any change in the database schema may break the dependent components.
3. Lack of tests (Test TD). There are no automated tests (unit/integration tests) for individual components. As a result, any change in the imported data affects the crawlers and there is no way to validate that new implementations will not alter existing functionality.

Do the managers and developers perceive the problems as a form of TD?
Yes, they do. Although there is no commonly agreed definition of TD for the platform, everyone recognizes how difficult it is to change the code and how fragile the code is. All developers understand that the platform needs some kind of refactoring.

Are managers and others willing to deal with technical debt?
Not surprisingly, the main concern of managers is how fast a new feature will reach production. Nevertheless, management clearly sees that ‘quick and dirty’ implementation without proper design and planning causes bugs to appear at an increasing rate. Clients’ disappointment, which is often clearly communicated to the company, is a direct consequence. When symptoms of technical debt are discussed, management seems to understand them perfectly; however, they are somewhat unwilling to make extensive changes to the platform, claiming that they do not have an accurate estimate of the required effort.

Is TD getting bigger over time?
Definitely yes. Supporting new retailers requires the addition of new crawlers. In the existing design, the higher the number of retailers (i.e. the bigger the business is) the harder it is to make changes to the product matching services (all crawlers must change).

Is it difficult to determine the effort to deal with technical debt?
It is a vicious circle: the lack of boundaries between the components of the platform renders any effort to identify and measure technical debt almost impossible. However, without convincingly accurate estimates, any plans for drastic changes, such as proper modularization, are abandoned.

What is the main reason for the accumulation of TD?
The platform started as a trial product with a poor initial design. Continuous pressure for the incorporation of new features and a marketing approach claiming an “instant-development time” model led to the accumulation of TD in all components. Moreover, a lack of processes, especially during the initial stages, resulted in incompletely specified requirements, poor architectural documentation and inefficient testing.

2. Are TD estimates accurate? (Academic Experience)
Context: Two CRUD web applications were developed with state-of-the-art technology (Java-based enterprise applications, one with the Spring Web MVC framework and one with the Apache Struts 2 framework). The systems evolved over successive ‘versions’ by gradually adding features. The systems were relatively small (~36 classes). Development time was approximately 25 man-days for each application.
Goal: a) To measure the TD that results from following the strict programming practices imposed by the employed frameworks on typical web applications. b) To investigate whether TD increases over successive versions.
Means: TD was measured in both applications with the SonarQube platform according to the SQALE methodology. The types of suggested refactorings and the actual time needed to resolve the reported issues were recorded.
Spring-based system – TD: 1d 4h (only Major, Minor, Info issues) -> Actual time to resolve: less than 2 hours
Struts-based system – TD: 1d 6h (only Major, Minor, Info issues) -> Actual time to resolve: less than 2 hours

• Framework-based development does not lead to blocker/critical issues (limited TD)
• No tremendous improvement achieved by repaying TD
• Required effort was significantly lower than estimates
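
The gap between estimated and actual effort can be made explicit with a little arithmetic. The sketch below converts the reported debt figures to hours and divides them by the 2-hour upper bound on the actual effort. The 8-hour working day is an assumption (the report does not state which day length the tool used), and the helper names debt_to_hours and overestimation_factor are hypothetical.

```python
import re

HOURS_PER_DAY = 8  # assumption: one "day" of debt equals 8 working hours


def debt_to_hours(text, hours_per_day=HOURS_PER_DAY):
    """Convert a duration such as '1d 4h' or '30min' to hours."""
    units = {"d": hours_per_day, "h": 1, "min": 1 / 60}
    total = 0.0
    for amount, unit in re.findall(r"(\d+)\s*(d|h|min)", text):
        total += int(amount) * units[unit]
    return total


def overestimation_factor(estimate, actual_hours):
    """Ratio of estimated remediation effort to actual effort."""
    return debt_to_hours(estimate) / actual_hours


# Figures reported above; the actual time was "less than 2 hours" in both cases,
# so 2 hours is used as an upper bound on the actual effort.
ESTIMATES = {"Spring": "1d 4h", "Struts": "1d 6h"}
ACTUAL_UPPER_BOUND_HOURS = 2

for system, estimate in ESTIMATES.items():
    hours = debt_to_hours(estimate)
    factor = overestimation_factor(estimate, ACTUAL_UPPER_BOUND_HOURS)
    print(f"{system}: estimate {hours:.0f}h, overestimated by at least {factor:.1f}x")
```

Under that assumption the estimates amount to 12 and 14 hours, so the effort was overestimated by a factor of at least 6 and 7 respectively. Because the actual time was below 2 hours, these factors are lower bounds.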

3. Estimating Technical Debt (our vision)
We believe that assessing software quality by universal thresholds is risky and flawed: for example, prohibitively high complexity in one domain (e.g. Information Management Systems) might be completely reasonable in another domain (e.g. image processing).
TD should be assessed realistically, that is against the potentially achievable levels of TD for each system under study. Let us assume any model where TD is assessed by a function that considers various parameters of interest. Such a quantifiable measure can serve as a fitness function to drive an optimization approach (Harman & Clark, 2004), that is, a process of obtaining the design/system which optimizes the selected fitness function.
Having the fitness function value for both the ‘actual’ and the ‘optimum’ system, one can determine the distance between the two. This is a single figure which can quantify the principal of TD. Under certain conditions, one can determine the required actions (refactoring effort) to move the actual system to the optimum position, i.e. to cover the entire distance or part of it.
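
A minimal sketch of this idea follows, under loudly stated assumptions: the fitness function (number of dependencies that cross class boundaries), the class-size limit, the toy design and the greedy hill-climbing search are all hypothetical choices made for illustration, not the model proposed here. Hill climbing only finds a local optimum, so the 'optimum' it returns is an approximation.

```python
from typing import Dict, List, Tuple

Design = Dict[str, str]       # method name -> owning class
Move = Tuple[str, str, str]   # (method, from_class, to_class)


def fitness(design: Design, dependencies: List[Tuple[str, str]]) -> int:
    """Hypothetical fitness: dependencies crossing class boundaries (lower is better)."""
    return sum(1 for a, b in dependencies if design[a] != design[b])


def optimize(design: Design, dependencies: List[Tuple[str, str]],
             max_size: int) -> Tuple[Design, List[Move]]:
    """Greedy hill climbing over move-method refactorings.

    Each round applies the single move that lowers the fitness most,
    never letting a class grow beyond max_size methods.
    """
    best = dict(design)
    moves: List[Move] = []
    classes = sorted(set(design.values()))
    while True:
        current = fitness(best, dependencies)
        candidate = None
        for method in sorted(best):
            for target in classes:
                if target == best[method]:
                    continue
                size = sum(1 for c in best.values() if c == target)
                if size >= max_size:
                    continue  # target class is already full
                trial = {**best, method: target}
                score = fitness(trial, dependencies)
                if score < current:
                    current, candidate = score, (method, best[method], target)
        if candidate is None:
            return best, moves  # no improving move left
        best[candidate[0]] = candidate[2]
        moves.append(candidate)


# Toy 'actual' design: methods c and e sit in the wrong class.
actual = {"a": "X", "b": "X", "c": "Y", "d": "Y", "e": "X", "f": "Y"}
deps = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e"), ("d", "f"), ("e", "f")]

optimum, refactorings = optimize(actual, deps, max_size=4)
td_principal = fitness(actual, deps) - fitness(optimum, deps)
print("TD principal (distance):", td_principal)
print("Refactorings to cover it:", refactorings)
```

In this toy case the distance is the drop in fitness between the actual design and the best design found, and the recorded moves are the refactoring actions that cover it. Applying only a prefix of the moves would cover part of the distance.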


Benefits from this perspective (measuring TD as distance):
• TD is realistic: it is assessed against the potentially achievable quality for any system depending on its characteristics
• TD can be quantified by automated and thus consistent means
• TD principal can be mapped to actual refactoring activities required to cover the distance
• Based on historical data, one can reason about the benefit of addressing TD and find out how much maintenance effort is saved
• It becomes possible to assess not only negative but also positive contribution to TD, i.e. actions that reduce TD

Harman, M., & Clark, J. (2004). Metrics are fitness functions too. In Proceedings of the 10th International Symposium on Software Metrics (pp. 58–69). http://doi.org/10.1109/METRIC.2004.1357891
Li, Z., Avgeriou, P., & Liang, P. (2015). A systematic mapping study on technical debt and its management. Journal of Systems and Software, 101, 193–220.