Community-Based Repository of Tools to Support Empirical Research

As with much of the rest of software engineering, state-of-the-art research on architectural technical debt, and on software architecture more broadly, is impeded by myriad disjoint research and development environments. The resulting “one-off” solutions inhibit further advances and make it difficult to systematically synthesize novel research techniques on top of existing ones and to cross-validate those techniques. As a result, researchers and practitioners who need to build cutting-edge architecture-based tools must often create their building blocks (e.g., software components and frameworks) from scratch. In doing so, they tend to unnecessarily repeat each other’s efforts and even to revert to solutions that had already been tried and established as ineffective.

We have identified the following five key challenges that the software engineering community faces when conducting research on technical debt and software architecture:

1 – MTD Tool Accessibility and Reusability of Research Techniques. Implementations of research techniques and tools are often hard to access, unavailable, defective, or no longer supported by their original creators. Even tools that do work commonly fail to operate as advertised, so adapting them for further MTD research requires major effort.

2 – Lack of Benchmarks and Datasets. Access to and construction of public artifacts, case studies, and benchmark datasets are challenges shared across the field of software engineering. For research on technical debt, these challenges are particularly pronounced because many of the factors contributing to technical debt (e.g., design decisions) tend to be undocumented. For instance, software architecture artifacts embody significant amounts of expert knowledge. In practice, developing such artifacts is expensive, organizations are reluctant to share them, and monopolizing architectural knowledge may create a perception of job security, further disincentivizing their construction and maintenance. As a result, the research community often relies on small datasets that lack the domain- or application-specific knowledge needed to accurately reconstruct most factors contributing to or impacting technical debt. It is thus necessary to develop an instrument that can aid in the generation, storage, and sharing of such artifacts by reverse engineering existing large-scale open-source systems. In earlier work, we examined the practicality of using various reverse-engineering approaches to obtain ground-truth architectures to partly address this challenge.

3 – Interoperability of Tools. Technical debt research is hampered by distributed research environments and stove-piped solutions emerging from different research groups. This, in turn, inhibits research advances, makes it difficult to synthesize techniques and tools in new ways, and complicates comparisons of research solutions. Researchers and practitioners in need of cutting-edge technical debt analysis must often recreate tools or their major elements, including basic code analysis, reverse-engineering functions, and frameworks. Furthermore, the differing assumptions that these tools make (e.g., about execution environments, data formats, and implementation languages) prevent their combined use, further inhibiting breakthroughs.

4 – Reproducibility of Experiments and Analyses. Because tools, datasets, and case studies are often inaccessible, non-reusable, or defective, and because the underlying tool assumptions are incompatible, it is difficult to reproduce the results of many previous studies on software architecture and technical debt. In software engineering research, reproduction is often too difficult or outright impossible, even for studies designed to be repeatable.

5 – Technology Transfer. Although practitioners understand, appreciate, and emphasize how critical software architecture and technical debt management are to the success of software systems, technology transfer in this area is hindered because most prototype tools are not sufficiently mature to support production-level or industrial usage. An overwhelming majority of software engineering research groups lack the resources (e.g., personnel and hardware) needed to build tools that are robust and scalable enough to be easily and effectively used by other researchers, let alone by industry-grade software projects.

I would like to acknowledge my collaborators on this project, Nenad Medvidovic, Sam Malek, and Josh Garcia, for their contributions in summarizing the community-based challenges and formulating solutions.