Of all the weird and wonderful analogies to building software that I’ve heard, the one that sticks with me is building a house. No analogy is perfect, and I’d love to see the Pub/Sub aisle at Home Depot, but it works well enough. The general contractor picks the right tools and frameworks for a solid foundation, skilled craftsmen frame the necessary modules and services, some great trades wire and pipe everything together, and then a designer works to make it look and feel like home. At this stage, projects are usually well scoped, and even if most of them feel agile, they’re still waterfall-ish in nature. The goal is still to have a house at the end of those iterations. What interests me is what comes after.
Technical debt is a concept that is often cited and often poorly understood. Many definitions oversimplify it, limiting it to just a handful of causes. The reality, however, is that technical debt accumulates even when nobody touches the code. Frameworks age and fall out of popularity, industry-standard techniques are replaced or refined, deficits in the original design are discovered, and the list goes on. On top of general age and rot, poor communication and planning can leave systems tightly coupled and not up to standards. Time pressure only adds to the list: cutting corners, writing poor or limited documentation, and skipping testing. All of these, to varying degrees, contribute to the accumulation of debt over time.
How do you combat this kind of entropic chaos? Some less experienced technical leaders might try to pay the debt all at once or, even worse, not at all. Some project leads call for extreme system decomposition into dozens of microservices. The idea is that each one can be independently rewritten over time, but this introduces a whole host of new engineering complexity. Some try to intersperse improvements as independent tasks, but then which debts get paid and which continue to accrue interest? The truth lies somewhere in the midst of all of these options.
Leaning back into the house analogy, I think there are some good concepts to borrow from the general contracting industry. There are two basic principles that I have seen good builders apply over the years: structure first, and bring things up to code. The first concept is easy: don’t invest in anything relating to finishing until the structure is sound. This means that foundational work such as critical security fixes, framework upgrades, and tooling changes should be addressed first. Otherwise, any other debt or improvement tasks that get completed are just begging to be redone. The second concept comes from a regulatory perspective, but I think it applies as well. When a contractor opens up a wall, they need to bring whatever is behind that wall up to code before the inspector signs off on the finished changes. As a result, the estimated cost of bringing things up to code is budgeted as part of the renovation itself. When dealing with a reasonably well-built newer house, that cost is likely going to be low. When renovating an ancient DIY house, the contractor is going to be pulling asbestos and lead out of the walls, so it gets planned for with a contingency.
No reasonable individual or contractor would spend thousands of dollars renovating a kitchen in a house with no roof, so why invest in a new jQuery-based theme when a project goal may be to migrate to React or Svelte? In the spirit of bringing things up to code, I feel the key to successful technical debt management is to tackle technical debt continuously. Larger foundational items such as changes in frameworks, languages, and tools should be discussed as major projects, but upgrading and patching should be part of a healthy development and maintenance lifecycle. When a service has some major renovation going on, such as adding new features, the developer should take the time to bring the code around that new feature up to code. Migrating from class components to functional components in React? Rarely will it make sense to do that migration all at once. But while a developer is already rewriting and redefining classes in a service, it makes sense to bring them up to code before shipping.
I’m not advocating for free-for-all rewrites, every developer for themselves. What I am saying is that work on every feature and every major bug should include time to consider current best practices and team standards. If the team decides that classes are just fine and never adopts functional components, that’s not debt, that’s a decision. But a developer who goes into a module and finds a major defect such as unsanitized inputs, or a straightforward performance improvement, should be empowered to bring the code up to current standards and to document the debt they paid while working on the original feature.
For me, a great feature of this approach is that the code base is continually improving, no single developer feels burdened with doing nothing but paying technical debt, and the added cost per task is small. Chances are your developers are already paying for debt in some form right now: either quietly making improvements, or working around the debt and potentially incurring even more in the process. When planning, we typically ask for an estimate for the task, which often gets padded out quietly. Instead, explicitly build in a contingency for bringing things up to code. An additional advantage is that, over time, the effort for a task versus the debt paid during that task can be monitored. If the debt ratio is rising, it can be an early indicator that it may be time to look at structural or systemic issues.
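As a rough illustration of that kind of monitoring, here is a minimal sketch. Everything in it is an assumption made for the example: the `TaskRecord` fields, the definition of the debt ratio as debt hours divided by total hours, and the window and tolerance used to decide that the ratio is "rising". A real team would pick whatever units and thresholds fit its own tracking.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskRecord:
    """Hypothetical record of one completed task."""
    name: str
    feature_hours: float  # time spent on the task itself
    debt_hours: float     # time spent bringing surrounding code up to code


def debt_ratio(task: TaskRecord) -> float:
    """Share of a task's total effort that went to paying debt (assumed definition)."""
    total = task.feature_hours + task.debt_hours
    return task.debt_hours / total if total > 0 else 0.0


def is_debt_rising(tasks: List[TaskRecord], window: int = 3,
                   tolerance: float = 0.05) -> bool:
    """Compare the mean ratio of the latest `window` tasks with the window before it.

    Returns False when there is not enough history to compare two full windows.
    """
    if len(tasks) < 2 * window:
        return False
    ratios = [debt_ratio(t) for t in tasks]
    recent = sum(ratios[-window:]) / window
    previous = sum(ratios[-2 * window:-window]) / window
    return recent - previous > tolerance


# Example history, oldest task first (made-up numbers).
history = [
    TaskRecord("login form", 9, 1),
    TaskRecord("export csv", 8, 2),
    TaskRecord("audit log", 9, 1),
    TaskRecord("search filters", 7, 3),
    TaskRecord("billing page", 6, 4),
    TaskRecord("user roles", 5, 5),
]

if is_debt_rising(history):
    print("Debt ratio is trending up; consider a structural review.")
```

Comparing two windows of tasks rather than two individual tasks is a deliberate choice here: a single messy module should not trigger the alarm, but a sustained upward drift should.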
In short, these house renovation analogies can fit the software lifecycle well. They offer developers a deeper level of ownership, and they offer an early warning system for uncontrolled debt growth. A software system, much like a house, needs regular and thorough maintenance of its components. Budgeting for these issues, and correcting them early, helps keep the code base in good condition.