Picture this scenario. You're creating a software system and you need to make a choice. You know that you could write a piece of code and get it done by the end of the day, but that would result in a sloppy implementation. You know if you spend another day that you could write a better version. However, you have a deadline looming. You decide to write the sloppy version of the code and fix it later.
Engineers use the metaphor of "debt" to describe this scenario because you are telling yourself you will "pay it back" by fixing it later. In this scenario, it would take another day to write the better version of the code, so you incur a day of debt. While it's measured in time, it's more accurate to say what we are really measuring is the effort it would take. Effort is the currency of technical debt.
Technical Debt Definition V1: Technical debt is the effort it would take to undo shortcuts we have taken.
This sounds like a decent definition. If you're an engineer, you may have heard a definition like this before. Is it accurate though?
Technical debt isn't always a deliberate choice. In the scenario I described above, you knew you could write a better version of some piece of code if given one more day. What if you wrote a piece of code and only discovered later that there was a better way? The outcome is the same: you know that with one day of effort, you could write a better version of the code. The only difference is that the debt wasn't intentional.
So far, we have defined "deliberate debt" and "accidental debt." There are other ways we can get ourselves into debt as well.
Just like its financial counterpart, technical debt can also be inherited. You can take ownership of a system and find debt that someone else created.
Another type of technical debt occurs when a system was well designed but something changes. You may have done something the best way you knew how, or the best way anyone knew, but technology evolved and a new technique emerged. That results in the same kind of gap created by a rush job or by not knowing the optimal implementation.
What if a system was well designed for a given circumstance, but something about that circumstance changed? Let's say you designed a system meant to handle 100 users, but now it must handle 100,000 users. The decisions you made could be completely altered by scale. To give engineers a more specific example, think about the difference between caching something in the application context, in memory, or in an external cache, depending on how much needs to be cached.
I'm sure you can think of even more examples. Here are a few I've received:
- How testable is your system? Is it difficult to execute some test scenarios? Do you struggle to reproduce production issues in other environments? You are in technical debt.
- Do you spend a lot of time on manual, repetitive tasks that could be automated? You are in technical debt.
A more encompassing definition of tech debt
Let's update our definition to be more agnostic to the type of debt.
Technical Debt Definition V2: Technical debt is the gap, measured in effort, between our current code and its ideal state.
Maybe you've already noticed another issue. So far, we've talked about code, but technical debt isn't limited to just code. We make decisions about infrastructure and design as well. Imagine a scenario where we use an outdated component for expedience.
Another type of debt relates to documentation. Any time you skip documentation because that is easier in the short term, even though the documentation would help in the long term, you take on technical debt.
Missing controls, alerts, and monitoring can also be technical debt.
So, let's not limit ourselves to code and broaden our definition further.
Technical Debt Definition V3: Technical debt is the gap, measured in effort, between our current system and its ideal state.
How much technical debt is too much?
The definition above raises a question. What is ideal?
I've heard writers say that you never finish a book, you just decide (eventually) that it's good enough to show other people. Likewise, we could spend a lot of time optimizing code, but we'll eventually run into diminishing returns. The effort to optimize will outweigh the benefits. This means some small amount of technical debt is okay. It can even be okay to add technical debt temporarily.
To answer "what is ideal," consider what problems technical debt causes in the first place.
Project managers sometimes break down concerns into three categories: cost, time (or effort), and quality. Technical debt impacts all of these. The primary impact comes from complexity in software and architecture. That complexity makes it less efficient to introduce changes, which increases the effort it takes to make enhancements.
Most engineers have run into a situation like this: you need to make a change but the existing code is difficult to understand, documentation is lacking, or technology choices limit your ability to use new techniques. Making a change on an outdated platform can cost time as engineers have to learn or refresh outdated skillsets. Monitoring and controls are more cumbersome to implement on an outdated system.
Systems with high technical debt are more prone to defects and errors. More person-hours may be spent on defect resolution or system maintenance.
Technical debt also introduces risk. If platforms and libraries aren't kept up-to-date, security vulnerabilities can emerge. The inability to leverage the latest monitoring patterns results in observability gaps, preventing a proactive response.
A less obvious impact is the effect on engineers. Talented engineers don't like working on debt-ridden systems. This impacts a company's ability to recruit and retain a skilled workforce. Ultimately, technical debt is risky and has a negative impact on delivery lead time, quality, and cost, both directly and indirectly.
What to do about technical debt?
Let's look at the four steps you can take to address technical debt.
1. Continually evaluate the effort to repay debt
It's common to suggest a percentage of capacity be allocated to address technical debt (the bucket approach), and sometimes this works. A situation where this might be okay is a team focused on greenfield work (brand new development, without having to maintain existing systems).
However, my advice would be to continually evaluate the amount of effort spent on "paying back" technical debt based on the prevalence of the problems that the debt causes.
Using the financial metaphor, this is just like saying that being deeper in debt should result in more time and effort to get out of debt. Think of it the other way around. Would it make sense to advise someone to spend 10% of their income on paying down financial debt no matter how much debt they had? For some, that would be too much. For others, that would be too little. We need to calibrate paying technical debt in the same way.
I'm not going to give an oversimplified percentage. Rather, try to think of a spectrum along these lines:
|Impact on cost, quality, effort, or risk|How much time should be spent paying down debt?|
|---|---|
|Severe|Majority of effort|
|Very Low|Little to no effort; focus on not adding new debt|
2. Review metrics to determine the impact level of paying down debt
Start with metrics. A fairly industry-standard way to measure delivery efficiency is DORA's four key metrics: deployment frequency, lead time for changes, change failure rate, and time to restore service. The question, though, is how much technical debt is impacting those metrics.
Here is something you can try: ask your engineers to rate how much debt is impacting these outcomes. Make sure they consider all types of debt and the impact on each of these categories. A simple method for this would be to ask engineers to rate each outcome on a scale of 1 to 10, where 1 is no impact from technical debt and 10 is a high impact from technical debt. Use those ratings to start a conversation about the right amount of effort. This discussion can result in further revelations. Repeat this exercise on a regular basis. This is just one method to approach paying down debt. You could also do a Likert survey, or have technical leaders (like architects and tech anchors) lead a conversation on the topic. The most important things are continuous evaluation and open discussion.
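As a rough illustration of the rating exercise, here is a minimal Python sketch that averages the ratings for each outcome and lists the outcomes from most to least impacted. The outcome names come from the table above; the scores and the `summarize` helper are made up for illustration and are not real survey data.

```python
from statistics import mean

# Hypothetical survey responses: each engineer rates how much technical debt
# affects each outcome, from 1 (no impact) to 10 (high impact).
ratings = {
    "cost": [3, 4, 5],
    "quality": [7, 8, 6],
    "effort": [8, 9, 7],
    "risk": [2, 3, 4],
}


def summarize(ratings):
    """Return (outcome, average rating) pairs, highest average first."""
    averages = {outcome: mean(scores) for outcome, scores in ratings.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


for outcome, average in summarize(ratings):
    print(f"{outcome}: {average:.1f}")
```

The averages are only a conversation starter. A wide spread of scores within one outcome is often worth discussing as much as a high average.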
3. Track your technical debt
To determine your impact level, you must keep an account of all your debt. It's easy to forget particular items if they aren't tracked. While there are always some unknowns around technical debt, you can only quantify what you do know if you are tracking it. The sum of effort hours can be a good high-level way to gauge the amount of technical debt a given system has. For example, if you identify one issue that would take two days to resolve, and another that would take three days to resolve, you can say you have identified five days of technical debt.
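The arithmetic in that example can be sketched as a tiny debt register in Python. The `DebtItem` class and the placeholder descriptions are assumptions for illustration; in practice the entries would come from wherever you track your debt.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    description: str
    effort_days: float  # estimated effort to resolve this item


def total_debt_days(items):
    """Sum the estimated effort across all tracked debt items."""
    return sum(item.effort_days for item in items)


# Placeholder entries matching the two-day and three-day issues above.
register = [
    DebtItem("First identified issue", 2),
    DebtItem("Second identified issue", 3),
]

print(total_debt_days(register))  # 5
```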
Tracking technical debt allows you to prioritize. You can elevate the most impactful debt to be addressed sooner. Going back to our financial metaphor again, this is like prioritizing paying down high-interest debt before low-interest debt. You get more value from addressing the most impactful debt.
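One possible way to express this prioritization is sketched below in Python. It assumes each tracked item carries a hypothetical 1 to 10 impact rating (the "interest rate" in the metaphor) alongside its effort estimate; the `RatedDebtItem` class, the sample entries, and the tie-breaking rule are illustrative choices, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class RatedDebtItem:
    description: str
    effort_days: float  # estimated effort to resolve
    impact: int         # hypothetical rating: 1 (low) to 10 (high)


def prioritize(items):
    """Order items by highest impact first; break ties by lowest effort."""
    return sorted(items, key=lambda item: (-item.impact, item.effort_days))


# Made-up entries for illustration only.
backlog = [
    RatedDebtItem("Missing documentation", 1, 3),
    RatedDebtItem("Outdated library", 5, 9),
    RatedDebtItem("Missing alerts", 2, 9),
]

for item in prioritize(backlog):
    print(item.description, item.impact, item.effort_days)
```

With these sample values, the two high-impact items come first, and the cheaper of the two is listed ahead of the more expensive one.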
I'm an advocate for tracking these items as close to the code as possible. For example, if you use GitHub, technical debt can be tracked as GitHub issues. This ensures there is a connection between the code and debt tracking. If a system changes hands, nothing is lost.
There are also tools available that automatically detect technical debt and track it. It's important to note these tools can't detect all types of debt, but they are quite valuable for detecting outdated libraries, common anti-patterns in code, and some other issues. There are tools built into IDEs that analyze your code for the same patterns.
4. Avoid technical debt to begin with
Technical debt has one more thing in common with financial debt. It's best avoided in the first place. The effort to pay down technical debt is almost always greater than the effort to avoid creating it. This is more difficult in some situations than others. Inherited debt is difficult to influence. However, taking the extra time to implement a better solution (when possible) can avoid the negative impacts and eventual cost to repay.
Debt can be mitigated using advanced software engineering practices. Pair programming can help avoid situations where debt is created unintentionally. Two sets of eyes can catch issues that one would miss. Test-driven development, where tests are created prior to code, can avoid debt caused by a lack of automated testing. Engineers should be encouraged to adopt a mindset of refactoring and iterative improvement to reduce debt sooner.
Product-oriented teams gain expertise in the systems they own. That familiarity will help avoid debt simply because it empowers engineers to make more informed choices. Ownership also provides a sense of investment. When a team owns a system, they take more care of that system. Have you ever heard someone describe a system or project as their "baby"? That should tell you how powerful this sentiment can be.
Avoid adding technical debt to a system whenever possible. The buildup of technical debt can impede delivery, reduce quality, and introduce unnecessary risk to a system. This isn't limited to just code. Continuously keep track of technical debt and evaluate how it is impacting these outcomes on your team. Adjust the amount of time you spend on paying down technical debt with the goal of avoiding these outcomes.