WWCode Talks Tech #11: Digging Ourselves Out of Code Debt
Written by WWCode HQ
Rachel Church, Senior Software Engineer at Airtable, shares “Digging Ourselves Out of Code Debt.” She explains code debt, the impact that it has on long-term productivity, and the importance of addressing and eliminating it.
I want to disambiguate code debt a bit, show some research around it, and discuss some ways to dig ourselves out of it. There is a three-step program to address code debt. The first step is raising awareness of what code debt is, how it happens, and why it's important. The second step is identifying code debt within your own code base. This step is much harder, but we have 30 years' worth of research and books we can lean on. The third step is to manage the debt explicitly by tracking debt-related tasks and addressing issues as part of the normal development cycle. This is not a one-time thing. Addressing code debt is ongoing and needs to become part of the company culture. It's like a garden: if you let the weeds take over to the point that they become a big problem, it's very difficult to have a healthy garden. We want to tend the garden daily or weekly, so that addressing the weeds is part of the process.
The debt metaphor was introduced in 1992 by Ward Cunningham, who is a co-author of the Agile Manifesto. It's amazing that this software engineering term was introduced 30 years ago, but it's something we're still researching and figuring out today. Cunningham stated that although shipping not-quite-right code can speed up development, it's just like a loan, and it incurs interest over time in the form of more bugs, more time fixing obscure issues, and more time required to comprehend the code. The way to pay off the debt is by refactoring the code into a better state.
All code is a liability to some degree and has a cost associated with it. Code will always require some level of maintenance, testing, and documentation. When I say code debt, I'm specifically referring to the additional effort required to accomplish that same work. If you're upgrading a package that is deprecated, that's not code debt. That is the normal maintenance of working in a repo. If the code is structured poorly and it takes two weeks instead of one to upgrade the package, then that extra week is the interest payment on the code debt. At some point, a developer had a short-term need that drove them to structure the code in a way that made it more difficult to upgrade that package later. That choice had long-term costs, which we refer to as code debt.
In 2018, Stripe surveyed more than 1,000 developers and 1,000 C-level executives around the world and found that, on average, about 33% of engineering time is spent addressing technical debt. One-third of all developer time is reportedly spent trudging through difficult-to-work-with code. In the same survey, senior executives reported that one of the biggest financial threats to their business was a lack of developer talent. The report argued that the real problem may not be the number of developers available, but rather how productive those developers are able to be with the time that they have. The developers who responded to this survey were far more aware of the additional time being spent on technical debt than leadership was. This indicates that companies may be blissfully unaware of how much technical debt is really costing them.
Code debt affects the entire company. It's easier to address when everyone is aware that it exists and that it's really a problem. If leadership thinks the problem is a lack of developers, they're just going to keep hiring. That doesn't solve the real problem, which is that code debt is slowing down development and preventing developers from producing their best work. The other cost is motivation and that feeling of dread you get when you're working with code debt. The additional effort needed to complete tasks slowly chips away at a developer's motivation and morale. Code debt makes it harder to produce high-quality software. Code debt has been shown in research to reduce employee motivation, which has long-term ripple effects throughout the code base, even when the developers are not directly working with code that contains code debt. The same study found the reverse to be true as well. Managing code debt as part of the normal development process doesn't just prevent morale loss, it actually boosts it. Removing code debt brings a sense of progress, continuous improvement, and shared learning across the team. It increases motivation. The right culture and preventative mechanisms for code debt reinforce one another. Less time wasted on code debt increases productivity and morale, which then feeds back into a healthy culture.
In 2015, Michele Tufano, Fabio Palomba, and their co-authors conducted an empirical study on the change history of 200 open source projects to answer the question of why and when code debt is created. There are common assumptions that we make about when code debt is introduced. The first one is that code debt is introduced slowly over time as the code base grows. This is false. Developers have a tendency to believe that writing new code from scratch produces clean code, and that it's the other developers who come in later and pile code on top who introduce code debt. Researchers found that code debt is actually more likely to be introduced when writing code from scratch. When there's a clean slate, the author has more flexibility in how the code is architected, which presents more opportunities for design problems.
The second assumption is that code debt is more likely to be introduced right before a deadline. This one is true. Researchers found evidence that in open source projects, developers with a higher workload are more likely to introduce code debt than developers with a lower workload. Between these first two assumptions, we know that proper planning and accounting for the time needed to meet a deadline are critical to preventing code debt.
Our last assumption is that developers who are new to a code base are more likely to create code debt than developers who have experience working in it. This is also false. Researchers found that newer developers are actually less likely to create code debt. This may be because experienced developers tend to perform more complex and critical tasks, which would make their commits more prone to introducing design problems. This means that code debt is everyone's responsibility. It doesn't matter if you're experienced or early in your career, we're all in this mess together.
What happens after code debt is already introduced in the repo? The research seems to suggest that poorly designed code persists in a code base for a long time after being introduced. A 2010 study of the change history of two open source projects found evidence that almost 90% of code debt remains in a project until the very end. This implies that engineers are not spending time removing code debt. The study found that when code debt was removed, it was often not by targeted refactoring, but as a side effect of other code changes. A 2017 study provides supporting evidence for this theory. In their survey, less than 8% of engineers said that they addressed code debt during dedicated refactoring. A 2019 study found that a quarter of code debt is contagious, meaning that it forces developers to add more code debt on top of it. This creates a snowball effect: when code debt exists in a repo, it has a tendency to grow and accumulate until the underlying design issues are addressed. Regardless of preventative measures, there needs to be some kind of explicit management of the debt after it's introduced, so it doesn't grow out of hand.
Have you ever opened up a code file, started poking around, and just had this feeling that something wasn't right? It could have been the structure or how many methods there were. You might not have been able to put your finger on what was wrong. Your intuition might be trying to tell you something. That feeling is what we call a code smell. Code smells aren't inherently negative, they're just sniffable hints that there might be a problem. It could also mean that you just stumbled upon code in a style that you're not familiar with. If you were to investigate that code smell, you might be able to identify an underlying anti-pattern. An anti-pattern is a common pattern that is inefficient or counterproductive and proven to be bad. These two concepts often get conflated, but they both contribute to code debt.
I love code smells. Developers at all experience levels can detect them, even if they don't have the knowledge to evaluate whether they're a real problem or not. They spark discussion, investigation, and team learning. Regardless of whether a code smell turns out to be a real issue, an anti-pattern, or just a style of code you haven't seen before, the correct response to detecting a code smell is curiosity. There's an explicit and identifiable list of named code smells. In their book on refactoring, Martin Fowler and Kent Beck introduced 22 distinct code smells. The developer community has continued to add a few code smells and group them, but the original 22 smells are still very applicable today. Developers can and should use code smells as indicators to help trace code debt and detect issues before they snowball.
Research shows that code smells are a good indicator of refactoring opportunities. Files with code smells have a 35% higher hazard rate than files with no code smells. The second step of addressing code debt is identifying the code debt itself within your repo. One objective way to identify it is with the use of code smells. If you're able to disambiguate what in the code could be improved, and you can share documentation or research on how it could contribute to bugs, cognitive load, lost productivity, and so on, it becomes much easier to have a discussion and prioritize that work as a team without making anyone feel bad. For each of these code smells, you can look up what they are and why they become a problem. I really like the website refactoring.guru. It's based on Fowler's book on refactoring, and it provides very clear and concise descriptions of each of these code smells. When I ask which code smells are most concerning, developers typically point at code smells that are related to size and complexity. I bet if you found a file in your code base that has an above-average number of lines, you could probably find an example of at least one of the bloater code smells.
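One rough way to act on that last hint is to list the files that are longer than average and treat them as places to look first. The sketch below is only an illustration, not a tool from the talk: the function names, the `.py` suffix, and the `factor` knob are all assumptions, and a long file is a prompt for curiosity rather than proof of a bloater.

```python
from pathlib import Path
from statistics import mean


def count_lines(root: str, suffix: str = ".py") -> dict[str, int]:
    """Map each source file under `root` to its line count (hypothetical helper)."""
    counts = {}
    for path in Path(root).rglob(f"*{suffix}"):
        with path.open(encoding="utf-8", errors="ignore") as handle:
            counts[str(path)] = sum(1 for _ in handle)
    return counts


def flag_long_files(line_counts: dict[str, int], factor: float = 1.0) -> list[str]:
    """Return files longer than `factor` times the mean length, longest first.

    `factor` is an assumed tuning knob: 1.0 means "above average".
    """
    if not line_counts:
        return []
    threshold = factor * mean(line_counts.values())
    flagged = [name for name, lines in line_counts.items() if lines > threshold]
    return sorted(flagged, key=lambda name: line_counts[name], reverse=True)
```

Calling `flag_long_files(count_lines("src"))` on a hypothetical `src` directory gives a short list of candidates to review as a team.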
The last code smell I want to talk about is comments. It's easy to recognize, but it's very counter-intuitive, misunderstood, and often ignored. Developers often overlook it, despite the fact that research has shown that it has a high correlation with bugs in code. Think about the last time you wrote a really nice, detailed comment in your code. Why did you create that comment? Most likely, you had really good intentions and you realized the code wasn't intuitive or obvious. You tried to improve the readability by leaving a comment to explain what was happening, or how to use the method. Comments can mask the smell of complex or low-quality code. This doesn't mean don't write comments, but comments should explain why something exists, not the what or how. Good code is self-documenting: it is written to be read by other humans, not just the computer.
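A small before-and-after sketch of that idea follows. Everything in it is invented for illustration (the function names, the dictionary keys, the age threshold, and the stated reason in the second comment); it only shows the shape of moving the "what" into names and keeping the comment for the "why".

```python
# Before: the comment explains *what* the code does, because the names do not.
def calc(u):
    # check if user is over 18 and has verified email
    return u["a"] >= 18 and u["v"]


# Hypothetical threshold, named so the number no longer needs explaining.
MINIMUM_SIGNUP_AGE = 18


# After: the names carry the "what"; the one remaining comment records the "why".
def can_create_account(user: dict) -> bool:
    is_adult = user["age"] >= MINIMUM_SIGNUP_AGE
    # Assumed business reason: password resets rely on a working email address.
    return is_adult and user["email_verified"]
```

Both functions return the same answer for equivalent input, but only the second can be understood without the comment.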
Martin Fowler's book on refactoring is a recipe book. Refactoring is the process of changing the structure of code without changing its behavior. The book catalogs the different refactoring techniques you can apply to fix code smells. Modern IDEs such as VS Code and IntelliJ have built-in tooling for some refactorings that will do a lot of that work for you. Martin Fowler's website refactoring.com lists all of the different refactoring techniques.
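As a minimal sketch of what "changing structure without changing behavior" looks like, here is one technique from that catalog, commonly called Extract Function (or Extract Method). The invoice example and all names in it are made up for illustration; amounts are integer cents so the two versions can be compared exactly.

```python
def invoice_total_before(items, tax_percent):
    """One function doing two jobs: summing line items and applying tax."""
    total = 0
    for unit_price_cents, quantity in items:
        total += unit_price_cents * quantity
    total = total + total * tax_percent // 100
    return total


def subtotal(items):
    """Extracted step: sum of unit price times quantity, in cents."""
    return sum(unit_price_cents * quantity for unit_price_cents, quantity in items)


def add_tax(amount_cents, tax_percent):
    """Extracted step: add tax, rounding the tax down to a whole cent."""
    return amount_cents + amount_cents * tax_percent // 100


def invoice_total_after(items, tax_percent):
    """Same behavior as before, now composed from two named steps."""
    return add_tax(subtotal(items), tax_percent)
```

A quick check such as `invoice_total_before(items, 8) == invoice_total_after(items, 8)` for a few sample inputs is the kind of safety net that makes a refactoring like this low risk.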
According to the Stripe survey, code debt and maintaining legacy systems are some of the main things that hinder developer productivity. But among the developers and executives who responded, only a small percentage said their company had a process in place for managing code debt. Just like financial debt, reducing code debt requires a plan and a schedule. It doesn't work if you try to do it all at once. If you know that code debt is going to occur naturally, then it makes sense that tracking it also needs to become a part of the natural process.
A 2020 study looked at the amount of time it took engineers to add a new feature or remove a bug inside code that contained code debt. Then they compared that to the amount of time it took to perform the exact same task when the code was refactored first to remove that code debt. They found that, on average, it took less time when the code was refactored first, before adding the new feature or removing the bug. If there's a part of the code base that is stable and doesn't get touched very often, then there's no value in putting that code debt at the top of your backlog right away. The best time to address code debt is before adding new features on top of it. What about code debt projects that are really large and take a lot of time to fix? When executives hear that the engineering team needs a month or more to refactor something, they might just see a big time sink that isn't being spent on feature development. You really do need high-level leadership support to put time into a larger refactoring effort.
You need to align the entire organization on the shared long-term value of that work. It's an all-hands-on-deck problem, and solving it means a culture shift. The process of aligning an entire team or an entire organization is itself a shift in culture. It's hard for leadership to understand what code debt means if they're not technical, but everyone can understand what continuous product health is. At Airtable, we have something very similar, which we call engineering excellence. This is our version of continuous product health. Everyone in the company is already oriented around building great products and improving the customer experience. We just need to draw the connection between code debt and how we meet those goals. We define success as engineers feeling empowered to invest time in engineering excellence work, and leadership understanding the implications of code debt and the value in prioritizing it alongside feature development and other types of work.