A philosophy of technical debt management

Published in

YounitedTech

7 min read · Jun 24, 2020

“We don’t have time to refactor.”
“Our managers won’t let us implement automated tests!”
“This app is really old. No one knows how it works since Peter left the company 3 years ago, and people are scared to make changes to it.”
“This piece of software is buggy. We know it, but don’t have time to invest in it. So we just have a manual process to work around the bug when it happens.”
“Nope. I’m not working on that crappy piece of sh*t.”

It’s safe to say that every software engineer has heard one or more of these lines during their career, and has most probably said something similar. We know we have. Numerous times.

Well, we decided to tackle this problem head-on. At Younited, we like to think of ourselves as a tech company for 2 reasons: we only do business online, and for the last 3 years, we have been selling software capabilities to customers and partners. So with time, we have also become a software vendor.

Software gardening

In our job, today, software is not only a tool fueling our business and growth. It has actually become a financial asset whose value we estimate and promote. And a financial asset is definitely something you want to take care of.

Now and again, we stumble upon an analogy for software development put forward by Hunt and Thomas in the classic The Pragmatic Programmer: gardening. Quite a few people have talked about this over the years, but the analogy never really became mainstream. Still, it offers interesting food for thought, including thoughts about dynamic context. Just as plants live in an ecosystem, software doesn’t live in a static world. Context evolves: business goals, dependencies, practices, tools, etc. With time, perfectly good decisions can become invalid. And most importantly, the job is not over once the software is shipped. If you don’t look after a tree properly, it could bring down a wall, whereas things could have been fine if you had just pruned it.

In this article and the next one, we’ll be talking about how we define debt, how we locate it and how we manage it. We’re used to having real, sometimes lively, discussions with our stakeholders and management. But for this topic, we really believed that we needed our top management on board. Not only because all this is expensive, but also because our software isn’t just our Tech team’s concern. It’s a company-wide concern.

Debt is a company-wide concern

We strongly believe this. But there are a few prerequisites which can be difficult to put into place. First and foremost, balanced dialog between the Tech team and its stakeholders must be part of the culture. Debt is a topic which must of course be discussed inside the tech teams, but not only there. And it is definitely not something you want to sweep under the carpet. This must be a primary concern during recruitment processes.

The next step is to make middle and top management aware of these topics. Hence these articles.

What do we call debt?

The term was coined by Ward Cunningham nearly 30 years ago now, as the XP movement was about to emerge. He revisited it in a short video in 2009.

What we call debt is the distance between the system we have and the expected system.

You definitely don’t want that arrow to be too wide

The expected system is the one that we, the members of the Tech team, believe is the best to fulfill today’s and tomorrow’s needs. Two things are actually hidden behind tomorrow’s needs:

First, the system we build today needs to continue to work tomorrow. This means that performance and maintainability don’t degrade with time.
Secondly, we don’t want each new feature we add to be more and more expensive to implement. This implies that we need to ensure that essential and accidental complexity are under control. We build for change.

Although we don’t have a crystal ball, we can make educated guesses about what those needs could be. It is crucial to set aside thinking time to work on these assumptions. We like making forecasts based on the current trends and testing them by throwing in hypotheses. The important thing here is that the expected system is not just some mythical form of perfection from a nerdy dream. It’s simply the system we need.

Our debt takes multiple forms: code, architecture, automated testing, infrastructure, security.

Technical debt

We have purely technical debt. Sometimes it is the result of a decision to move faster. We prioritized time to market over perfection, which can be an excellent decision. At other times, technical debt is the result of good past decisions made in a context that has since changed. This requires re-evaluation. And lastly, however educated our guesses are, they still remain guesses. Sometimes we’re wrong. Leaps of faith don’t always lead to graceful landings, as we cannot guarantee that we always make the right decision. That is why we take into account the cost of being wrong. For instance, how much would it cost us to roll back this shiny new piece of tech to something we have learned to master over the years? Besides, we believe that learning from our mistakes and adapting well is more important than striving for unobtainable 100% straight A’s.
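One possible way to put a number on "the cost of being wrong" is a simple expected-cost estimate. The sketch below is only an illustration of that reasoning: the function name and the figures are hypothetical, and it assumes we can roughly estimate both the chance that a bet fails and the price of rolling it back.

```python
def expected_cost_of_being_wrong(probability_wrong, rollback_cost):
    """Expected cost of a technical bet that may have to be rolled back.

    probability_wrong: our own rough guess, between 0 and 1 (an assumption).
    rollback_cost: estimated cost of going back to tech we already master,
                   in any unit (days, money); also an assumption.
    """
    if not 0 <= probability_wrong <= 1:
        raise ValueError("probability_wrong must be between 0 and 1")
    return probability_wrong * rollback_cost


# Illustrative numbers only: a 25% chance of being wrong,
# and a rollback estimated at 40 person-days.
print(expected_cost_of_being_wrong(0.25, 40))  # 10.0
```

A bet with a cheap rollback can tolerate a much higher chance of being wrong than one that would be expensive to undo.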

Functional debt

Functional debt is an interesting beast we don’t often talk about. Yet, it can slap you in the face really hard. Or bite your ankle, it depends. We have often fallen into the trap of the eternal MVP. We built a cool MVP, full of promise, that we quickly shipped into production to test an idea. Of course it was only the first step of a long and prosperous journey… except that we just moved on to something else. We have so much to do.

Another symptom is when you get better and better at building similar products. Each one is better than the previous one, which is a good thing. BUT no one wants to pay to bring the previous ones up to date, and you find yourself with several generations of similar components in your code base. Oh boy, have we suffered from this.

Side effects and our dearest friend Damocles

Now that we’ve seen what we refer to as debt, let’s get an overview of what could happen if we didn’t look after it.

Debt induces a higher level of risk. Every code change can introduce regressions or unexpected side effects.
Loss of efficiency: it gets more and more complicated and expensive to make the system run and evolve.
Missed opportunities: debt slows the company down, stopping us from signing deals or building better products than our competitors because we can’t adapt fast or well enough.
Technical breakdown: service interruption, bugs we can’t fix, features we can’t add.

The idea here isn’t to cry wolf and scare people for the sake of scaring them. These things happen all the time. In our previous experiences, and simply by listening to what other folks have to say, we have seen what damage an over-indebted system can do. And this is not the kind of risk we want to take.

The physics of debt

1. It is important to learn to live with debt. Being in debt is normal. And fine too! Not being in debt could mean 3 things:

You’re not actually measuring it. Ignorance is bliss.
You’ve heavily over-invested in the past and have systematically chosen perfection over speed. But being fast and buying time is crucial for any company. It can mean reaching and testing the market before a competitor. Or testing a hypothesis before throwing yourself into a battle. This is something most good tech companies today have in common: they are fast.
Your domain is mature and stable. You don’t need to be fast anymore. Stability and predictability are more important than speed. This is far from being the case in our business. Of course, that doesn't mean that stability and predictability aren’t important. But speed is a very important parameter in all of our equations.

2. Debt grows. Because debt is the distance between your current and expected systems, its growth is a natural phenomenon. On one hand, expectations tend to grow naturally: each step we take is a milestone on a path that never ends. On the other hand, software requires maintenance just to stay alive: updating libraries and frameworks, fixing bugs, tweaking performance, etc. Time works against you. If you don’t look after debt, debt will look after you. Just like in gardening, you have to pull out some weeds, prune trees, water your flowers when it’s hot, and so on.

3. An indebted system is costly. Bugs are more complicated to fix, and changes require more time and more skill. Of course Martin Fowler has a bliki page on this. And this can lead to sclerosis. At some point, the system becomes too complex and risky to change, and this is when we see these massive system redesign projects pop up. At this point, you’re paying more interest on your debt than actual capital. A refactoring a day keeps the redesign away.

4. If you start rewriting a part of your system, don’t stop halfway. During this transition period, you’ll be paying for 2 systems that work in parallel PLUS, in most cases, a complex mechanism to synchronize both. This does not mean you shouldn’t do the rewrite. It simply means that the migration from the source to the target should happen over a short period of time.
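To make the fourth point concrete, here is a minimal sketch of the arithmetic. It assumes flat, hypothetical monthly costs (none of these figures come from our systems, and the function name is ours for illustration): while the old and new systems run side by side, every month of transition costs the sum of both, plus the synchronization mechanism.

```python
def transition_cost(months, old_monthly, new_monthly, sync_monthly):
    """Total cost of running two systems in parallel during a rewrite.

    All parameters are hypothetical illustration values, in any cost unit.
    Assumes the monthly costs stay flat for the whole transition.
    """
    return months * (old_monthly + new_monthly + sync_monthly)


# Illustrative numbers only: the same rewrite, migrated quickly or dragged out.
short_migration = transition_cost(months=3, old_monthly=10, new_monthly=8, sync_monthly=4)
long_migration = transition_cost(months=12, old_monthly=10, new_monthly=8, sync_monthly=4)

print(short_migration)  # 66
print(long_migration)   # 264
```

The rewrite itself is the same in both cases. Only the length of the transition changes, and with it the bill for keeping two systems and their synchronization alive.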

So this is what we call debt. We hope this gave you a feel for the philosophy behind the way we tackle the problem. In the next article, we’ll dive deep into the methodology we used to estimate costs and how we intend to reduce our debt to an optimal point. And we did get our top management on board.