The Firebreak

The website is live. Now would be an excellent time not to immediately break it again.

Go-live day has a very specific energy. There are the final pre-launch checks: DNS propagating, smoke tests passing, someone refreshing the homepage on their phone like it might disappear if they stop looking. Then there's the moment the client says "it's live" in Slack, followed by the gif with the champagne. Everyone types "🎉" in response. You close seventeen browser tabs.

And then, almost immediately, someone opens a ticket.

It starts with the low-grade stuff. A padding issue on mobile that nobody caught in staging because the test devices were wrong. A redirect that was supposed to be a 301 but is doing something weird in Safari. A form confirmation email landing in junk because the SPF record wasn't quite right. Nothing catastrophic. Just the natural static that comes with deploying anything to the actual internet with actual users.

The instinct, in most teams, is to triage it all, close the sprint, and start the next one. There's a backlog. There are features that were descoped to hit the launch date. The client is energised and has Ideas. Momentum feels like the right thing to maintain.

This is where a lot of teams quietly make a mistake.

Foresters have a concept called a firebreak. A strip of land that's been deliberately cleared of vegetation to slow or stop the spread of a wildfire. You don't cut it during the fire. You cut it in advance, during a calm period, because you know the fire is coming and you want to contain the damage when it does.

In software, a firebreak borrows the same idea. It's a short, deliberately protected period after a major release where the team stops building new things and instead deals with everything that the release exposed, deferred, or broke. Not sprint zero. Not a bug bash. A structured pause with a clear purpose: get the new thing properly stable before you start piling new weight on top of it.

The typical firebreak runs a week, sometimes two. Occasionally just three or four days is enough. The length matters less than the protection: it only works if the stakeholders agree that no new feature work starts during this window. That part requires a conversation before launch, not after. "After we go live, we're going to take a week to stabilise before we start the next phase" is a much easier sell than "we need to pause the roadmap because things aren't quite right." The first is professionalism. The second sounds like an apology.

What actually happens during a firebreak depends on what the launch revealed, but there are patterns.

The monitoring always needs attention. You had alerting in place, you think, but now you're watching real traffic and the dashboards look different to how they looked in staging. Error rates that seemed fine at low volume need different thresholds at scale. Slow queries that you knew about but didn't prioritise are now visibly dragging the 95th percentile. You spend time on this because fixing a performance problem you can see is significantly easier than diagnosing one in the early hours of the morning six weeks later.
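As a rough illustration of what "dragging the 95th percentile" means in practice, here is a minimal sketch that computes a nearest-rank percentile over a batch of request latencies and flags a breach. The function names, the 500 ms threshold, and the sample data are all assumptions made up for this example; a real setup would normally let the monitoring tool do this calculation.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def breaches_threshold(latencies_ms, threshold_ms, pct=95):
    """True when the chosen percentile of the latencies exceeds the threshold."""
    return percentile(latencies_ms, pct) > threshold_ms


# Hypothetical data: one slow request in twenty does not move the p95...
mostly_fast = [100] * 19 + [2000]
# ...but two slow requests in twenty do.
two_slow = [100] * 18 + [2000] * 2

print(breaches_threshold(mostly_fast, threshold_ms=500))  # False
print(breaches_threshold(two_slow, threshold_ms=500))     # True
```

The point of the two sample lists is that a threshold tuned against staging traffic can sit quietly until the share of slow requests crosses a small line, which is exactly the kind of thing that only shows up under real load.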

The documentation that was going to be written "after launch" gets written. App passports, deployment runbooks, environment variable inventories, the things that live in someone's head because there was never a good time to get them out. There is now a good time. Use it, because that person is going to leave eventually (everyone does), and "I think James set that up" is not an operational strategy.

Accessibility issues get a proper pass. In the crush to ship, the audit findings from three weeks ago got split into "critical" and "nice to have," and the nice-to-haves are sitting in the backlog. During a firebreak you make actual progress on them rather than letting them drift. This is the right thing to do ethically. It's also the right thing to do because accessibility debt compounds in the same way technical debt does, and nobody enjoys an enforcement notice.

The team holds a proper retrospective, not the fifteen-minute end-of-sprint version but a genuine post-launch review. What worked. What nearly killed us. What we promised ourselves we'd never do again and then did anyway. These conversations are worth having while the launch is fresh and before everyone has moved on to the next thing. The lessons that get captured in the week after a launch are significantly more honest than the ones captured a month later, when the rough edges of memory have been sanded smooth by distance.

Security gets a look. Not a full penetration test necessarily, but someone actually checking whether the production environment has hardened headers, whether debug mode is genuinely off, whether the admin endpoints are properly restricted. Launch day is not the moment for this. The firebreak is.
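The header check in particular is easy to script. The sketch below takes a set of response headers, however you fetched them, and reports which of a baseline list are missing. The baseline itself is an assumption for illustration, not a standard; swap in whatever your own security policy asks for.

```python
# Hypothetical baseline for illustration; adjust to your own policy.
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "Referrer-Policy",
)


def missing_security_headers(response_headers):
    """Return the expected headers that are absent, ignoring case in header names."""
    present = {name.lower() for name in response_headers}
    return [name for name in EXPECTED_HEADERS if name.lower() not in present]


# Example headers as they might come back from a production response.
headers = {
    "content-security-policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
}

print(missing_security_headers(headers))
# ['Strict-Transport-Security', 'Referrer-Policy']
```

This only confirms that a header is present, not that its value is sensible, so treat an empty result as a starting point rather than a clean bill of health.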

It's important to acknowledge that a firebreak isn't rest. The name makes it sound passive. A pause. A gap. But the work is real and it matters. The difference is that you're not building forward; you're consolidating what you have.

In forestry, a firebreak doesn't stop fires happening. It controls where they can go. It limits the damage when things go wrong. And things go wrong, in forests and in production environments, with depressing regularity.

What you're really doing during a firebreak is buying down risk before you start accumulating more of it. Every new feature you add to an unstable foundation makes the foundation harder to fix. Every week that passes without documentation is a week of institutional memory at risk. Every unresolved performance issue is an incident waiting for the right traffic spike.

The confetti has landed. The client is happy. The team deserves a moment to breathe, and then a moment to tidy.

Do the firebreak. Cut the break before the fire, not during it.
