The Reluctant Guide to Shopify Migrations

Act VI: The Other Shore

Cutover

The Sunday before launch, I am making lemon pasta. I have made this pasta most Sundays for years—a lemon, a little of its zest, butter going to silk in the pan, the starchy water doing the emulsifying that the cookbooks make sound harder than it is. It is the most ordinary thing I do all week. Tonight I am making it slowly. I am paying attention to it the way you pay attention to a thing when the rest of your attention is somewhere it cannot help. There is a glass of wine poured on the counter. I poured it out of habit and then looked at it and understood I would not be drinking it tonight, and I leave it there anyway, because pouring it out feels like an admission and drinking it feels like a worse one. The runbook is next to the stove. It is forty pages. I printed it because I wanted to hold it, which is not a reason I would say out loud to anyone on the team.

I know the runbook. I wrote half of it and reviewed the other half twice. I am not reading it tonight to learn anything. I am keeping it near the food because tomorrow a very large number of things I have rehearsed will happen in an order I have rehearsed, and some unrehearsed thing will happen too, and the lemon pasta is the last fully knowable event between now and the other side of that. So I let it take longer than it needs to. The pasta is not relevant. That is exactly why it is in this chapter.

A single full wine glass resting on a bare kitchen counter at night, untouched, with condensation on the outside and a warm glow from off-frame illuminating it from the side.
Poured out of habit. Left there out of honesty.

A cutover is the most rehearsed moment in an entire migration and still the one most likely to surprise you, and those two facts are not in tension—they are the same fact wearing two coats. You rehearse it precisely because you know it will surprise you. The rehearsal is not how you eliminate the surprise. The rehearsal is how you earn the right to be surprised by something small instead of something that ends the launch. The teams that land a cutover cleanly are almost never the ones with the most sophisticated tooling, the prettiest dashboards, the automated everything. They are the ones who treated the rehearsal as the actual work and not as a box, and who walked in having already made peace with the part nobody likes to say out loud: that rollback, in practice, is usually a story you tell yourself rather than a thing you can do.

So let us take the rehearsal seriously first, because it is the part with the most leverage and the least glamour.


Run the cutover end to end, in staging, before you run it for real. Then run it again. Not the happy path—the whole path, in sequence, on the clock, with the people who will actually be awake doing the things they will actually do, reading from the runbook they will actually hold. A dress rehearsal that only proves the steps you were already confident about has told you nothing you didn’t know. The dress rehearsal that earns its keep is the one that breaks. The DNS change that propagates slower than the spreadsheet promised. The data export that takes ninety minutes when the plan budgeted forty-five. The third-party app that needs to be reauthorized against the new store and chooses the rehearsal to discover that the OAuth flow throws an error nobody has seen since the proof of concept. You want those discoveries on a Tuesday afternoon two weeks out, with coffee and no customers, and not at hour three of the real thing with the war room quiet in the particular way a war room gets quiet when something is wrong and nobody has named it yet.

There is a reason it has to be the whole path and not a sampled subset, and it is the same reason the previous chapter spent so long on: staging is lying to you about something, and you do not know what. A full rehearsal is the only honest test of how much it is lying. If the rehearsal cutover produces a store that behaves, you have evidence. If you only rehearse the steps you understood, you have rehearsed your understanding, which was never the thing at risk.


Now the part that should be printed in a slightly larger font and is not, because the production budget for this book is a fiction. Rollback is often not real.

The plan will have a rollback section. It should. Write it, version it, keep it. But be honest about what it actually buys you, because the word rollback carries a promise it usually cannot keep, and the promise is roughly this: if it goes wrong, we put it back the way it was. The trouble is the clock. The moment the new store starts taking real orders—the first customer who checks out on the new platform—the two systems have diverged, and they keep diverging every minute you stay live. The old system does not know about that order. It does not know about the inventory that order decremented, the loyalty points it accrued, the subscription it created, the address it validated, the fraud check it passed. Roll back to the old system an hour later and you are not restoring a clean state. You are abandoning an hour of real commerce that exists only on the system you just turned off, and now you have to reconcile it by hand, under pressure, while the thing that made you want to roll back in the first place is still on fire.

This is the catechism the book keeps returning to, wearing yet another costume: migrate the assumptions, not just the code. The rollback plan migrates an assumption—that the old state and the new state are two save points you can flip between—and that assumption was true right up until the first order, and false ever after. So plan as though you cannot roll back. Not because rollback is literally impossible in every case; for some narrow failures, caught in the first minutes before real volume, it genuinely is your move, and you should keep it sharp. But treat it as a fire extinguisher, not a parachute. A parachute is your plan for the fall. A fire extinguisher is for the small fire you catch early, and everyone in the room knows that past a certain size you are not extinguishing anything—you are evacuating. Do not say impossible when you mean ruinously expensive and slow, and do not say we can always roll back when you mean we can, for about eleven minutes, after which the word stops meaning anything. The teams that get burned are the ones who let rollback sit in the plan as a comfort object, unexamined, so that nobody priced what it would actually cost to use it until the moment they reached for it and found it heavier than they remembered.

A red fire extinguisher mounted on a blank wall with its safety pin intact, and a parachute harness in a pile on the floor below it, the two objects side by side but not touching.
One of these is for the fall. One is for the small fire you catch early. The plan usually confuses them.

Which brings us to the runbook, and to why a person prints forty pages and keeps them by the stove.

The runbook is the artifact, and its strange property is that it exists for a moment when no one will read it. On the day, you will not be flipping to page thirty-one to find out what comes next; you will know what comes next, because you rehearsed it twice and you have lived inside this sequence for weeks. The runbook is not a set of instructions for the people running the cutover. It is a promise to the people who are not. It is for the person who wakes at three in the morning two days before launch with their heart going, certain that something has been forgotten, and who needs—not wants, needs—to be able to open a document and see that somebody wrote it down. Every step. Who does it, when, what done looks like, what to do if it isn’t. The runbook converts a private terror into a shared object you can point at. Half its value is delivered before launch day, in the sleep it gives back to the people who would otherwise lie awake auditing the plan in their heads. Write it for them. Write it as though the person reading it is frightened and competent and alone at three in the morning, because at some point that person is real, and on a good migration that person is you.


On the day itself, the discipline is communication, and the rule is smaller and stranger than people expect. One channel. One. Not a channel for engineering and a channel for stakeholders and a side thread for the vendor and a DM chain that three people are quietly running in parallel—one channel, where everything is posted, and a single person whose only job is to post in it. And the cadence: every fifteen minutes, on the clock, whether or not there is news. Especially when there is no news.

This sounds like overkill until you have sat in the silence. When the channel goes quiet during a cutover, every person watching it fills the quiet with their own worst guess, and they fill it differently, and within twenty minutes you have a finance lead who thinks orders are down, a CS manager spinning up a crisis bridge, and an engineer who assumed someone else was handling the DNS because the channel implied a calm that was actually just absence. No news posted every fifteen minutes is not noise. It is the load-bearing message. DNS cutover complete, propagation in progress, nothing actionable, next update at 14:45. It costs the poster eight seconds and it buys everyone else the single most valuable thing you can give a nervous stakeholder, which is the knowledge that someone is awake at the controls and the silence is the good kind. Silence is information. Make sure it is information you sent on purpose, and not information your audience invented.


And here is the part the engineers tend to underweight, so we will overweight it on purpose: the people who matter most during a cutover are usually the people not in the war room.

The war room is full of the people who built the thing. They will be fine. They speak the language, they can read the dashboards, they know that the sync queue is backed up is a Tuesday and not an emergency. The people who are not fine are the customer service agent who just got a ticket from a confused customer and does not know whether to apologize or reassure, the ops lead watching the warehouse integration for the first order to flow through, the finance person whose entire relationship to launch day is a number that is supposed to be going up. They need to know what is happening. They need it in language that has had every instance of sync queue and propagation and idempotency key surgically removed. The new store is live. Orders are flowing. If a customer says their old account doesn’t work, here is the one sentence you say to them. We expect search to look a little thin for the first hour; that is known and it is fixing itself. Write that comms packet before the day, hand it to the people on the edges of the event, and you will have prevented more launch-day damage than any amount of additional monitoring, because the failure mode of a cutover is rarely just technical. It is technical-plus-someone-on-the-front-line-improvising-under-stress, and the improvisation is the part you can prevent for free.

A long corridor viewed from one end, with a closed door at the far end leaking warm light underneath it, and a solitary figure standing in the hallway with their back to the viewer, outside the room and waiting.
The war room is full of people who will be fine. The hallway is full of everyone else.

One more thing about the days before, because it is the thing that quietly sinks more cutovers than DNS ever has. In the final week, someone will want to ship one more thing.

It will be reasonable. It always is. A small copy fix. A tiny tweak to the PDP. A quick addition to the shipping rules because a buyer asked nicely. The person asking is not being foolish; the change really is small, and in any normal week you would ship it without a second thought. But the week before a cutover is not a normal week. It is the week when your staging store and your production target are supposed to be holding still long enough for the rehearsal to mean something, and every change you let through is a change the rehearsal did not cover and the runbook does not mention. So you declare a freeze, and you hold it, and you absorb the small, real, accumulating annoyance of telling good people no, not this week—and you give them somewhere for the no to go, which is the deferred-changes backlog, the list of everything that is genuinely fine and genuinely waiting and will absolutely get done starting the week after launch. The freeze is not a denial of those changes. It is a queue. People will accept almost any no if it comes with a when.


So the cutover comes, and it mostly goes the way you rehearsed, and the unrehearsed thing turns out to be small because you spent your surprises in staging where they were cheap. The orders flow. The channel fills with fifteen-minute heartbeats. The person on the plane lands and flips the key. And at some point in the late afternoon there is a quiet that is the good kind, the kind you sent on purpose, and the new store is simply the store now, the way it will be the store tomorrow and every day after, unremarkable, working.

The lemon pasta the night before was never about the pasta. It was about wanting one last ordinary, knowable thing before a day that would not be ordinary and could not be fully known. That want is not weakness and it is not a lack of confidence in the plan. It is what it feels like to care about something you cannot fully control, which is the only honest description of shipping a migration that real people will live inside. You rehearse so that the day is mostly knowable. You keep the runbook close so that the unknowable part has a witness. And you pour the wine you will not drink, and you leave it on the counter, and tomorrow—or the day after, when the heartbeats have stopped and the store is just the store—you come back and you drink it, slowly, because the thing you rehearsed for and could not fully control happened anyway, and it held.