Database migrations without downtime are a superpower. Here's the playbook that's survived dozens of schema changes in production.
The core idea
Never change schema and code at the same time. Split every migration into phases where the old code and new code can both run against the same schema.
Adding a column
Phase 1: Add the column as nullable. Deploy.
Phase 2: Deploy code that writes the new column. Reads still fall back if the column is null.
Phase 3: Backfill the column for existing rows.
Phase 4: Deploy code that assumes the column is populated. Remove the fallback.
Phase 5: Make the column NOT NULL if needed.
Yes, it's 5 deploys. Yes, it's worth it. Zero downtime.
Renaming a column
You can't just rename. You have to:
Phase 1: Add the new column.
Phase 2: Write to both columns (old and new).
Phase 3: Backfill the new column from the old.
Phase 4: Switch reads to the new column.
Phase 5: Stop writing to the old column.
Phase 6: Drop the old column.
6 deploys. Yes. Unless you can take downtime, there's no shortcut.
The dangerous ones
Dropping a column. Always more dangerous than it looks. Services you didn't know about might still reference it. Grep the entire codebase. Check query logs for references.
Changing a column type. Usually means creating a new column, migrating data, switching reads, dropping the old. Do not issue ALTER COLUMN TYPE on a large table during traffic hours.
Adding indexes. CREATE INDEX CONCURRENTLY in Postgres. Without it, the table is locked. I've taken down production this way. Don't.
The backfill problem
Backfilling millions of rows with one big UPDATE will lock your table. Batch it. Update 1,000 rows at a time, sleep 100ms between batches, monitor replication lag.
A backfill that runs for 8 hours is fine. A backfill that holds a lock for 30 seconds is an outage.
The rollback plan
Write down, before starting, what the rollback looks like for each phase. For most phases it's 'revert the code deploy.' For the drop phase, rollback is harder — if you've dropped the column, you'll need to restore from backup.
The safest rule: never drop anything in the same week you stop using it. Leave the column for 2 weeks after the last read. You'll thank me when you need to rollback.
Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com











