Database Migration Strategies for Production Systems
Database migrations in production are the most dangerous routine operation in software engineering. A bad migration can corrupt data, cause downtime, or lock tables for minutes. A good migration strategy makes schema changes boring — which is exactly what you want.
This guide covers migration patterns that work for production PostgreSQL databases with real traffic, building on the indexing and query patterns from PostgreSQL indexing strategies.
Problem
Production migrations fail because of:
- Long-running locks that block reads and writes
- Data transformations that take hours on large tables
- Migrations that cannot be rolled back
- Missing indexes on new columns causing slow queries
- Deployments coupled to migration timing
Migration Tooling
Use a dedicated migration tool that tracks applied migrations:
# Supabase CLI
supabase migration new add_status_column
supabase db push
# Alternative: dbmate
dbmate new add_status_column
dbmate up
Each migration gets a timestamped file, and the tool records which migrations have been applied. Never apply schema changes by hand in production — untracked changes cause drift between environments.
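The bookkeeping these tools perform is simple: a table records which migration versions have already run, and each run applies only the files not yet recorded. A minimal sketch of that mechanism — using Python's sqlite3 so the example is self-contained; real tools target PostgreSQL and also handle transactions and locking:

```python
import sqlite3

def apply_pending(conn, migrations):
    """Apply migrations not yet recorded in schema_migrations.

    `migrations` is a list of (version, sql) pairs, sorted by version.
    Returns the versions applied on this run.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    ran = []
    for version, sql in migrations:
        if version in applied:
            continue  # already applied on a previous run; skip
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        ran.append(version)
    conn.commit()
    return ran

conn = sqlite3.connect(":memory:")
migs = [("20240115_add_status", "CREATE TABLE orders (id INTEGER, status TEXT)")]
print(apply_pending(conn, migs))  # first run applies the migration
print(apply_pending(conn, migs))  # second run is a no-op: version already recorded
```

This idempotence is exactly why `supabase db push` or `dbmate up` can be run repeatedly without re-applying old migrations.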
Safe Column Addition
Adding a column is safe because PostgreSQL does not rewrite the table:
-- SAFE: Adds column without locking
ALTER TABLE orders ADD COLUMN status TEXT;
-- SAFE: Default value (PostgreSQL 11+, no table rewrite)
ALTER TABLE orders ADD COLUMN priority INTEGER DEFAULT 0;
-- DANGEROUS: Adding a NOT NULL column without a default
-- fails outright on a non-empty table (existing rows violate the constraint);
-- adding NOT NULL *with* a default rewrites the whole table before PostgreSQL 11
ALTER TABLE orders ADD COLUMN status TEXT NOT NULL;
-- SAFE alternative: add nullable, backfill, then add the constraint
ALTER TABLE orders ADD COLUMN status TEXT;
UPDATE orders SET status = 'pending' WHERE status IS NULL;
-- Note: SET NOT NULL scans the table under an exclusive lock; on very
-- large tables, add CHECK (status IS NOT NULL) NOT VALID, VALIDATE it,
-- then SET NOT NULL (PostgreSQL 12+ skips the scan using the valid CHECK)
ALTER TABLE orders ALTER COLUMN status SET NOT NULL;
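The three-step sequence is mechanical enough to generate. A hypothetical helper — the function name and naive quoting are illustrative, not from any migration tool — that emits the safe sequence for a given table, column, and backfill value:

```python
def safe_not_null_steps(table, column, col_type, default_value):
    """Generate the add-nullable / backfill / constrain sequence as SQL strings.

    Illustrative only: real code should quote identifiers and values properly.
    """
    return [
        # Step 1: add the column as nullable -- no rewrite, no long lock
        f"ALTER TABLE {table} ADD COLUMN {column} {col_type};",
        # Step 2: backfill existing rows (batch this on large tables)
        f"UPDATE {table} SET {column} = '{default_value}' WHERE {column} IS NULL;",
        # Step 3: constrain once every row has a value
        f"ALTER TABLE {table} ALTER COLUMN {column} SET NOT NULL;",
    ]

for stmt in safe_not_null_steps("orders", "status", "TEXT", "pending"):
    print(stmt)
```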
Safe Index Creation
-- DANGEROUS: Locks the table for the entire index build
CREATE INDEX idx_orders_status ON orders(status);
-- SAFE: Builds index without blocking writes
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
CONCURRENTLY takes longer, cannot run inside a transaction block, and leaves behind an INVALID index that must be dropped and rebuilt if the build fails. Even so, always use it in production: it builds the index without blocking writes.
The Expand-Contract Pattern
For breaking schema changes, use two migrations separated by a deployment:
Phase 1: Expand (add new structure)
-- Migration 001: Add new column
ALTER TABLE users ADD COLUMN full_name TEXT;
-- Backfill from existing data
UPDATE users SET full_name = first_name || ' ' || last_name
WHERE full_name IS NULL;
Phase 2: Deploy code that writes to both columns
# Application writes to both old and new columns
async def update_user_name(user_id, first_name, last_name, full_name):
    await pool.execute("""
        UPDATE users
        SET first_name = $2,
            last_name = $3,
            full_name = $4
        WHERE id = $1
    """, user_id, first_name, last_name, full_name)
Phase 3: Contract (remove old structure)
-- Migration 002: Remove old columns (after all code uses full_name)
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;
This pattern ensures zero downtime. The old code reads old columns. The new code reads new columns. Both work during the transition.
Large Table Backfills
Backfilling millions of rows in one UPDATE locks the table:
-- DANGEROUS: Locks the entire table
UPDATE orders SET status = 'pending' WHERE status IS NULL;
-- SAFE: Batch update via a procedure that commits between batches
CREATE PROCEDURE backfill_order_status()
LANGUAGE plpgsql
AS $$
DECLARE
    batch_size INT := 10000;
    affected INT;
BEGIN
    LOOP
        UPDATE orders
        SET status = 'pending'
        WHERE id IN (
            SELECT id FROM orders
            WHERE status IS NULL
            LIMIT batch_size
            FOR UPDATE SKIP LOCKED
        );
        GET DIAGNOSTICS affected = ROW_COUNT;
        EXIT WHEN affected = 0;
        COMMIT;  -- releases row locks after each batch (PostgreSQL 11+)
        PERFORM pg_sleep(0.1);  -- brief pause between batches
        RAISE NOTICE 'Updated % rows', affected;
    END LOOP;
END;
$$;
CALL backfill_order_status();
Batch updates process rows in small chunks, and committing after each batch releases row locks promptly. A plain DO block would run the entire loop in one transaction and hold every lock until the end, defeating the purpose of batching. FOR UPDATE SKIP LOCKED skips rows locked by concurrent transactions instead of waiting on them.
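The same loop can be driven from an application-side script, which gives each batch its own transaction without needing a stored procedure. A self-contained sketch using sqlite3 to demonstrate the batching logic — against PostgreSQL the same loop would use a PostgreSQL driver and the SKIP LOCKED clause:

```python
import sqlite3

def backfill_in_batches(conn, batch_size=1000):
    """Backfill NULL statuses in small batches, committing after each one.

    Each commit ends that batch's transaction, so locks are released
    promptly and other traffic is never blocked for long.
    """
    total = 0
    while True:
        cur = conn.execute(
            """UPDATE orders SET status = 'pending'
               WHERE id IN (SELECT id FROM orders
                            WHERE status IS NULL LIMIT ?)""",
            (batch_size,),
        )
        conn.commit()  # end the transaction for this batch
        if cur.rowcount == 0:
            break  # nothing left to backfill
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(2500)])
print(backfill_in_batches(conn, batch_size=1000))  # → 2500
```

Script-driven backfills are also easier to pause, resume, and monitor than a long-running server-side loop.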
Rollback Strategy
Every migration needs a rollback plan:
-- Forward migration
-- migrations/20240115_add_orders_status.sql
ALTER TABLE orders ADD COLUMN status TEXT DEFAULT 'pending';
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
-- Rollback
-- migrations/20240115_add_orders_status_rollback.sql
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_status;
ALTER TABLE orders DROP COLUMN IF EXISTS status;
Test rollback scripts in staging before deploying to production. Note that CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY cannot run inside a transaction block, so if your migration tool wraps each migration in a transaction, disable that wrapping for these migrations.
Migration Checklist
Before running any production migration:
□ Tested in staging with production-sized data
□ Estimated lock duration (< 1 second for safe operations)
□ Rollback script written and tested
□ Backfill batched for large tables
□ Index creation uses CONCURRENTLY
□ NOT NULL constraints added after backfill
□ Team notified of migration window
□ Monitoring dashboard open during execution
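Parts of this checklist can be automated as a lint pass over migration files before they reach production. A rough sketch — the regex patterns are heuristics of my own, not from any existing tool, and a real linter would parse SQL properly:

```python
import re

# Each entry pairs a risky-statement pattern with a human-readable warning
DANGEROUS = [
    # An index built without CONCURRENTLY blocks writes for the whole build
    (re.compile(r"CREATE\s+(UNIQUE\s+)?INDEX\s+(?!CONCURRENTLY)", re.I),
     "index created without CONCURRENTLY"),
    # NOT NULL with no DEFAULT fails outright on a non-empty table
    (re.compile(r"ADD\s+COLUMN\s+\w+\s+\w+\s+NOT\s+NULL(?!.*DEFAULT)", re.I),
     "NOT NULL column added without backfill-safe pattern"),
]

def lint_migration(sql):
    """Return a list of warnings for risky statements in a migration."""
    return [msg for pattern, msg in DANGEROUS if pattern.search(sql)]

print(lint_migration("CREATE INDEX idx_orders_status ON orders(status);"))
print(lint_migration("CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);"))  # → []
```

Running such a check in CI catches the most common lock hazards before a human ever reviews the migration.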
Common Mistakes
Mistake 1: Running migrations during peak traffic. Schedule migrations during low-traffic periods. Even safe operations benefit from less contention.
Mistake 2: No rollback plan. Every migration should have a tested rollback script. When things go wrong at 2 AM, you need a one-command recovery.
Mistake 3: Coupling deploy and migrate. Deploy first, then migrate. If the migration fails, you can roll back without redeploying. If the deploy fails, the migration has not been applied.
Takeaways
Safe production migrations require zero-downtime patterns: add columns as nullable, use CONCURRENTLY for indexes, batch large backfills, and follow the expand-contract pattern for breaking changes. Always have a rollback plan. Always test with production-sized data. The goal is to make schema changes routine and boring.