Advanced PostgreSQL Design Patterns for Modern Applications
PostgreSQL is more powerful than most developers realize. Beyond basic CRUD operations, it offers features that can replace entire external services—search engines, document stores, and caching layers. Understanding these capabilities can simplify your architecture significantly.
This guide explores advanced PostgreSQL patterns I've used to build systems that scale. We'll cover JSONB for flexible schemas, native full-text search, table partitioning for large datasets, and indexing strategies that can make queries orders of magnitude faster.
Problem
As applications grow, data patterns evolve in predictable ways:
- Schema flexibility — You need to store varying attributes without constant migrations
- Search functionality — Users expect to search across text content
- Performance degradation — Tables grow to millions of rows and queries slow down
- Multi-tenant isolation — Different customers need data separation
Many teams address these by adding external services: Elasticsearch for search, MongoDB for flexible documents, Redis for caching. Each addition increases operational complexity, network latency, and failure modes.
Why This Matters
PostgreSQL can handle many of these use cases natively:
- JSONB provides document-store flexibility with relational guarantees
- Full-text search eliminates the need for Elasticsearch in many cases
- Partitioning keeps large tables performant
- Row-Level Security provides database-enforced multi-tenancy
Using native features means fewer services to operate, local data access (no network calls), and transactional consistency across all your data.
NOTE: PostgreSQL isn't always the right choice. For complex search with relevance tuning, you might still need Elasticsearch. For sub-millisecond caching, Redis excels. But know what Postgres can do before reaching for external services.
Solution
1. JSONB for Flexible Schemas
JSONB gives you document-store flexibility while maintaining SQL query capabilities and ACID guarantees.
-- Event sourcing table with flexible payload
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_id UUID NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    payload JSONB NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
-- GIN index for fast JSON queries
CREATE INDEX idx_events_payload ON events USING GIN (payload);
CREATE INDEX idx_events_aggregate ON events (aggregate_id, created_at);
Query JSONB data with containment operators:
-- Find events where payload contains specific data
SELECT * FROM events
WHERE payload @> '{"user_id": "abc-123"}'
AND event_type = 'order.created';
-- Extract and filter on nested fields
SELECT
    id,
    payload->>'order_id' AS order_id,
    (payload->'items'->>'count')::int AS item_count
FROM events
WHERE payload->'items'->>'count' IS NOT NULL
    AND (payload->'items'->>'count')::int > 5;
TIP: Use jsonb_path_query for complex JSON querying with the SQL/JSON path language (PostgreSQL 12+).
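A quick sketch of a SQL/JSON path query against the events table above (the data is illustrative): path expressions can filter inside arrays, which containment operators cannot do as precisely.

```sql
-- Return each item object in the payload whose qty is greater than 1
SELECT jsonb_path_query(payload, '$.items[*] ? (@.qty > 1)')
FROM events
WHERE event_type = 'order.created';
```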
Implementation: Event Store
-- Insert events with flexible structure
-- (aggregate_id is a UUID column, so use UUID literals, not free-form strings)
INSERT INTO events (aggregate_id, event_type, payload, metadata)
VALUES
    (
        'f47ac10b-58cc-4372-a567-0e02b2c3d479',  -- the order aggregate
        'order.created',
        '{"customer_id": "cust-456", "items": [{"sku": "ABC", "qty": 2}], "total": 99.99}'::jsonb,
        '{"source": "web", "ip": "192.168.1.1"}'::jsonb
    ),
    (
        'f47ac10b-58cc-4372-a567-0e02b2c3d479',
        'order.payment_received',
        '{"payment_id": "pay-789", "amount": 99.99, "method": "card"}'::jsonb,
        '{"source": "payment-service"}'::jsonb
    );
-- Reconstruct aggregate state from events
SELECT
    aggregate_id,
    jsonb_agg(
        jsonb_build_object(
            'type', event_type,
            'data', payload,
            'at', created_at
        ) ORDER BY created_at
    ) AS event_stream
FROM events
WHERE aggregate_id = 'f47ac10b-58cc-4372-a567-0e02b2c3d479'
GROUP BY aggregate_id;
2. Full-Text Search
PostgreSQL's full-text search handles many use cases where teams default to Elasticsearch.
-- Add search vector column
ALTER TABLE articles ADD COLUMN search_vector tsvector;
-- Create trigger for automatic updates
CREATE OR REPLACE FUNCTION update_article_search_vector()
RETURNS TRIGGER AS $$
BEGIN
    NEW.search_vector :=
        setweight(to_tsvector('english', COALESCE(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', COALESCE(NEW.summary, '')), 'B') ||
        setweight(to_tsvector('english', COALESCE(NEW.content, '')), 'C');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER articles_search_update
BEFORE INSERT OR UPDATE OF title, summary, content ON articles
FOR EACH ROW EXECUTE FUNCTION update_article_search_vector();
-- GIN index for fast search
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
Query with ranking:
-- Search with relevance ranking
SELECT
    id,
    title,
    ts_rank(search_vector, query) AS rank,
    ts_headline(
        'english',
        content,
        query,
        'StartSel=<mark>, StopSel=</mark>, MaxWords=50'
    ) AS snippet
FROM
    articles,
    plainto_tsquery('english', 'database performance optimization') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
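Note that plainto_tsquery treats every input word as an AND term. For user-typed search boxes, websearch_to_tsquery (PostgreSQL 11+) also understands quoted phrases, "or", and a leading minus for exclusion. A sketch against the same table:

```sql
-- Web-search-style parsing: phrase match, OR, and exclusion
SELECT id, title
FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', '"query planner" or indexing -mongodb');
```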
WARNING: Full-text search works great for straightforward text matching. For complex relevance tuning, faceted search, or real-time indexing of millions of documents, consider a dedicated search engine.
3. Table Partitioning
Essential for time-series data and tables that grow unbounded.
-- Create partitioned metrics table
CREATE TABLE metrics (
    id BIGSERIAL,
    metric_name VARCHAR(100) NOT NULL,
    dimensions JSONB DEFAULT '{}'::jsonb,
    value NUMERIC NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (id, recorded_at)
) PARTITION BY RANGE (recorded_at);
-- Create partitions (automate with pg_partman in production)
CREATE TABLE metrics_2025_q4 PARTITION OF metrics
FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');
CREATE TABLE metrics_2026_q1 PARTITION OF metrics
FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
-- Each partition can have its own indexes
CREATE INDEX idx_metrics_2025_q4_name ON metrics_2025_q4 (metric_name, recorded_at);
Benefits of partitioning:
-- Query only touches relevant partitions (partition pruning)
EXPLAIN ANALYZE
SELECT * FROM metrics
WHERE recorded_at >= '2025-11-01'
AND recorded_at < '2025-12-01'
AND metric_name = 'api.latency';
-- Drop old data by detaching/dropping partitions (instant, no row locks)
-- (assumes an older partition, metrics_2025_q3, exists from a prior quarter)
ALTER TABLE metrics DETACH PARTITION metrics_2025_q3;
DROP TABLE metrics_2025_q3;
4. Advanced Indexing Strategies
The right index can be the difference between a query that takes seconds and one that takes milliseconds.
Partial Indexes
Index only the rows you actually query:
-- Only 5% of users are admins, but we query them frequently
CREATE INDEX idx_users_admin_email
ON users (email)
WHERE role = 'admin';
-- Only index active records
CREATE INDEX idx_orders_active_customer
ON orders (customer_id, created_at)
WHERE status NOT IN ('cancelled', 'archived');
Covering Indexes
Include columns to enable index-only scans:
-- The query needs customer_id, order_date, total, and status
-- INCLUDE columns aren't searchable but are in the index
CREATE INDEX idx_orders_customer_covering
ON orders (customer_id, order_date DESC)
INCLUDE (total, status);
-- This query can be satisfied entirely from the index
EXPLAIN ANALYZE
SELECT customer_id, order_date, total, status
FROM orders
WHERE customer_id = 'cust-123'
ORDER BY order_date DESC
LIMIT 10;
Expression Indexes
Index computed values:
-- Index on lowercase email for case-insensitive lookups
CREATE INDEX idx_users_email_lower
ON users (LOWER(email));
-- Index on date extracted from timestamp
CREATE INDEX idx_orders_date
ON orders (DATE(created_at));
-- Index on JSONB path
CREATE INDEX idx_events_customer
ON events ((payload->>'customer_id'));
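An expression index is only considered when the query repeats the exact same expression, so the LOWER(email) index above matches this lookup but not a plain `WHERE email = ...`:

```sql
-- Uses idx_users_email_lower; a bare "WHERE email = ..." would not
SELECT * FROM users
WHERE LOWER(email) = 'alice@example.com';
```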
5. Row-Level Security for Multi-Tenancy
Database-enforced tenant isolation is more robust than application-level checks.
-- Enable RLS on the table
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
-- Policy uses session variable to filter rows
CREATE POLICY tenant_isolation_policy ON documents
USING (tenant_id = current_setting('app.tenant_id')::uuid);
-- Force RLS even for table owners
ALTER TABLE documents FORCE ROW LEVEL SECURITY;
Application sets tenant context per request:
async def set_tenant_context(session: AsyncSession, tenant_id: str):
    """Set tenant context for RLS policies (transaction-scoped)."""
    # SET LOCAL cannot take bind parameters, so use set_config();
    # the third argument (true) makes it local to the current transaction
    await session.execute(
        text("SELECT set_config('app.tenant_id', :tenant_id, true)"),
        {"tenant_id": tenant_id},
    )
TIP: Use SET LOCAL (or set_config with its is_local argument set to true) instead of SET so the setting is transaction-scoped and reset automatically at commit or rollback.
Example: Advisory Locks
Application-level locking for coordinating distributed work:
-- Try to acquire lock (non-blocking)
SELECT pg_try_advisory_lock(hashtext('process_invoice_123'));
-- If returns true, we have the lock
-- Do work...
-- Release when done
SELECT pg_advisory_unlock(hashtext('process_invoice_123'));
In Python:
import zlib

async def process_with_lock(session: AsyncSession, lock_key: str):
    """Process with advisory lock to prevent duplicate processing."""
    # Python's hash() is randomized per process (PYTHONHASHSEED), so use a
    # stable hash to derive the same lock id in every worker
    lock_id = zlib.crc32(lock_key.encode()) & 0x7FFFFFFF  # positive int32
    # Try to acquire a transaction-scoped lock (non-blocking)
    result = await session.execute(
        text("SELECT pg_try_advisory_xact_lock(:lock_id)"),
        {"lock_id": lock_id},
    )
    if not result.scalar():
        raise ConcurrentProcessingError(f"Lock {lock_key} already held")
    # Lock auto-releases at transaction end
    return await do_processing()
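The lock-id derivation can be checked in isolation. A minimal sketch (advisory_lock_id is an illustrative helper, not a library function):

```python
import zlib

def advisory_lock_id(key: str) -> int:
    """Derive a stable, positive 32-bit advisory lock id from a string key.

    The built-in hash() is randomized per process, so two workers would
    compute different ids for the same key; crc32 is deterministic everywhere.
    """
    return zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF

print(advisory_lock_id("process_invoice_123"))
```

Any stable hash works here; crc32 is just the cheapest one in the standard library that fits PostgreSQL's signed 32/64-bit lock-id space.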
Common Mistakes
1. Over-Indexing
Indexes speed up reads but slow down writes. Don't create indexes "just in case."
-- Check index usage
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 -- never used since the last stats reset!
ORDER BY pg_relation_size(indexrelid) DESC;
2. Ignoring EXPLAIN ANALYZE
Guessing at performance is a recipe for frustration. Always analyze.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE customer_id = 'cust-123'
AND created_at > NOW() - INTERVAL '30 days';
3. Not Using Connection Pooling
PostgreSQL creates a new process per connection. Without pooling, you'll exhaust resources.
Use PgBouncer or similar in production—it's not optional for any serious workload.
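A minimal PgBouncer sketch (hostnames, paths, and pool sizes are illustrative, not recommendations). Transaction pooling forbids session-level state, but the patterns above stay safe: SET LOCAL / set_config(..., true) and pg_try_advisory_xact_lock are both transaction-scoped.

```ini
[databases]
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```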
4. Forgetting VACUUM and ANALYZE
PostgreSQL's MVCC creates dead tuples. Autovacuum should be enabled and tuned for your workload.
-- Check autovacuum status
SELECT
    schemaname,
    relname,
    n_dead_tup,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
Conclusion
PostgreSQL's advanced features can eliminate entire categories of external services from your architecture. JSONB provides document flexibility with ACID guarantees. Full-text search handles many common search needs. Partitioning keeps large tables fast. Row-Level Security provides database-enforced multi-tenancy.
Before adding Elasticsearch, MongoDB, or complex caching layers, understand what PostgreSQL can do. The operational simplicity of fewer services often outweighs the specialized features of additional tools.
Master these patterns, and you'll build simpler architectures that are easier to operate and debug.