Advanced PostgreSQL Design Patterns for Modern Applications
PostgreSQL is more powerful than most developers realize. Beyond basic CRUD operations, it offers features that can replace entire external services—search engines, document stores, and caching layers. Understanding these capabilities can simplify your architecture significantly.
This guide explores advanced PostgreSQL patterns I've used to build systems that scale. We'll cover JSONB for flexible schemas, native full-text search, table partitioning for large datasets, and indexing strategies that can make queries orders of magnitude faster.
Problem
As applications grow, data patterns evolve in predictable ways:
- Schema flexibility — You need to store varying attributes without constant migrations
- Search functionality — Users expect to search across text content
- Performance degradation — Tables grow to millions of rows and queries slow down
- Multi-tenant isolation — Different customers need data separation
Many teams address these by adding external services: Elasticsearch for search, MongoDB for flexible documents, Redis for caching. Each addition increases operational complexity, network latency, and failure modes.
Why This Matters
PostgreSQL can handle many of these use cases natively:
- JSONB provides document-store flexibility with relational guarantees
- Full-text search eliminates the need for Elasticsearch in many cases
- Partitioning keeps large tables performant
- Row-Level Security provides database-enforced multi-tenancy
Using native features means fewer services to operate, local data access (no network calls), and transactional consistency across all your data.
NOTE: PostgreSQL isn't always the right choice. For complex search with relevance tuning, you might still need Elasticsearch. For sub-millisecond caching, Redis excels. But know what Postgres can do before reaching for external services.
Solution
1. JSONB for Flexible Schemas
JSONB gives you document-store flexibility while maintaining SQL query capabilities and ACID guarantees.
-- Event sourcing table with flexible payload
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_id UUID NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    payload JSONB NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
-- GIN index for fast JSON queries
CREATE INDEX idx_events_payload ON events USING GIN (payload);
CREATE INDEX idx_events_aggregate ON events (aggregate_id, created_at);
Query JSONB data with containment operators:
-- Find events where payload contains specific data
SELECT * FROM events
WHERE payload @> '{"user_id": "abc-123"}'
AND event_type = 'order.created';
-- Extract and filter on nested fields
SELECT
    id,
    payload->>'order_id' AS order_id,
    (payload->'items'->>'count')::int AS item_count
FROM events
WHERE payload->'items'->>'count' IS NOT NULL
    AND (payload->'items'->>'count')::int > 5;
TIP: Use jsonb_path_query for complex JSON querying with the SQL/JSON path language (PostgreSQL 12+).
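A quick sketch of a SQL/JSON path query against the events table above (the data is illustrative): path expressions can filter inside arrays, which containment operators cannot do as precisely.

```sql
-- Return each item object in the payload whose qty is greater than 1
SELECT jsonb_path_query(payload, '$.items[*] ? (@.qty > 1)')
FROM events
WHERE event_type = 'order.created';
```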
Implementation: Event Store
-- Insert events with flexible structure
-- (aggregate_id is a UUID column, so use UUID literals, not free-form strings)
INSERT INTO events (aggregate_id, event_type, payload, metadata)
VALUES
    (
        'f47ac10b-58cc-4372-a567-0e02b2c3d479',  -- the order aggregate
        'order.created',
        '{"customer_id": "cust-456", "items": [{"sku": "ABC", "qty": 2}], "total": 99.99}'::jsonb,
        '{"source": "web", "ip": "192.168.1.1"}'::jsonb
    ),
    (
        'f47ac10b-58cc-4372-a567-0e02b2c3d479',
        'order.payment_received',
        '{"payment_id": "pay-789", "amount": 99.99, "method": "card"}'::jsonb,
        '{"source": "payment-service"}'::jsonb
    );
-- Reconstruct aggregate state from events
SELECT
    aggregate_id,
    jsonb_agg(
        jsonb_build_object(
            'type', event_type,
            'data', payload,
            'at', created_at
        ) ORDER BY created_at
    ) AS event_stream
FROM events
WHERE aggregate_id = 'f47ac10b-58cc-4372-a567-0e02b2c3d479'
GROUP BY aggregate_id;
2. Full-Text Search
PostgreSQL's full-text search handles many use cases where teams default to Elasticsearch.
-- Add search vector column
ALTER TABLE articles ADD COLUMN search_vector tsvector;
-- Create trigger for automatic updates
CREATE OR REPLACE FUNCTION update_article_search_vector()
RETURNS TRIGGER AS $$
BEGIN
    NEW.search_vector :=
        setweight(to_tsvector('english', COALESCE(NEW.title, '')), 'A') ||
        setweight(to_tsvector('english', COALESCE(NEW.summary, '')), 'B') ||
        setweight(to_tsvector('english', COALESCE(NEW.content, '')), 'C');
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER articles_search_update
BEFORE INSERT OR UPDATE OF title, summary, content ON articles
FOR EACH ROW EXECUTE FUNCTION update_article_search_vector();
-- GIN index for fast search
CREATE INDEX idx_articles_search ON articles USING GIN (search_vector);
Query with ranking:
-- Search with relevance ranking
SELECT
    id,
    title,
    ts_rank(search_vector, query) AS rank,
    ts_headline(
        'english',
        content,
        query,
        'StartSel=<mark>, StopSel=</mark>, MaxWords=50'
    ) AS snippet
FROM
    articles,
    plainto_tsquery('english', 'database performance optimization') query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
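Note that plainto_tsquery treats every input word as an AND term. For user-typed search boxes, websearch_to_tsquery (PostgreSQL 11+) also understands quoted phrases, "or", and a leading minus for exclusion. A sketch against the same table:

```sql
-- Web-search-style parsing: phrase match, OR, and exclusion
SELECT id, title
FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', '"query planner" or indexing -mongodb');
```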
WARNING: Full-text search works great for straightforward text matching. For complex relevance tuning, faceted search, or real-time indexing of millions of documents, consider a dedicated search engine.
3. Table Partitioning
Essential for time-series data and tables that grow unbounded.
-- Create partitioned metrics table
CREATE TABLE metrics (
    id BIGSERIAL,
    metric_name VARCHAR(100) NOT NULL,
    dimensions JSONB DEFAULT '{}'::jsonb,
    value NUMERIC NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (id, recorded_at)
) PARTITION BY RANGE (recorded_at);
-- Create partitions (automate with pg_partman in production)
CREATE TABLE metrics_2025_q4 PARTITION OF metrics
FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');
CREATE TABLE metrics_2026_q1 PARTITION OF metrics
FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');
-- Each partition can have its own indexes
CREATE INDEX idx_metrics_2025_q4_name ON metrics_2025_q4 (metric_name, recorded_at);
Benefits of partitioning:
-- Query only touches relevant partitions (partition pruning)
EXPLAIN ANALYZE
SELECT * FROM metrics
WHERE recorded_at >= '2025-11-01'
AND recorded_at < '2025-12-01'
AND metric_name = 'api.latency';
-- Drop old data by detaching/dropping partitions (instant, no row locks)
-- (assumes an older partition, metrics_2025_q3, exists from a prior quarter)
ALTER TABLE metrics DETACH PARTITION metrics_2025_q3;
DROP TABLE metrics_2025_q3;
4. Advanced Indexing Strategies
The right index can be the difference between a query that takes seconds and one that takes milliseconds.
Partial Indexes
Index only the rows you actually query:
-- Only 5% of users are admins, but we query them frequently
CREATE INDEX idx_users_admin_email
ON users (email)
WHERE role = 'admin';
-- Only index active records
CREATE INDEX idx_orders_active_customer
ON orders (customer_id, created_at)
WHERE status NOT IN ('cancelled', 'archived');
Covering Indexes
Include columns to enable index-only scans:
-- The query needs customer_id, order_date, total, and status
-- INCLUDE columns aren't searchable but are in the index
CREATE INDEX idx_orders_customer_covering
ON orders (customer_id, order_date DESC)
INCLUDE (total, status);
-- This query can be satisfied entirely from the index
EXPLAIN ANALYZE
SELECT customer_id, order_date, total, status
FROM orders
WHERE customer_id = 'cust-123'
ORDER BY order_date DESC
LIMIT 10;
Expression Indexes
Index computed values:
-- Index on lowercase email for case-insensitive lookups
CREATE INDEX idx_users_email_lower
ON users (LOWER(email));
-- Index on date extracted from timestamp
CREATE INDEX idx_orders_date
ON orders (DATE(created_at));
-- Index on JSONB path
CREATE INDEX idx_events_customer
ON events ((payload->>'customer_id'));
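An expression index is only considered when the query repeats the exact same expression, so the LOWER(email) index above matches this lookup but not a plain `WHERE email = ...`:

```sql
-- Uses idx_users_email_lower; a bare "WHERE email = ..." would not
SELECT * FROM users
WHERE LOWER(email) = 'alice@example.com';
```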
5. Row-Level Security for Multi-Tenancy
Database-enforced tenant isolation is more robust than application-level checks.
-- Enable RLS on the table
ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
-- Policy uses session variable to filter rows
CREATE POLICY tenant_isolation_policy ON documents
USING (tenant_id = current_setting('app.tenant_id')::uuid);
-- Force RLS even for table owners
ALTER TABLE documents FORCE ROW LEVEL SECURITY;
Application sets tenant context per request:
async def set_tenant_context(session: AsyncSession, tenant_id: str):
    """Set tenant context for RLS policies (transaction-scoped)."""
    # SET LOCAL cannot take bind parameters, so use set_config();
    # the third argument (true) makes it local to the current transaction
    await session.execute(
        text("SELECT set_config('app.tenant_id', :tenant_id, true)"),
        {"tenant_id": tenant_id},
    )
TIP: Use SET LOCAL (or set_config with its is_local argument set to true) instead of SET so the setting is transaction-scoped and reset automatically at commit or rollback.
Example: Advisory Locks
Application-level locking for coordinating distributed work:
-- Try to acquire lock (non-blocking)
SELECT pg_try_advisory_lock(hashtext('process_invoice_123'));
-- If returns true, we have the lock
-- Do work...
-- Release when done
SELECT pg_advisory_unlock(hashtext('process_invoice_123'));
In Python:
import zlib

async def process_with_lock(session: AsyncSession, lock_key: str):
    """Process with advisory lock to prevent duplicate processing."""
    # Python's hash() is randomized per process (PYTHONHASHSEED), so use a
    # stable hash to derive the same lock id in every worker
    lock_id = zlib.crc32(lock_key.encode()) & 0x7FFFFFFF  # positive int32
    # Try to acquire a transaction-scoped lock (non-blocking)
    result = await session.execute(
        text("SELECT pg_try_advisory_xact_lock(:lock_id)"),
        {"lock_id": lock_id},
    )
    if not result.scalar():
        raise ConcurrentProcessingError(f"Lock {lock_key} already held")
    # Lock auto-releases at transaction end
    return await do_processing()
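The lock-id derivation can be checked in isolation. A minimal sketch (advisory_lock_id is an illustrative helper, not a library function):

```python
import zlib

def advisory_lock_id(key: str) -> int:
    """Derive a stable, positive 32-bit advisory lock id from a string key.

    The built-in hash() is randomized per process, so two workers would
    compute different ids for the same key; crc32 is deterministic everywhere.
    """
    return zlib.crc32(key.encode("utf-8")) & 0x7FFFFFFF

print(advisory_lock_id("process_invoice_123"))
```

Any stable hash works here; crc32 is just the cheapest one in the standard library that fits PostgreSQL's signed 32/64-bit lock-id space.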
Common Mistakes
1. Over-Indexing
Indexes speed up reads but slow down writes. Don't create indexes "just in case."
-- Check index usage
SELECT
    schemaname,
    relname AS table_name,
    indexrelname AS index_name,
    idx_scan AS times_used,
    pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 -- never used since the last stats reset!
ORDER BY pg_relation_size(indexrelid) DESC;
2. Ignoring EXPLAIN ANALYZE
Guessing at performance is a recipe for frustration. Always analyze.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE customer_id = 'cust-123'
AND created_at > NOW() - INTERVAL '30 days';
3. Not Using Connection Pooling
PostgreSQL creates a new process per connection. Without pooling, you'll exhaust resources.
Use PgBouncer or similar in production—it's not optional for any serious workload.
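A minimal PgBouncer sketch (hostnames, paths, and pool sizes are illustrative, not recommendations). Transaction pooling forbids session-level state, but the patterns above stay safe: SET LOCAL / set_config(..., true) and pg_try_advisory_xact_lock are both transaction-scoped.

```ini
[databases]
app = host=127.0.0.1 port=5432 dbname=app

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
```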
4. Forgetting VACUUM and ANALYZE
PostgreSQL's MVCC creates dead tuples. Autovacuum should be enabled and tuned for your workload.
-- Check autovacuum status
SELECT
    schemaname,
    relname,
    n_dead_tup,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
Conclusion
PostgreSQL's advanced features can eliminate entire categories of external services from your architecture. JSONB provides document flexibility with ACID guarantees. Full-text search handles many common search needs. Partitioning keeps large tables fast. Row-Level Security provides database-enforced multi-tenancy.
Before adding Elasticsearch, MongoDB, or complex caching layers, understand what PostgreSQL can do. The operational simplicity of fewer services often outweighs the specialized features of additional tools.
Master these patterns, and you'll build simpler architectures that are easier to operate and debug.