PostgreSQL Query Optimization: From 3 Seconds to 30 Milliseconds

· 13 min read · Database & Architecture

Practical PostgreSQL optimization covering EXPLAIN ANALYZE, indexing strategies, N+1 fixes, connection pooling, and keyset pagination.

Your API endpoint returns in 3 seconds. The application code runs in 10 milliseconds. The remaining 2,990 milliseconds are spent waiting for PostgreSQL. This is the most common backend performance problem and the most rewarding to fix.

Understanding EXPLAIN ANALYZE

Before optimizing, measure. EXPLAIN ANALYZE shows exactly what PostgreSQL does:

EXPLAIN ANALYZE
SELECT p.*, u.name as author_name
FROM posts p
JOIN users u ON u.id = p.author_id
WHERE p.published = true
ORDER BY p.created_at DESC
LIMIT 20;

Output:

Limit  (cost=1523.45..1523.50 rows=20 width=450) (actual time=2945.123..2945.130 rows=20 loops=1)
  ->  Sort  (cost=1523.45..1547.89 rows=9778 width=450) (actual time=2945.121..2945.125 rows=20 loops=1)
        Sort Key: p.created_at DESC
        ->  Hash Join  (cost=245.00..1289.56 rows=9778 width=450) (actual time=12.345..2901.234 rows=9778 loops=1)
              Hash Cond: (u.id = p.author_id)
              ->  Seq Scan on posts p  (cost=0.00..856.78 rows=9778 width=420) (actual time=0.015..2850.123 rows=9778 loops=1)
                    Filter: (published = true)
                    Rows Removed by Filter: 212
              ->  Hash  (cost=180.00..180.00 rows=5200 width=36) (actual time=11.902..11.903 rows=5200 loops=1)
                    ->  Seq Scan on users u  (cost=0.00..180.00 rows=5200 width=36) (actual time=0.008..5.671 rows=5200 loops=1)

The critical line: Seq Scan on posts p with actual time=2850.123. PostgreSQL is scanning every row in the table to find published posts. This is the bottleneck.

Index Strategy

Add an index that covers the query:

CREATE INDEX idx_posts_published_created
ON posts (created_at DESC)
WHERE published = true;

This is a partial index — it only includes published posts. PostgreSQL stores a smaller, focused index that exactly matches our query pattern.

After the index:

Limit  (cost=0.42..1.56 rows=20 width=450) (actual time=0.089..0.234 rows=20 loops=1)
  ->  Index Scan using idx_posts_published_created on posts p  (cost=0.42..556.78 rows=9778 width=450) (actual time=0.087..0.228 rows=20 loops=1)

From 2,945ms to 0.23ms. The query is now 12,800x faster.
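
The speedup factor follows directly from the two timings; a quick sanity check on the rounded numbers from the plans above:

```python
before_ms = 2945.0  # actual time of the original sequential-scan plan
after_ms = 0.23     # actual time with the partial index

speedup = before_ms / after_ms
print(round(speedup))  # 12804, i.e. roughly 12,800x
```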

Covering Indexes

If you frequently query specific columns:

CREATE INDEX idx_posts_listing
ON posts (created_at DESC)
INCLUDE (title, excerpt, slug)
WHERE published = true;

INCLUDE adds columns to the index leaf pages. PostgreSQL can answer the query entirely from the index without reading the table — an index-only scan.

N+1 Query Problem

The most common ORM performance issue. Your code loads 20 posts, then makes 20 separate queries to load each author:

# BAD: N+1 queries (1 for posts + 20 for authors)
posts = db.query(Post).filter(Post.published == True).limit(20).all()
for post in posts:
    print(post.author.name)  # Each access triggers a query

Fix with a join:

# GOOD: 1 query with an eager-loaded join
from sqlalchemy.orm import joinedload

posts = (
    db.query(Post)
    .options(joinedload(Post.author))  # joinedload emits the JOIN itself
    .filter(Post.published == True)
    .limit(20)
    .all()
)

21 queries become 1 query. Response time drops proportionally.
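
The round-trip arithmetic can be sketched without a database at all; a hypothetical in-memory example that counts simulated queries:

```python
# Hypothetical stand-ins for the posts table and its authors.
authors = {0: "Ada", 1: "Grace", 2: "Edsger"}
posts = [{"id": i, "author_id": i % 3} for i in range(20)]

round_trips = 0

def query(description):
    # Stand-in for one database round trip.
    global round_trips
    round_trips += 1

# N+1 pattern: 1 query for posts, then 1 per post for its author.
query("SELECT posts")
for p in posts:
    query(f"SELECT author {p['author_id']}")
n_plus_one = round_trips  # 21

# Joined pattern: a single query returns posts with authors attached.
round_trips = 0
query("SELECT posts JOIN users")
joined = round_trips  # 1

print(n_plus_one, joined)  # 21 1
```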

Connection Pooling

Every open connection has a real cost: PostgreSQL runs one backend process per connection, each consuming memory (on the order of 10 MB, depending on workload), and establishing a connection pays TCP and authentication handshake latency. Without pooling, each API request creates and destroys a connection:

# Use connection pooling
from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=20,
    max_overflow=10,
    pool_timeout=30,
    pool_recycle=1800,
)

  • pool_size=20: Keep 20 connections warm
  • max_overflow=10: Allow up to 10 extra connections under load
  • pool_timeout=30: Wait up to 30 seconds for a free connection before raising an error
  • pool_recycle=1800: Recycle connections every 30 minutes to prevent stale connections
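
Using these figures, the worst-case server-side footprint of a fully loaded pool is easy to estimate. A rough sketch; the 10 MB per-connection figure is the approximation used above and varies by workload:

```python
def pool_memory_mb(pool_size, max_overflow, mb_per_conn=10):
    """Worst-case server memory when every pooled connection is in use."""
    return (pool_size + max_overflow) * mb_per_conn

print(pool_memory_mb(20, 10))  # 300 MB at full load
```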

Pagination: Keyset vs. Offset

Offset pagination gets slower as page numbers increase:

-- Page 500: PostgreSQL reads and discards 9,980 rows
SELECT * FROM posts
WHERE published = true
ORDER BY created_at DESC
LIMIT 20 OFFSET 9980;
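
The work OFFSET does grows linearly with page number, because PostgreSQL must read every skipped row before discarding it. A small illustrative sketch, assuming 20 rows per page:

```python
def rows_read_with_offset(page, per_page=20):
    """OFFSET pagination reads offset + limit rows, then discards the offset."""
    offset = (page - 1) * per_page
    return offset + per_page

print(rows_read_with_offset(1))    # 20
print(rows_read_with_offset(500))  # 10000 rows read to return 20
```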

Keyset pagination is consistently fast:

-- Same page, constant performance
SELECT * FROM posts
WHERE published = true
  AND created_at < '2024-01-15T12:00:00Z'
ORDER BY created_at DESC
LIMIT 20;

Keyset pagination seeks directly into the index, so page 500 costs the same as page 1. One caveat: if created_at is not unique, paginate on a (created_at, id) tuple so that ties cannot cause skipped or duplicated rows.
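
The cursor logic itself is straightforward; a minimal in-memory sketch of it (illustrative only — the real win comes from the index seek, which a Python list filter cannot reproduce):

```python
def keyset_page(rows, cursor=None, limit=20):
    """rows: sorted by created_at descending.
    cursor: created_at of the last row on the previous page (None = first page)."""
    if cursor is not None:
        rows = [r for r in rows if r["created_at"] < cursor]
    page = rows[:limit]
    next_cursor = page[-1]["created_at"] if page else None
    return page, next_cursor

rows = [{"id": i, "created_at": 1000 - i} for i in range(100)]
page1, cur = keyset_page(rows)
page2, _ = keyset_page(rows, cur)
print(page1[0]["created_at"], page2[0]["created_at"])  # 1000 980
```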

VACUUM and Table Bloat

Under PostgreSQL's MVCC model, DELETE and UPDATE do not physically remove rows; the old row versions are only marked dead. Over time a high-churn table accumulates dead tuples, making scans slower:

-- Check table bloat
SELECT
  relname,
  n_dead_tup,
  n_live_tup,
  round(n_dead_tup::numeric / GREATEST(n_live_tup, 1) * 100, 2) as dead_ratio
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

If dead_ratio is above 20%, run:

VACUUM ANALYZE posts;

In production, configure autovacuum aggressively for high-write tables.
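
The dead_ratio expression from the bloat query above, mirrored in Python for clarity:

```python
def dead_ratio(n_dead_tup, n_live_tup):
    """Mirrors round(n_dead_tup::numeric / GREATEST(n_live_tup, 1) * 100, 2)."""
    return round(n_dead_tup / max(n_live_tup, 1) * 100, 2)

print(dead_ratio(2500, 10000))  # 25.0 -> above the 20% threshold, VACUUM it
print(dead_ratio(150, 0))       # the GREATEST guard avoids division by zero
```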

Monitoring

Track slow queries in production:

-- Enable query logging for queries over 100ms
ALTER SYSTEM SET log_min_duration_statement = 100;
SELECT pg_reload_conf();

Review pg_stat_statements for aggregate query patterns (the extension must be enabled via shared_preload_libraries and CREATE EXTENSION pg_stat_statements):

SELECT
  query,
  calls,
  mean_exec_time,
  total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
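
Rows like these can be post-processed to find the smallest set of queries that dominates total execution time. A sketch with hypothetical stats; the query labels and timings are made up:

```python
def queries_covering(stats, share=0.9):
    """stats: (query, total_exec_time) pairs. Returns the smallest prefix,
    sorted by descending total time, that covers `share` of all time."""
    ranked = sorted(stats, key=lambda s: s[1], reverse=True)
    total = sum(t for _, t in ranked)
    picked, acc = [], 0.0
    for query, t in ranked:
        picked.append(query)
        acc += t
        if acc >= share * total:
            break
    return picked

stats = [("feed", 5200.0), ("search", 2100.0), ("login", 900.0),
         ("export", 500.0), ("health", 300.0)]
print(queries_covering(stats))  # ['feed', 'search', 'login']
```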

Takeaways

Database optimization follows a pattern: measure with EXPLAIN ANALYZE, add targeted indexes, and verify the improvement. Most query performance issues are missing indexes or N+1 query patterns. Connection pooling and keyset pagination handle the rest.

The goal is not to optimize every query — it is to identify the 3–5 queries that consume 90% of your database time and fix those first.