Building Production-Ready Microservices with FastAPI
FastAPI has become the framework of choice for building high-performance Python APIs. Its combination of speed, automatic documentation, and excellent type support makes it ideal for microservices. But getting from "hello world" to production-ready requires attention to patterns and practices that aren't always obvious from the docs.
This guide covers the patterns I've learned from deploying FastAPI services that handle real traffic—the configuration management, error handling, health checks, and observability that production systems demand.
Problem
Building a FastAPI service that runs locally is straightforward. Building one that runs reliably in production is another challenge entirely. Production services need:
- Robust configuration — Type-safe settings from environment variables
- Proper error handling — Consistent error responses, not stack traces
- Health endpoints — For load balancers and orchestrators
- Structured logging — JSON logs with request correlation
- Database connections — Pooling, retries, and graceful shutdown
- Security headers — CORS, rate limiting, authentication
Without these, you'll spend late nights debugging issues that should have been prevented or detected earlier.
Why This Matters
A microservice that works on your laptop but crashes in production isn't a microservice—it's a liability. Production readiness isn't just about avoiding downtime. It's about:
- Debuggability — When something fails at 3 AM, can you figure out why?
- Observability — Can you see what's happening inside the service?
- Reliability — Can the service recover from transient failures?
- Security — Is the service hardened against common attacks?
NOTE: "Production-ready" means different things for different contexts. A startup MVP has different requirements than a financial services API. Calibrate appropriately.
Solution
Let's build a production-ready FastAPI service from the ground up.
Project Structure
```
service/
├── app/
│   ├── __init__.py
│   ├── main.py              # Application factory
│   ├── config.py            # Settings management
│   ├── api/
│   │   ├── __init__.py
│   │   ├── routes/          # Route handlers
│   │   │   ├── __init__.py
│   │   │   ├── health.py
│   │   │   └── users.py
│   │   └── dependencies.py  # Dependency injection
│   ├── core/
│   │   ├── __init__.py
│   │   ├── security.py      # Auth, JWT handling
│   │   ├── exceptions.py    # Custom exceptions
│   │   └── middleware.py    # Custom middleware
│   ├── models/
│   │   ├── domain/          # Domain entities
│   │   └── schemas/         # Pydantic models
│   ├── services/            # Business logic
│   └── repositories/        # Data access
├── tests/
├── alembic/                 # Database migrations
├── Dockerfile
└── docker-compose.yml
```
Implementation
Configuration Management
Use Pydantic Settings for type-safe configuration with validation.
```python
from pydantic_settings import BaseSettings, SettingsConfigDict
from functools import lru_cache
from typing import Literal


class Settings(BaseSettings):
    """Application settings loaded from environment variables."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False
    )

    # Application
    app_name: str = "User Service"
    app_version: str = "1.0.0"
    environment: Literal["development", "staging", "production"] = "development"
    debug: bool = False

    # Server
    host: str = "0.0.0.0"
    port: int = 8000
    workers: int = 4

    # Database
    database_url: str
    database_pool_size: int = 5
    database_pool_max_overflow: int = 10

    # Redis
    redis_url: str = "redis://localhost:6379/0"

    # Security
    jwt_secret: str
    jwt_algorithm: str = "HS256"
    jwt_expiration_minutes: int = 60

    # External services
    email_service_url: str | None = None

    @property
    def is_production(self) -> bool:
        return self.environment == "production"


@lru_cache
def get_settings() -> Settings:
    """Cached settings instance."""
    return Settings()
```
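The `@lru_cache` decorator is what makes `get_settings()` return one shared instance instead of re-reading the environment on every call. A stdlib-only sketch of the same caching pattern (the `AppConfig` class and `APP_NAME` variable here are stand-ins for illustration, not part of the service):

```python
from dataclasses import dataclass
from functools import lru_cache
import os


@dataclass(frozen=True)
class AppConfig:
    # Stand-in for the real Settings class
    app_name: str
    debug: bool


@lru_cache
def get_config() -> AppConfig:
    # The body runs once; subsequent calls return the cached instance
    return AppConfig(
        app_name=os.environ.get("APP_NAME", "User Service"),
        debug=os.environ.get("DEBUG", "0") == "1",
    )


print(get_config() is get_config())  # prints True
```

Because the instance is cached, changing environment variables after the first call has no effect until the process restarts, which is usually what you want in production.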
WARNING: Never commit `.env` files with real secrets. Use `.env.example` as a template and document required variables.
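A minimal `.env.example` matching the settings above might look like this (all values are placeholders, not real credentials):

```
# .env.example - copy to .env and fill in real values
ENVIRONMENT=development
DEBUG=false
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/users
JWT_SECRET=change-me
REDIS_URL=redis://localhost:6379/0
```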
Application Factory
Create the FastAPI app with middleware and routes configured.
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager

from app.config import get_settings
from app.api.routes import health, users
from app.core.middleware import LoggingMiddleware, RequestIDMiddleware
from app.core.exceptions import register_exception_handlers
from app.database import init_db, close_db


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Manage startup and shutdown events."""
    settings = get_settings()
    # Startup
    await init_db(settings.database_url)
    yield
    # Shutdown
    await close_db()


def create_app() -> FastAPI:
    settings = get_settings()

    app = FastAPI(
        title=settings.app_name,
        version=settings.app_version,
        docs_url="/docs" if not settings.is_production else None,
        redoc_url="/redoc" if not settings.is_production else None,
        lifespan=lifespan
    )

    # Middleware (order matters - the last middleware added runs outermost,
    # so RequestIDMiddleware must be added after LoggingMiddleware for the
    # request ID to be set before logging happens)
    app.add_middleware(LoggingMiddleware)
    app.add_middleware(RequestIDMiddleware)
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"] if not settings.is_production else ["https://myapp.com"],
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # Routes
    app.include_router(health.router)
    app.include_router(users.router, prefix="/api/v1")

    # Exception handlers
    register_exception_handlers(app)

    return app


app = create_app()
```
Health Checks
Health endpoints are critical for Kubernetes and load balancers.
```python
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import text
import redis.asyncio as redis

from app.api.dependencies import get_db, get_redis

router = APIRouter(tags=["health"])


@router.get("/health/live")
async def liveness():
    """Basic liveness check - is the process running?"""
    return {"status": "alive"}


@router.get("/health/ready")
async def readiness(
    db: AsyncSession = Depends(get_db),
    cache: redis.Redis = Depends(get_redis)
):
    """Readiness check - can the service handle requests?"""
    checks = {}

    # Database connectivity
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = {"status": "ok"}
    except Exception as e:
        checks["database"] = {"status": "error", "message": str(e)}

    # Redis connectivity
    try:
        await cache.ping()
        checks["redis"] = {"status": "ok"}
    except Exception as e:
        checks["redis"] = {"status": "error", "message": str(e)}

    all_healthy = all(c["status"] == "ok" for c in checks.values())
    return {
        "status": "ready" if all_healthy else "degraded",
        "checks": checks
    }
```
TIP: Kubernetes uses liveness probes to restart unhealthy containers and readiness probes to stop routing traffic. Keep liveness checks simple and fast.
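As an illustration, the two endpoints above might be wired into a Kubernetes deployment like this (the port and timing values are assumptions to tune for your service):

```yaml
livenessProbe:
  httpGet:
    path: /health/live
    port: 8000
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8000
  periodSeconds: 5
  failureThreshold: 2
```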
Structured Logging
JSON logs with request correlation make debugging production issues possible.
```python
import structlog
import uuid
import time

from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


def configure_logging():
    """Configure structured logging with JSON output."""
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.processors.JSONRenderer()
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
        cache_logger_on_first_use=True
    )


class RequestIDMiddleware(BaseHTTPMiddleware):
    """Adds a unique request ID to each request."""

    async def dispatch(self, request: Request, call_next):
        request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
        request.state.request_id = request_id
        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response


class LoggingMiddleware(BaseHTTPMiddleware):
    """Logs all requests with timing and context."""

    async def dispatch(self, request: Request, call_next):
        logger = structlog.get_logger().bind(
            request_id=getattr(request.state, "request_id", "unknown"),
            method=request.method,
            path=request.url.path
        )
        start_time = time.perf_counter()
        logger.info("request_started")
        try:
            response = await call_next(request)
            duration_ms = (time.perf_counter() - start_time) * 1000
            logger.info(
                "request_completed",
                status_code=response.status_code,
                duration_ms=round(duration_ms, 2)
            )
            return response
        except Exception as e:
            duration_ms = (time.perf_counter() - start_time) * 1000
            logger.exception(
                "request_failed",
                duration_ms=round(duration_ms, 2),
                error=str(e)
            )
            raise
```
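With this configuration, a successful request produces a single JSON log line along these lines (field values here are illustrative):

```json
{
  "event": "request_completed",
  "level": "info",
  "timestamp": "2024-01-15T03:12:45.123456Z",
  "request_id": "a1b2c3d4-0000-0000-0000-000000000000",
  "method": "GET",
  "path": "/api/v1/users/42",
  "status_code": 200,
  "duration_ms": 12.34
}
```

Because every line carries the same `request_id`, you can reconstruct the full lifecycle of any request with a single log query.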
Error Handling
Consistent error responses make APIs predictable and debuggable.
```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from pydantic import BaseModel
from typing import Any

import structlog


class ErrorDetail(BaseModel):
    code: str
    message: str
    field: str | None = None


class ErrorResponse(BaseModel):
    error: str
    message: str
    details: list[ErrorDetail] | None = None
    request_id: str | None = None


class AppException(Exception):
    """Base exception for application errors."""

    def __init__(
        self,
        code: str,
        message: str,
        status_code: int = 400,
        details: list[dict] | None = None
    ):
        self.code = code
        self.message = message
        self.status_code = status_code
        self.details = details
        super().__init__(message)


class NotFoundError(AppException):
    def __init__(self, resource: str, identifier: Any):
        super().__init__(
            code="NOT_FOUND",
            message=f"{resource} with identifier '{identifier}' not found",
            status_code=404
        )


class ConflictError(AppException):
    def __init__(self, message: str):
        super().__init__(
            code="CONFLICT",
            message=message,
            status_code=409
        )


def register_exception_handlers(app: FastAPI):
    """Register custom exception handlers."""

    @app.exception_handler(AppException)
    async def app_exception_handler(request: Request, exc: AppException):
        return JSONResponse(
            status_code=exc.status_code,
            content=ErrorResponse(
                error=exc.code,
                message=exc.message,
                details=[ErrorDetail(**d) for d in exc.details] if exc.details else None,
                request_id=getattr(request.state, "request_id", None)
            ).model_dump(exclude_none=True)
        )

    @app.exception_handler(Exception)
    async def generic_exception_handler(request: Request, exc: Exception):
        # Log the full error internally
        logger = structlog.get_logger()
        logger.exception("unhandled_exception", error=str(exc))
        # Return a generic error to the client
        return JSONResponse(
            status_code=500,
            content=ErrorResponse(
                error="INTERNAL_ERROR",
                message="An unexpected error occurred",
                request_id=getattr(request.state, "request_id", None)
            ).model_dump(exclude_none=True)
        )
```
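A `NotFoundError` raised in a handler then surfaces to the client as a predictable payload, roughly (request ID illustrative):

```json
{
  "error": "NOT_FOUND",
  "message": "User with identifier '42' not found",
  "request_id": "a1b2c3d4-0000-0000-0000-000000000000"
}
```

Clients can branch on the stable `error` code while humans read `message`, and the echoed `request_id` links the response back to the server-side logs.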
Example: Complete User Route
Putting it all together with a complete route implementation:
```python
from fastapi import APIRouter, Depends, status
from pydantic import BaseModel, ConfigDict, EmailStr
from datetime import datetime

from app.api.dependencies import get_user_service, get_current_user
from app.services.user_service import UserService
from app.core.exceptions import NotFoundError, ConflictError
from app.models.domain.user import User

router = APIRouter(prefix="/users", tags=["users"])


class CreateUserRequest(BaseModel):
    email: EmailStr
    name: str
    password: str


class UserResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: str
    email: str
    name: str
    created_at: datetime


@router.post("/", response_model=UserResponse, status_code=status.HTTP_201_CREATED)
async def create_user(
    request: CreateUserRequest,
    service: UserService = Depends(get_user_service)
):
    """Create a new user account."""
    user = await service.create_user(
        email=request.email,
        name=request.name,
        password=request.password
    )
    return UserResponse.model_validate(user)


@router.get("/me", response_model=UserResponse)
async def get_current_user_profile(
    current_user: User = Depends(get_current_user)
):
    """Get the current authenticated user's profile."""
    return UserResponse.model_validate(current_user)


@router.get("/{user_id}", response_model=UserResponse)
async def get_user(
    user_id: str,
    service: UserService = Depends(get_user_service)
):
    """Get a user by ID."""
    user = await service.get_by_id(user_id)
    if not user:
        raise NotFoundError("User", user_id)
    return UserResponse.model_validate(user)
```
Common Mistakes
1. Not Using Dependency Injection
Hardcoding dependencies makes testing difficult and violates separation of concerns.
```python
# Wrong - hardcoded dependency
@router.get("/users/{user_id}")
async def get_user(user_id: str):
    repo = UserRepository(database_session)  # Where does the session come from?
    return await repo.get(user_id)


# Correct - dependency injection
@router.get("/users/{user_id}")
async def get_user(
    user_id: str,
    repo: UserRepository = Depends(get_user_repository)
):
    return await repo.get(user_id)
```
2. Blocking the Event Loop
FastAPI is async, but synchronous database calls or CPU-bound work block the event loop.
```python
# Wrong - blocks the event loop
@router.get("/compute")
async def compute_something():
    result = expensive_computation()  # Blocks!
    return {"result": result}


# Correct - run in a thread pool
from fastapi.concurrency import run_in_threadpool

@router.get("/compute")
async def compute_something():
    result = await run_in_threadpool(expensive_computation)
    return {"result": result}
```
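`run_in_threadpool` is conceptually close to the standard library's `asyncio.to_thread`; a self-contained sketch of the same offloading pattern, runnable without FastAPI (`expensive_computation` here is a stand-in that simulates blocking work):

```python
import asyncio
import time


def expensive_computation() -> int:
    # Simulate CPU-bound or blocking work
    time.sleep(0.1)
    return 42


async def handler() -> dict:
    # Offload the blocking call so the event loop stays responsive
    result = await asyncio.to_thread(expensive_computation)
    return {"result": result}


print(asyncio.run(handler()))  # prints {'result': 42}
```

The same rule applies to synchronous database drivers and file I/O: anything that blocks belongs in a thread pool, or the whole service stalls while one request waits.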
3. Missing Response Model Validation
Without response models, you risk leaking internal data.
```python
# Wrong - returns the ORM model directly (might include password_hash!)
@router.get("/users/{user_id}")
async def get_user(user_id: str):
    return await repo.get(user_id)


# Correct - explicit response model
@router.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: str):
    return await repo.get(user_id)
```
Conclusion
Production-ready FastAPI services require attention beyond the basic tutorials. Configuration management, structured logging, proper error handling, and health checks aren't optional—they're essential.
The patterns in this guide have served me well across multiple production deployments. Start with the basics: type-safe configuration, consistent error handling, and proper health checks. Add observability early—you'll thank yourself when debugging your first production incident.
FastAPI makes building high-performance APIs approachable. These patterns make them production-ready.