Building Production-Ready Microservices with FastAPI
FastAPI has become the framework of choice for building high-performance Python APIs. Its combination of speed, automatic documentation, and excellent type support makes it ideal for microservices. But getting from "hello world" to production-ready requires attention to patterns and practices that aren't always obvious from the docs.
This guide covers the patterns I've learned from deploying FastAPI services that handle real traffic—the configuration management, error handling, health checks, and observability that production systems demand.
Problem
Building a FastAPI service that runs locally is straightforward. Building one that runs reliably in production is another challenge entirely. Production services need:
- Robust configuration — Type-safe settings from environment variables
- Proper error handling — Consistent error responses, not stack traces
- Health endpoints — For load balancers and orchestrators
- Structured logging — JSON logs with request correlation
- Database connections — Pooling, retries, and graceful shutdown
- Security headers — CORS, rate limiting, authentication
Without these, you'll spend late nights debugging issues that should have been prevented or detected earlier.
Why This Matters
A microservice that works on your laptop but crashes in production isn't a microservice—it's a liability. Production readiness isn't just about avoiding downtime. It's about:
- Debuggability — When something fails at 3 AM, can you figure out why?
- Observability — Can you see what's happening inside the service?
- Reliability — Can the service recover from transient failures?
- Security — Is the service hardened against common attacks?
NOTE: "Production-ready" means different things for different contexts. A startup MVP has different requirements than a financial services API. Calibrate appropriately.
Solution
Let's build a production-ready FastAPI service from the ground up.
Project Structure
```
service/
├── app/
│   ├── __init__.py
│   ├── main.py              # Application factory
│   ├── config.py            # Settings management
│   ├── api/
│   │   ├── __init__.py
│   │   ├── routes/          # Route handlers
│   │   │   ├── __init__.py
│   │   │   ├── health.py
│   │   │   └── users.py
│   │   └── dependencies.py  # Dependency injection
│   ├── core/
│   │   ├── __init__.py
│   │   ├── security.py      # Auth, JWT handling
│   │   ├── exceptions.py    # Custom exceptions
│   │   └── middleware.py    # Custom middleware
│   ├── models/
│   │   ├── domain/          # Domain entities
│   │   └── schemas/         # Pydantic models
│   ├── services/            # Business logic
│   └── repositories/        # Data access
├── tests/
├── alembic/                 # Database migrations
├── Dockerfile
└── docker-compose.yml
```
Implementation
Configuration Management
Use Pydantic Settings for type-safe configuration with validation.
```python
from pydantic_settings import BaseSettings, SettingsConfigDict
from functools import lru_cache
from typing import Literal


class Settings(BaseSettings):
    """Application settings loaded from environment variables."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False
    )

    # Application
    app_name: str = "User Service"
    app_version: str = "1.0.0"
    environment: Literal["development", "staging", "production"] = "development"
    debug: bool = False

    # Server
    host: str = "0.0.0.0"
    port: int = 8000
    workers: int = 4

    # Database
    database_url: str
    database_pool_size: int = 5
    database_pool_max_overflow: int = 10

    # Redis
    redis_url: str = "redis://localhost:6379/0"

    # Security
    jwt_secret: str
    jwt_algorithm: str = "HS256"
    jwt_expiration_minutes: int = 60

    # External services
    email_service_url: str | None = None

    @property
    def is_production(self) -> bool:
        return self.environment == "production"


@lru_cache
def get_settings() -> Settings:
    """Cached settings instance."""
    return Settings()
```
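The `@lru_cache` decorator is what makes `get_settings()` return one shared instance instead of re-reading the environment on every call. A stdlib-only sketch of the same caching pattern (the `AppConfig` class and `APP_NAME` variable here are stand-ins for illustration, not part of the service):

```python
from dataclasses import dataclass
from functools import lru_cache
import os


@dataclass(frozen=True)
class AppConfig:
    # Stand-in for the real Settings class
    app_name: str
    debug: bool


@lru_cache
def get_config() -> AppConfig:
    # The body runs once; subsequent calls return the cached instance
    return AppConfig(
        app_name=os.environ.get("APP_NAME", "User Service"),
        debug=os.environ.get("DEBUG", "0") == "1",
    )


print(get_config() is get_config())  # prints True
```

Because the instance is cached, changing environment variables after the first call has no effect until the process restarts, which is usually what you want in production.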
WARNING: Never commit `.env` files with real secrets. Use `.env.example` as a template and document required variables.
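A minimal `.env.example` matching the settings above might look like this (all values are placeholders, not real credentials):

```
# .env.example - copy to .env and fill in real values
ENVIRONMENT=development
DEBUG=false
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/users
JWT_SECRET=change-me
REDIS_URL=redis://localhost:6379/0
```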
Application Factory
Create the FastAPI app with middleware and routes configured.
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from contextlib import asynccontextmanager

from app.config import get_settings
from app.api.routes import health, users
from app.core.middleware import LoggingMiddleware, RequestIDMiddleware
from app.core.exceptions import register_exception_handlers
from app.database import init_db, close_db


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Manage startup and shutdown events."""
    settings = get_settings()
    # Startup
    await init_db(settings.database_url)
    yield
    # Shutdown
    await close_db()


def create_app() -> FastAPI:
    settings = get_settings()

    app = FastAPI(
        title=settings.app_name,
        version=settings.app_version,
        docs_url="/docs" if not settings.is_production else None,
        redoc_url="/redoc" if not settings.is_production else None,
        lifespan=lifespan
    )

    # Middleware (order matters - the last middleware added runs outermost,
    # so RequestIDMiddleware must be added after LoggingMiddleware for the
    # request ID to be set before logging happens)
    app.add_middleware(LoggingMiddleware)
    app.add_middleware(RequestIDMiddleware)
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"] if not settings.is_production else ["https://myapp.com"],
        allow_methods=["*"],
        allow_headers=["*"],
    )

    # Routes
    app.include_router(health.router)
    app.include_router(users.router, prefix="/api/v1")

    # Exception handlers
    register_exception_handlers(app)

    return app


app = create_app()
```
Health Checks
Health endpoints are critical for Kubernetes and load balancers.
```python
from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import text
import redis.asyncio as redis

from app.api.dependencies import get_db, get_redis

router = APIRouter(tags=["health"])


@router.get("/health/live")
async def liveness():
    """Basic liveness check - is the process running?"""
    return {"status": "alive"}


@router.get("/health/ready")
async def readiness(
    db: AsyncSession = Depends(get_db),
    cache: redis.Redis = Depends(get_redis)
):
    """Readiness check - can the service handle requests?"""
    checks = {}

    # Database connectivity
    try:
        await db.execute(text("SELECT 1"))
        checks["database"] = {"status": "ok"}
    except Exception as e:
        checks["database"] = {"status": "error", "message": str(e)}

    # Redis connectivity
    try:
        await cache.ping()
        checks["redis"] = {"status": "ok"}
    except Exception as e:
        checks["redis"] = {"status": "error", "message": str(e)}

    all_healthy = all(c["status"] == "ok" for c in checks.values())
    return {
        "status": "ready" if all_healthy else "degraded",
        "checks": checks
    }
```
TIP: Kubernetes uses liveness probes to restart unhealthy containers and readiness probes to stop routing traffic. Keep liveness checks simple and fast.
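As an illustration, the two endpoints above might be wired into a Kubernetes deployment like this (the port and timing values are assumptions to tune for your service):

```yaml
livenessProbe:
  httpGet:
    path: /health/live
    port: 8000
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8000
  periodSeconds: 5
  failureThreshold: 2
```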
Structured Logging
JSON logs with request correlation make debugging production issues possible.
```python
import structlog
import uuid
import time

from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware


def configure_logging():
    """Configure structured logging with JSON output."""
    structlog.configure(
        processors=[
            structlog.stdlib.filter_by_level,
            structlog.stdlib.add_logger_name,
            structlog.stdlib.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.StackInfoRenderer(),
            structlog.processors.format_exc_info,
            structlog.processors.JSONRenderer()
        ],
        wrapper_class=structlog.stdlib.BoundLogger,
        context_class=dict,
        logger_factory=structlog.stdlib.LoggerFactory(),
        cache_logger_on_first_use=True
    )


class RequestIDMiddleware(BaseHTTPMiddleware):
    """Adds a unique request ID to each request."""

    async def dispatch(self, request: Request, call_next):
        request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
        request.state.request_id = request_id
        response = await call_next(request)
        response.headers["X-Request-ID"] = request_id
        return response


class LoggingMiddleware(BaseHTTPMiddleware):
    """Logs all requests with timing and context."""

    async def dispatch(self, request: Request, call_next):
        logger = structlog.get_logger().bind(
            request_id=getattr(request.state, "request_id", "unknown"),
            method=request.method,
            path=request.url.path
        )
        start_time = time.perf_counter()
        logger.info("request_started")
        try:
            response = await call_next(request)
            duration_ms = (time.perf_counter() - start_time) * 1000
            logger.info(
                "request_completed",
                status_code=response.status_code,
                duration_ms=round(duration_ms, 2)
            )
            return response
        except Exception as e:
            duration_ms = (time.perf_counter() - start_time) * 1000
            logger.exception(
                "request_failed",
                duration_ms=round(duration_ms, 2),
                error=str(e)
            )
            raise
```
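With this configuration, a successful request produces a single JSON log line along these lines (field values here are illustrative):

```json
{
  "event": "request_completed",
  "level": "info",
  "timestamp": "2024-01-15T03:12:45.123456Z",
  "request_id": "a1b2c3d4-0000-0000-0000-000000000000",
  "method": "GET",
  "path": "/api/v1/users/42",
  "status_code": 200,
  "duration_ms": 12.34
}
```

Because every line carries the same `request_id`, you can reconstruct the full lifecycle of any request with a single log query.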
Error Handling
Consistent error responses make APIs predictable and debuggable.
```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from pydantic import BaseModel
from typing import Any

import structlog


class ErrorDetail(BaseModel):
    code: str
    message: str
    field: str | None = None


class ErrorResponse(BaseModel):
    error: str
    message: str
    details: list[ErrorDetail] | None = None
    request_id: str | None = None


class AppException(Exception):
    """Base exception for application errors."""

    def __init__(
        self,
        code: str,
        message: str,
        status_code: int = 400,
        details: list[dict] | None = None
    ):
        self.code = code
        self.message = message
        self.status_code = status_code
        self.details = details
        super().__init__(message)


class NotFoundError(AppException):
    def __init__(self, resource: str, identifier: Any):
        super().__init__(
            code="NOT_FOUND",
            message=f"{resource} with identifier '{identifier}' not found",
            status_code=404
        )


class ConflictError(AppException):
    def __init__(self, message: str):
        super().__init__(
            code="CONFLICT",
            message=message,
            status_code=409
        )


def register_exception_handlers(app: FastAPI):
    """Register custom exception handlers."""

    @app.exception_handler(AppException)
    async def app_exception_handler(request: Request, exc: AppException):
        return JSONResponse(
            status_code=exc.status_code,
            content=ErrorResponse(
                error=exc.code,
                message=exc.message,
                details=[ErrorDetail(**d) for d in exc.details] if exc.details else None,
                request_id=getattr(request.state, "request_id", None)
            ).model_dump(exclude_none=True)
        )

    @app.exception_handler(Exception)
    async def generic_exception_handler(request: Request, exc: Exception):
        # Log the full error internally
        logger = structlog.get_logger()
        logger.exception("unhandled_exception", error=str(exc))
        # Return a generic error to the client
        return JSONResponse(
            status_code=500,
            content=ErrorResponse(
                error="INTERNAL_ERROR",
                message="An unexpected error occurred",
                request_id=getattr(request.state, "request_id", None)
            ).model_dump(exclude_none=True)
        )
```
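A `NotFoundError` raised in a handler then surfaces to the client as a predictable payload, roughly (request ID illustrative):

```json
{
  "error": "NOT_FOUND",
  "message": "User with identifier '42' not found",
  "request_id": "a1b2c3d4-0000-0000-0000-000000000000"
}
```

Clients can branch on the stable `error` code while humans read `message`, and the echoed `request_id` links the response back to the server-side logs.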
Example: Complete User Route
Putting it all together with a complete route implementation:
```python
from fastapi import APIRouter, Depends, status
from pydantic import BaseModel, ConfigDict, EmailStr
from datetime import datetime

from app.api.dependencies import get_user_service, get_current_user
from app.services.user_service import UserService
from app.core.exceptions import NotFoundError, ConflictError
from app.models.domain.user import User

router = APIRouter(prefix="/users", tags=["users"])


class CreateUserRequest(BaseModel):
    email: EmailStr
    name: str
    password: str


class UserResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: str
    email: str
    name: str
    created_at: datetime


@router.post("/", response_model=UserResponse, status_code=status.HTTP_201_CREATED)
async def create_user(
    request: CreateUserRequest,
    service: UserService = Depends(get_user_service)
):
    """Create a new user account."""
    user = await service.create_user(
        email=request.email,
        name=request.name,
        password=request.password
    )
    return UserResponse.model_validate(user)


@router.get("/me", response_model=UserResponse)
async def get_current_user_profile(
    current_user: User = Depends(get_current_user)
):
    """Get the current authenticated user's profile."""
    return UserResponse.model_validate(current_user)


@router.get("/{user_id}", response_model=UserResponse)
async def get_user(
    user_id: str,
    service: UserService = Depends(get_user_service)
):
    """Get a user by ID."""
    user = await service.get_by_id(user_id)
    if not user:
        raise NotFoundError("User", user_id)
    return UserResponse.model_validate(user)
```
Common Mistakes
1. Not Using Dependency Injection
Hardcoding dependencies makes testing difficult and violates separation of concerns.
```python
# Wrong - hardcoded dependency
@router.get("/users/{user_id}")
async def get_user(user_id: str):
    repo = UserRepository(database_session)  # Where does the session come from?
    return await repo.get(user_id)


# Correct - dependency injection
@router.get("/users/{user_id}")
async def get_user(
    user_id: str,
    repo: UserRepository = Depends(get_user_repository)
):
    return await repo.get(user_id)
```
2. Blocking the Event Loop
FastAPI is async, but synchronous database calls or CPU-bound work block the event loop.
```python
# Wrong - blocks the event loop
@router.get("/compute")
async def compute_something():
    result = expensive_computation()  # Blocks!
    return {"result": result}


# Correct - run in a thread pool
from fastapi.concurrency import run_in_threadpool

@router.get("/compute")
async def compute_something():
    result = await run_in_threadpool(expensive_computation)
    return {"result": result}
```
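`run_in_threadpool` is conceptually close to the standard library's `asyncio.to_thread`; a self-contained sketch of the same offloading pattern, runnable without FastAPI (`expensive_computation` here is a stand-in that simulates blocking work):

```python
import asyncio
import time


def expensive_computation() -> int:
    # Simulate CPU-bound or blocking work
    time.sleep(0.1)
    return 42


async def handler() -> dict:
    # Offload the blocking call so the event loop stays responsive
    result = await asyncio.to_thread(expensive_computation)
    return {"result": result}


print(asyncio.run(handler()))  # prints {'result': 42}
```

The same rule applies to synchronous database drivers and file I/O: anything that blocks belongs in a thread pool, or the whole service stalls while one request waits.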
3. Missing Response Model Validation
Without response models, you risk leaking internal data.
```python
# Wrong - returns the ORM model directly (might include password_hash!)
@router.get("/users/{user_id}")
async def get_user(user_id: str):
    return await repo.get(user_id)


# Correct - explicit response model
@router.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: str):
    return await repo.get(user_id)
```
Conclusion
Production-ready FastAPI services require attention beyond the basic tutorials. Configuration management, structured logging, proper error handling, and health checks aren't optional—they're essential.
The patterns in this guide have served me well across multiple production deployments. Start with the basics: type-safe configuration, consistent error handling, and proper health checks. Add observability early—you'll thank yourself when debugging your first production incident.
FastAPI makes building high-performance APIs approachable. These patterns make them production-ready.