Designing Scalable Backend Systems for SaaS Platforms

Architectural patterns for building scalable SaaS backends including multi-tenancy, service boundaries, API versioning, and background job processing.

Most SaaS backends start as a monolith. That is the right decision. The mistake is failing to design the monolith so that it can be split later.

The gap between a working prototype and a system that handles 10,000 paying customers is not just about adding more servers. It is about data isolation, job processing, API versioning, and the hundred small decisions that compound into either a scalable system or a rewrite.

This guide covers the architectural patterns that let SaaS backends grow without requiring a ground-up rebuild.

Problem

SaaS backends hit predictable scaling walls:

  • Tenant data leaks between customers
  • Background jobs block API response times
  • Database queries that worked at 100 tenants fail at 10,000
  • Deployments require downtime because of tight coupling
  • Feature flags and billing logic contaminate business logic

Multi-Tenancy Patterns

Shared Database, Shared Schema

Every tenant's data lives in the same tables with a tenant_id column:

CREATE TABLE projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(id),
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
);

CREATE INDEX idx_projects_tenant ON projects(tenant_id);

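-- Row Level Security
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- The table owner bypasses RLS unless it is forced, so also run this
-- if the application connects as the owner:
ALTER TABLE projects FORCE ROW LEVEL SECURITY;

-- current_setting() raises an error when app.current_tenant is unset,
-- so a request that forgets to set it fails instead of seeing every row.
CREATE POLICY tenant_isolation ON projects
    USING (tenant_id = current_setting('app.current_tenant')::uuid);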

This is the simplest and most common pattern. Supabase RLS patterns cover this approach in depth.
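
The policy only helps if every request sets app.current_tenant before it runs a query. Here is a minimal sketch of that step, assuming a connection object with asyncpg-style `transaction()` and `execute()` methods. `run_as_tenant` is a hypothetical helper name, not part of any library.

```python
import uuid
from typing import Any, Awaitable, Callable


async def run_as_tenant(
    conn: Any,
    tenant_id: str,
    work: Callable[[Any], Awaitable[Any]],
) -> Any:
    """Run `work(conn)` with app.current_tenant set for one transaction.

    `conn` is assumed to expose asyncpg-style `transaction()` and `execute()`.
    """
    # Reject anything that is not a UUID before it reaches the database.
    tenant = str(uuid.UUID(tenant_id))
    async with conn.transaction():
        # is_local = true: the setting is discarded when the transaction ends,
        # so a pooled connection cannot carry it into the next request.
        await conn.execute(
            "SELECT set_config('app.current_tenant', $1, true)", tenant
        )
        return await work(conn)
```

Scoping the setting to the transaction is the design choice that matters here: with a connection pool, a session-level setting could outlive the request that made it.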

Shared Database, Separate Schemas

Each tenant gets their own PostgreSQL schema:

CREATE SCHEMA tenant_abc123;

CREATE TABLE tenant_abc123.projects (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
);

This gives stronger isolation, but cross-tenant queries get harder and every migration has to run once per schema. Use it when tenants have compliance requirements that call for separated data.
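
With one schema per tenant, the application has to pick the schema on each request. Schema names cannot be passed as bound query parameters, so the name must be validated before it is placed in SQL. The sketch below shows one way to do that. The `tenant_` prefix follows the example above, while the slug rules and function names are assumptions for illustration.

```python
import re

# Assumed naming rule: "tenant_" followed by lowercase letters and digits.
_SCHEMA_RE = re.compile(r"tenant_[a-z0-9]{1,40}")


def schema_for(tenant_slug: str) -> str:
    """Map a tenant slug to its schema name, rejecting anything unexpected."""
    schema = f"tenant_{tenant_slug.lower()}"
    if not _SCHEMA_RE.fullmatch(schema):
        raise ValueError(f"invalid tenant slug: {tenant_slug!r}")
    return schema


def search_path_sql(tenant_slug: str) -> str:
    """Build the statement that points unqualified table names at one tenant."""
    # Identifiers cannot be bound as parameters, so validate first, then quote.
    return f'SET LOCAL search_path TO "{schema_for(tenant_slug)}"'
```

`SET LOCAL` only takes effect inside a transaction and is reset when it ends, which keeps one tenant's search path from leaking through a pooled connection.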

Service Boundaries

Split your monolith along business boundaries, not technical ones:

BAD:
  /api/v1/users       → UserService
  /api/v1/database    → DatabaseService
  /api/v1/cache       → CacheService

GOOD:
  /api/v1/auth        → AuthService (handles user auth, sessions, tokens)
  /api/v1/billing     → BillingService (subscriptions, invoices, usage)
  /api/v1/projects    → ProjectService (core product domain)

Technical boundaries create services that every other service depends on. Business boundaries create services that can evolve independently.
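
Inside a monolith, a business boundary can start as a narrow interface between modules. In the sketch below the project module asks billing a question through a small protocol instead of reading billing tables itself. All class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


class Billing(Protocol):
    """What the rest of the codebase may ask of the billing module."""

    def project_limit(self, tenant_id: str) -> int: ...


@dataclass
class StaticBilling:
    """Stand-in implementation; a real one would read subscription data."""

    limits: dict[str, int]

    def project_limit(self, tenant_id: str) -> int:
        return self.limits.get(tenant_id, 0)


class ProjectService:
    def __init__(self, billing: Billing) -> None:
        self._billing = billing
        self._projects: dict[str, list[str]] = {}

    def create(self, tenant_id: str, name: str) -> None:
        existing = self._projects.setdefault(tenant_id, [])
        # Ask billing through its interface; never read billing tables directly.
        if len(existing) >= self._billing.project_limit(tenant_id):
            raise PermissionError("project limit reached for this plan")
        existing.append(name)
```

If billing later moves into its own service, only the `Billing` implementation changes. `ProjectService` keeps calling the same method.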

API Versioning

Version from day one. URL-based versioning is the simplest:

from fastapi import APIRouter

v1_router = APIRouter(prefix="/api/v1")
v2_router = APIRouter(prefix="/api/v2")

# Placeholder data; a real handler would query the current tenant's projects.
PROJECTS = [{"id": "p1", "name": "Website"}, {"id": "p2", "name": "Mobile app"}]

@v1_router.get("/projects")
async def list_projects_v1():
    # Original response format: a bare list
    return [{"id": p["id"], "name": p["name"]} for p in PROJECTS]

@v2_router.get("/projects")
async def list_projects_v2(page: int = 1):
    # New format with pagination metadata
    return {
        "data": [{"id": p["id"], "name": p["name"]} for p in PROJECTS],
        "meta": {"total": len(PROJECTS), "page": page},
    }

Never break the v1 contract. Add new versions, mark old ones as deprecated in response headers, and publish a sunset date for each.
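
One way to signal deprecation is with the `Deprecation` header (RFC 9745) and the `Sunset` header (RFC 8594). The sketch below builds both, plus a `Link` header pointing at migration notes. The function name and the example URL are assumptions, and both datetimes are expected to be timezone-aware.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def deprecation_headers(
    deprecated_at: datetime,
    sunset_at: datetime,
    docs_url: str,
) -> dict[str, str]:
    """Headers to attach to every response from a deprecated API version."""
    return {
        # RFC 9745: a structured-field date, written as "@" + Unix seconds
        "Deprecation": f"@{int(deprecated_at.timestamp())}",
        # RFC 8594: HTTP-date after which the version may stop responding
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
        # Where clients can read about the migration path
        "Link": f'<{docs_url}>; rel="sunset"',
    }


headers = deprecation_headers(
    deprecated_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
    sunset_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
    docs_url="https://example.com/docs/v1-migration",  # hypothetical URL
)
```

These headers would be merged into every v1 response, for example from a middleware or a router-level dependency.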

Background Job Architecture

Separate request handling from job processing:

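# api.py: handles HTTP requests
# enqueue_job and ReportRequest stand in for your queue client and request model.
from fastapi import Request

@app.post("/reports", status_code=202)
async def create_report(report: ReportRequest, request: Request):
    job_id = await enqueue_job(
        "generate_report",
        # Take the tenant from the authenticated request, not the request body
        tenant_id=request.state.tenant_id,
        params=report.dict(),
    )
    return {"job_id": job_id, "status": "queued"}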

# worker.py — processes jobs independently
@job("generate_report")
async def generate_report(tenant_id: str, params: dict):
    data = await fetch_report_data(tenant_id, params)
    pdf = render_pdf(data)
    await upload_to_storage(tenant_id, pdf)
    await notify_user(tenant_id, "Report ready")

The API returns immediately. The worker processes asynchronously. The client polls or receives a webhook when the job completes.
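
For polling to work, something has to record each job's state. Below is a minimal in-memory sketch of that bookkeeping, using `asyncio.Queue` as a stand-in for a real queue. The `jobs` dict, the function names, and the status values are assumptions, and a production system would use a durable queue and a database table instead.

```python
import asyncio
import uuid
from typing import Awaitable, Callable

# In-memory stand-in for a job table.
jobs: dict[str, dict] = {}


async def enqueue(queue: asyncio.Queue, name: str, **kwargs) -> str:
    """Record the job as queued and hand it to the worker."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued", "error": None}
    await queue.put((job_id, name, kwargs))
    return job_id


async def worker(
    queue: asyncio.Queue,
    handlers: dict[str, Callable[..., Awaitable[None]]],
) -> None:
    """Pull jobs forever, updating their status as they progress."""
    while True:
        job_id, name, kwargs = await queue.get()
        jobs[job_id]["status"] = "running"
        try:
            await handlers[name](**kwargs)
            jobs[job_id]["status"] = "done"
        except Exception as exc:
            # Record the failure instead of letting it kill the worker.
            jobs[job_id].update(status="failed", error=str(exc))
        finally:
            queue.task_done()


def job_status(job_id: str) -> dict:
    """What a GET /jobs/{job_id} endpoint could return to a polling client."""
    return jobs.get(job_id, {"status": "unknown", "error": None})
```

The same status transitions are the natural place to trigger a webhook: fire it where the status changes to "done" or "failed".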

Rate Limiting

Per-tenant rate limits prevent noisy neighbors:

import time

import redis.asyncio as aioredis
from fastapi import HTTPException, Request

redis = aioredis.from_url("redis://localhost")

async def rate_limit(request: Request, limit: int = 100, window: int = 60):
    # Fixed-window counter: one Redis key per tenant per window
    tenant_id = request.state.tenant_id
    now = int(time.time())
    key = f"rate:{tenant_id}:{now // window}"

    count = await redis.incr(key)
    if count == 1:
        # First hit in this window: let the key expire along with the window
        await redis.expire(key, window)

    if count > limit:
        raise HTTPException(
            status_code=429,
            detail="Rate limit exceeded",
            # Seconds until the current window rolls over
            headers={"Retry-After": str(window - now % window)},
        )
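
The same fixed-window logic can be exercised without Redis, which is handy in unit tests. The sketch below keeps the counters in a dict and accepts an injectable clock. The class name and the `clock` parameter are assumptions. It only works within a single process, so it does not replace the Redis version when several API instances share a limit.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Fixed-window counting per tenant, held in a dict instead of Redis."""

    def __init__(
        self,
        limit: int = 100,
        window: int = 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._counts: dict[tuple[str, int], int] = {}

    def check(self, tenant_id: str) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request."""
        now = int(self._clock())
        bucket = now // self.window
        # Forget counters from earlier windows so the dict stays small.
        self._counts = {k: v for k, v in self._counts.items() if k[1] == bucket}
        key = (tenant_id, bucket)
        self._counts[key] = self._counts.get(key, 0) + 1
        if self._counts[key] > self.limit:
            return False, self.window - now % self.window
        return True, 0
```

A fixed window allows a burst of up to twice the limit across a window boundary. If that matters, a sliding window or token bucket is the usual next step.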

Database Scaling

Start with read replicas before sharding:

class DatabaseRouter:
    def __init__(self, write_pool, read_pool):
        self.write = write_pool
        self.read = read_pool

    async def execute(self, query: str, *args, read_only: bool = False):
        pool = self.read if read_only else self.write
        async with pool.acquire() as conn:
            return await conn.fetch(query, *args)

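Read replicas handle reporting, search, and list endpoints. The primary handles writes. This fits the read-heavy traffic typical of SaaS products, where reads often outnumber writes by roughly 10:1. Replicas lag slightly behind the primary, so send a read to the primary when it must see a write that just happened.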
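
Replication lag means a tenant can write a row and then fail to see it on a replica a moment later. One common mitigation is to pin a tenant's reads to the primary for a short time after each write. The sketch below shows that routing decision only. The class name and the `pin_seconds` value are assumptions, and the right window depends on the lag you actually measure.

```python
import time
from typing import Callable


class ReplicaAwareRouter:
    """Decide between primary and replica, pinning recent writers to primary."""

    def __init__(
        self,
        pin_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._pin_seconds = pin_seconds
        self._clock = clock
        self._last_write: dict[str, float] = {}

    def target(self, tenant_id: str, read_only: bool) -> str:
        """Return "primary" or "replica" for one query."""
        if not read_only:
            # Remember when this tenant last wrote.
            self._last_write[tenant_id] = self._clock()
            return "primary"
        last = self._last_write.get(tenant_id)
        if last is not None and self._clock() - last < self._pin_seconds:
            # The replica may not have this tenant's latest write yet.
            return "primary"
        return "replica"
```

The returned label would select the pool in a router like the one above. Across several API processes, the last-write timestamps would need to live somewhere shared, such as the session or a cache.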

Common Mistakes

Mistake 1: Optimizing before measuring. Profile your actual traffic patterns before choosing an architecture. Most SaaS applications never need microservices.

Mistake 2: Shared mutable state across services. If two services write to the same table, they are not separate services. They are a distributed monolith.

Mistake 3: No tenant-level monitoring. Aggregate metrics hide per-tenant issues. One tenant's heavy usage can degrade the experience for everyone else.
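
As an illustration of tenant-level monitoring, the sketch below records request counts and latency per tenant and ranks tenants by total time consumed. It keeps everything in memory and the names are hypothetical. In production the same numbers would normally go to a metrics system with the tenant as a label.

```python
from collections import defaultdict


class TenantMetrics:
    """Per-tenant request counts and latency, kept in memory for illustration."""

    def __init__(self) -> None:
        self._count: dict[str, int] = defaultdict(int)
        self._total_ms: dict[str, float] = defaultdict(float)

    def record(self, tenant_id: str, duration_ms: float) -> None:
        """Call once per request, for example from a middleware."""
        self._count[tenant_id] += 1
        self._total_ms[tenant_id] += duration_ms

    def heaviest(self, n: int = 5) -> list[tuple[str, float]]:
        """Tenants ranked by total time spent serving them, largest first."""
        ranked = sorted(self._total_ms.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]

    def mean_latency_ms(self, tenant_id: str) -> float:
        count = self._count.get(tenant_id, 0)
        return self._total_ms[tenant_id] / count if count else 0.0
```

Ranking by total time rather than request count surfaces the tenant whose few expensive queries are slowing everyone else down.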

Takeaways

Scalable SaaS backends are built on tenant isolation, clean service boundaries, and background job processing. Start with a well-structured monolith. Split along business boundaries when the team or traffic demands it. Version your APIs from day one and implement per-tenant rate limiting early — it is much harder to add later.