Designing APIs That Scale: Practical REST API Architecture
Every API starts simple. A few endpoints, a few users, straightforward responses. Then features accumulate, clients multiply, and the decisions you made in week one become the constraints you fight for years.
Good API design is not about following REST to the letter. It is about building interfaces that are predictable, efficient, and possible to evolve without breaking existing clients. This guide covers the practical patterns that matter.
Problem
APIs fail at scale because of:
- Inconsistent response formats across endpoints
- No pagination strategy, forcing clients to handle unbounded responses
- Breaking changes deployed without versioning
- N+1 query patterns hidden behind nested resource endpoints
- No rate limiting until abuse happens
URL Design
Resources are nouns. Actions are HTTP methods:
GET /api/v1/projects → List projects
POST /api/v1/projects → Create project
GET /api/v1/projects/:id → Get project
PATCH /api/v1/projects/:id → Update project
DELETE /api/v1/projects/:id → Delete project
GET /api/v1/projects/:id/tasks → List tasks for project
POST /api/v1/projects/:id/tasks → Create task in project
Keep nesting to two levels maximum. Deeper nesting creates brittle URLs and confuses clients about which resource they are operating on.
Consistent Response Format
Every endpoint returns the same envelope:
Single resource:
{
  "data": { "id": "abc", "name": "My Project" },
  "meta": { "request_id": "req_123" }
}
Collection:
{
  "data": [{ "id": "abc" }, { "id": "def" }],
  "meta": { "total": 42, "page": 1, "per_page": 20 }
}
Error:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Name is required",
    "details": [{ "field": "name", "reason": "required" }]
  }
}
Clients always check for data or error. No exceptions.
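One way to keep the envelope uniform is to build every response through a pair of small helpers. The sketch below is illustrative only: the names success, failure, and is_error are hypothetical, and the generated request_id format is an assumption rather than anything the envelope requires.

```python
import uuid
from typing import Any, Optional


def success(data: Any, **meta: Any) -> dict:
    """Wrap a payload in the success envelope; request_id is always present."""
    # Assumed ID format for illustration; real services usually propagate
    # an ID assigned at the edge.
    meta.setdefault("request_id", f"req_{uuid.uuid4().hex[:12]}")
    return {"data": data, "meta": meta}


def failure(code: str, message: str, details: Optional[list] = None) -> dict:
    """Wrap an error in the error envelope; details defaults to an empty list."""
    return {"error": {"code": code, "message": message, "details": details or []}}


def is_error(body: dict) -> bool:
    """Client-side check: a response carries either data or error."""
    return "error" in body
```

Routing every handler through helpers like these means a client can branch on is_error(body) without knowing which endpoint produced the response.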
Pagination
Cursor-based pagination for feeds, offset-based for admin panels:
from typing import Optional

from fastapi import Depends
from pydantic import BaseModel

# Assumes `app` is a FastAPI instance and `pool` is an asyncpg-style connection pool.

class PaginationParams(BaseModel):
    cursor: Optional[str] = None
    limit: int = 20

@app.get("/api/v1/projects")
async def list_projects(params: PaginationParams = Depends()):
    query = "SELECT * FROM projects"
    args = []
    if params.cursor:
        query += " WHERE id < $1"
        args.append(params.cursor)
    # The LIMIT placeholder is $1 or $2, depending on whether a cursor was bound
    query += " ORDER BY id DESC LIMIT $" + str(len(args) + 1)
    # Fetch one extra row to detect whether another page exists
    args.append(params.limit + 1)
    rows = await pool.fetch(query, *args)
    has_more = len(rows) > params.limit
    items = rows[:params.limit]
    return {
        "data": items,
        "meta": {
            "has_more": has_more,
            "next_cursor": items[-1]["id"] if has_more else None,
        },
    }
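Cursor pagination stays stable when rows are inserted between page requests. Offset pagination does not: an insert shifts every later row by one position, so the next page repeats a row the client has already seen, and a delete causes it to skip one.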
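The difference is easy to reproduce without a database. The following sketch uses a plain list of integer ids as a stand-in for the projects table; page_by_offset and page_by_cursor are hypothetical helpers written for this illustration.

```python
def page_by_offset(rows, offset, limit):
    """Offset pagination over ids sorted newest-first."""
    return sorted(rows, reverse=True)[offset:offset + limit]


def page_by_cursor(rows, cursor, limit):
    """Cursor pagination: ids strictly below the cursor, newest-first."""
    ordered = sorted(rows, reverse=True)
    if cursor is not None:
        ordered = [r for r in ordered if r < cursor]
    return ordered[:limit]


rows = [1, 2, 3, 4, 5, 6]

# The client fetches page one with both strategies.
offset_page1 = page_by_offset(rows, 0, 3)      # [6, 5, 4]
cursor_page1 = page_by_cursor(rows, None, 3)   # [6, 5, 4]

# A new row arrives before the client asks for page two.
rows.append(7)

offset_page2 = page_by_offset(rows, 3, 3)                  # [4, 3, 2]: id 4 repeats
cursor_page2 = page_by_cursor(rows, cursor_page1[-1], 3)   # [3, 2, 1]: no overlap
```

The cursor version anchors the second page to the last id the client actually saw, so rows added at the head of the list cannot shift it.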
Filtering and Sorting
Support common patterns through query parameters:
GET /api/v1/projects?status=active&sort=-created_at&fields=id,name
ALLOWED_SORTS = {"created_at", "name", "updated_at"}
ALLOWED_FILTERS = {"status", "owner_id", "category"}

@app.get("/api/v1/projects")
async def list_projects(
    status: Optional[str] = None,
    sort: Optional[str] = None,
    fields: Optional[str] = None,
):
    # Validate sort field
    sort_field = "created_at"
    sort_dir = "DESC"
    if sort:
        if sort.startswith("-"):
            sort_dir = "DESC"
            sort_field = sort[1:]
        else:
            sort_dir = "ASC"
            sort_field = sort
    if sort_field not in ALLOWED_SORTS:
        raise HTTPException(400, f"Invalid sort field: {sort_field}")
    # Build query safely
    # ... (parameterized query construction)
Whitelist allowed fields. Never interpolate user input into SQL.
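As a rough illustration of the parameterized query construction, the sketch below builds the SQL string and its argument list as a pure function. build_list_query is a hypothetical helper, and raising ValueError (rather than an HTTP error) is an assumption made so the function runs outside a web framework. Column names only ever come from the allowlists; values are always bound as $n placeholders.

```python
from typing import Any, Dict, List, Optional, Tuple

ALLOWED_SORTS = {"created_at", "name", "updated_at"}
ALLOWED_FILTERS = {"status", "owner_id", "category"}


def build_list_query(filters: Dict[str, Any],
                     sort: Optional[str] = None) -> Tuple[str, List[Any]]:
    """Return (sql, args) for a projects listing; reject unknown fields."""
    clauses: List[str] = []
    args: List[Any] = []
    for field, value in filters.items():
        if field not in ALLOWED_FILTERS:
            raise ValueError(f"Invalid filter field: {field}")
        args.append(value)
        # The field name is allowlisted; the value is bound, never interpolated.
        clauses.append(f"{field} = ${len(args)}")

    sort_field, sort_dir = "created_at", "DESC"
    if sort:
        sort_dir = "DESC" if sort.startswith("-") else "ASC"
        sort_field = sort[1:] if sort.startswith("-") else sort
    if sort_field not in ALLOWED_SORTS:
        raise ValueError(f"Invalid sort field: {sort_field}")

    sql = "SELECT * FROM projects"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += f" ORDER BY {sort_field} {sort_dir}"
    return sql, args
```

A handler could translate the ValueError into a 400 response and pass the returned pair straight to the database driver.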
Error Handling
Return actionable errors with consistent structure:
from fastapi import HTTPException
from fastapi.responses import JSONResponse

@app.exception_handler(HTTPException)
async def http_exception_handler(request, exc):
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": {
                "code": error_code_from_status(exc.status_code),
                "message": exc.detail,
            }
        },
    )

def error_code_from_status(status: int) -> str:
    codes = {
        400: "BAD_REQUEST",
        401: "UNAUTHORIZED",
        403: "FORBIDDEN",
        404: "NOT_FOUND",
        429: "RATE_LIMITED",
    }
    return codes.get(status, "INTERNAL_ERROR")
Idempotency
Make write operations safe to retry:
import json

from fastapi import Header

@app.post("/api/v1/payments")
async def create_payment(
    payment: PaymentCreate,
    idempotency_key: str = Header(..., alias="Idempotency-Key"),
):
    # Check if this key was already processed
    existing = await pool.fetchrow(
        "SELECT response FROM idempotency_keys WHERE key = $1",
        idempotency_key,
    )
    if existing:
        return json.loads(existing["response"])
    # Process the payment
    # Note: two concurrent requests with the same key can both reach this point;
    # a unique constraint on the key column is one way to guard against that.
    result = await process_payment(payment)
    # Store the result for future retries
    await pool.execute(
        "INSERT INTO idempotency_keys (key, response) VALUES ($1, $2)",
        idempotency_key, json.dumps(result),
    )
    return result
Clients send the same Idempotency-Key on retries. The server returns the cached response instead of processing again.
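The retry behavior can be sketched without a database or web framework. In the example below, IdempotencyStore is a hypothetical in-memory stand-in for the idempotency_keys table, and charge is a made-up operation that records how many times it actually ran. The lock is an assumption of this sketch, added so that two concurrent retries cannot both execute the operation.

```python
import json
import threading
from typing import Callable, Dict, List


class IdempotencyStore:
    """In-memory stand-in for the idempotency_keys table (illustrative only)."""

    def __init__(self) -> None:
        self._responses: Dict[str, str] = {}
        self._lock = threading.Lock()

    def run_once(self, key: str, operation: Callable[[], dict]) -> dict:
        """Run operation for a new key; replay the stored response otherwise."""
        # Holding the lock across the operation serializes concurrent retries.
        with self._lock:
            if key in self._responses:
                return json.loads(self._responses[key])
            result = operation()
            self._responses[key] = json.dumps(result)
            return result


calls: List[int] = []


def charge() -> dict:
    """Hypothetical side-effecting operation; counts its own executions."""
    calls.append(1)
    return {"payment_id": "pay_1", "status": "succeeded"}


store = IdempotencyStore()
first = store.run_once("key-123", charge)
retry = store.run_once("key-123", charge)  # replayed, charge() is not called again
```

After both calls, first and retry hold the same response and charge has executed once.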
Common Mistakes
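Mistake 1: Versioning through headers. URL-based versioning (/v1/, /v2/) is visible in logs and address bars, easy to debug, and cacheable without extra configuration. Header-based versioning hides the version from the URL and needs a correct Vary header to cache safely.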
Mistake 2: Sending the whole object on updates. Use PATCH for partial updates. PUT replaces the entire resource. Most clients only change one or two fields.
Mistake 3: No rate limiting. Without limits, one abusive client can take down your API for everyone. Implement per-key limits from the start.
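Per-key limits can start very small. The sketch below shows one common approach, a token bucket kept in process memory. TokenBucketLimiter and its parameters are hypothetical names for this illustration, and a deployment with several API instances would likely need a shared store instead of a local dict.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-key token bucket: bursts up to `capacity`, refilled at `rate` tokens/second."""

    def __init__(self, capacity: int, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so tests can control time
        # key -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        """Return True and consume one token if the key is under its limit."""
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill according to elapsed time, capped at the bucket capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[key] = (tokens, now)
            return False
        self._buckets[key] = (tokens - 1, now)
        return True
```

A request handler would call allow(api_key) first and return a 429 with the RATE_LIMITED error code when it comes back False.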
Takeaways
Scalable API design comes from consistency and predictability. Use a standard response envelope, cursor-based pagination, whitelisted filtering, and idempotent write operations. Version your URLs, return actionable errors, and rate-limit from day one.