Decouple Fast Responses from Heavy Work with the 500ms Rule
Keep all request-response logic under 500ms to prevent the timeouts, connection drops, and user frustration that come with tasks like 500MB CSV uploads or 20-second AI inferences. Return HTTP 202 Accepted immediately after validating inputs and writing a pending status to the database. In e-commerce, for example, confirm the order and payment token up front, then offload inventory sync, PDF generation, and webhooks to background processes. This keeps browsers and Nginx happy and avoids the RAM spikes caused by impatient users resubmitting.
Any task over one second belongs in the background: non-critical telemetry gets raw asyncio with safeguards; audit logs and emails use FastAPI's native BackgroundTasks; CPU-heavy math or image resizing needs multiprocessing to escape the GIL. A minimal sketch of the 202 pattern is shown below.
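Here is one way the 202 flow can look; `save_pending_order` and `enqueue_fulfillment` are hypothetical stand-ins for the fast DB write and the background dispatch:

```python
from fastapi import FastAPI, status

app = FastAPI()

def save_pending_order(order: dict) -> int:
    ...  # hypothetical: validate and INSERT a row with status='pending'
    return 42

def enqueue_fulfillment(order_id: int) -> None:
    ...  # hypothetical: hand off to a queue or background worker

@app.post("/orders", status_code=status.HTTP_202_ACCEPTED)
async def create_order(order: dict):
    # Do only the fast, critical work inside the 500ms budget.
    order_id = save_pending_order(order)
    enqueue_fulfillment(order_id)
    # Respond immediately; inventory sync, PDFs, and webhooks run elsewhere.
    return {"order_id": order_id, "status": "pending"}
```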
Fix Asyncio's GC Trap Using Reference Registries
Raw asyncio.create_task() in FastAPI or similar frameworks risks tasks disappearing mid-execution: the event loop holds only a weak reference to each task, so the garbage collector can reap an unreferenced task before it finishes. Store tasks in a global set() as strong references to keep them alive:
```python
import asyncio

# Strong references keep tasks alive; the event loop only holds weak refs.
running_tasks = set()

async def send_heavy_email(address: str) -> None:
    ...  # stand-in for slow I/O-bound work

def run_in_background(coro):
    task = asyncio.create_task(coro)
    running_tasks.add(task)
    # Self-cleaning: drop the reference once the task finishes.
    task.add_done_callback(running_tasks.discard)

async def handle_request():
    run_in_background(send_heavy_email("dev@example.com"))
    return {"status": "Processing"}
```
Self-cleaning via add_done_callback prevents memory leaks. Reserve this for zero-persistence needs like pings.
Leverage FastAPI BackgroundTasks for Safe, Post-Response Execution
FastAPI's BackgroundTasks runs after the response is sent, sharing server memory but safer than raw asyncio: no GC worries for light tasks. Pass the function and its arguments separately:
```python
from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

def generate_report_pdf(data: dict):
    # Heavy PDF logic
    pass

@app.post("/reports/generate")
async def request_report(data: dict, bg: BackgroundTasks):
    bg.add_task(generate_report_pdf, data)
    return {"message": "Report generation started."}
```
Ideal for logging or notifications, but avoid it when losing work on a server crash is unacceptable: tasks share the web process and have no persistence.
Scale Critical Tasks with Celery's Distributed Queues
For irreplaceable work like invoicing or video encoding, use Celery with a Redis or RabbitMQ broker. Web servers (producers) enqueue messages; separate workers (consumers) process them. The broker persists messages across restarts, enabling horizontal scaling and fault tolerance: even if the web server dies, queued tasks survive.
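A minimal producer-side sketch, assuming a local Redis broker on the default port; `generate_invoice` is an illustrative task name:

```python
from celery import Celery

# The broker stores queued messages, so they survive web-server restarts.
celery_app = Celery("tasks", broker="redis://localhost:6379/0")

@celery_app.task
def generate_invoice(order_id: int) -> None:
    ...  # heavy, irreplaceable work; runs in a separate worker process

# Producer side (e.g. inside a request handler): enqueue and return.
# generate_invoice.delay(order_id=123)
```

Consumers start separately, e.g. `celery -A tasks worker` assuming the module is saved as tasks.py, and scale by adding worker processes on more machines.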
Decision Matrix:
| Method | Persistence | Scalability | Best Use |
|---|---|---|---|
| Asyncio Tasks | Zero | Low | Telemetry, pings |
| FastAPI Native | Zero | Medium | Logs, emails |
| Multiprocessing | Zero | Medium | CPU-bound (GIL escape) |
| Celery + Redis | High | High | Invoicing, migrations |
Checklist: over 1s? Background it. Critical? Celery. CPU-bound? Multiprocessing (see the sketch below). Always hold strong references to asyncio tasks.
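For the CPU-bound branch, one option is a process pool driven from the event loop, so requests keep flowing while GIL-bound work runs in separate processes; `resize_image` is an illustrative placeholder:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def resize_image(path: str) -> str:
    ...  # CPU-heavy work; a separate process sidesteps the GIL
    return path

pool = ProcessPoolExecutor()

async def handle_resize(path: str) -> dict:
    loop = asyncio.get_running_loop()
    # Offload to a worker process; the event loop keeps serving requests.
    result = await loop.run_in_executor(pool, resize_image, path)
    return {"resized": result}
```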