Two-Layer Caching Slashes Rec Latency via Scoped TTLs
Stack a per-user TTL cache (5 min) on top of a global TTL cache (5 min) for the shared interaction matrix and item similarities, avoiding O(n²) rebuilds on every request and serving most hits from memory without DB touches.
Avoid O(n²) Recomputations with Global Cache for Shared Data
Recommendation requests initially rebuilt the full user-item interaction matrix and item similarity graph from scratch per call, using Interaction.all() and pairwise Jaccard similarity over genres. With 30 items, tolerable; at 3,000 items, seconds of latency; at 30,000, paging on-call. Each similarity calc fetched all content, resolved genres via nested await calls to SQLite (ContentGenre.filter then Genre.get_or_none), yielding O(n²) time with DB round-trips per pair.
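A minimal sketch of the original hot path's shape, with hypothetical names; in the real code each pair also triggered awaited ORM lookups, which is where the per-pair DB round-trips came from:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two genre sets: |A ∩ B| / |A ∪ B|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def naive_pairwise(genres_by_item: dict) -> dict:
    """O(n²) pairs; in the original, each pair resolved genres via the DB."""
    items = list(genres_by_item)
    sims = {}
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            sims[(a, b)] = jaccard(genres_by_item[a], genres_by_item[b])
    return sims
```

At 3,000 items that inner loop runs ~4.5 million times per request, which matches the seconds-of-latency cliff described above.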
The global cache fixes this by storing user_items (user-to-items dict), popularity (item interaction counts), and item_similarities (top-10 similar items per content via sorted Jaccard scores >0) in memory, rebuilding only if >300s (5 min) have passed since last_update. The first post-TTL request pays the full cost (_build_interaction_data groups all interactions; _build_item_similarities bulk-loads genres); all others read dicts instantly, with no DB calls or pairwise math.
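The TTL gate can be sketched like this (class and builder names are illustrative stand-ins for _build_interaction_data/_build_item_similarities):

```python
import time

class GlobalRecCache:
    """Rebuild shared recommendation data at most once per TTL window."""

    def __init__(self, build_fn, ttl: float = 300.0):
        self.build_fn = build_fn      # returns (user_items, popularity, item_similarities)
        self.ttl = ttl
        self.last_update = None       # None forces a build on first access
        self.data = None

    def get(self):
        now = time.monotonic()
        if self.last_update is None or now - self.last_update > self.ttl:
            self.data = self.build_fn()   # one request pays the full rebuild cost
            self.last_update = now
        return self.data
```

Every request inside the window reads the same dicts; only the first request after expiry blocks on the rebuild.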
Stack Per-User Cache on Top for Fast Repeat Hits
Per-user cache holds final scored lists (blending 0.4 collaborative filtering via user similarity on user_items, 0.3 content similarity from item_similarities, 0.3 popularity). Check first: if user_id entry <300s old, slice to limit=5 and return without global fetch or scoring.
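The blend itself is a weighted sum; the weights below are the ones stated in the text, the function name is illustrative:

```python
W_COLLAB, W_CONTENT, W_POP = 0.4, 0.3, 0.3  # weights from the text

def hybrid_score(collab: float, content: float, pop: float) -> float:
    """Final ranking score: blend of collaborative, content, and popularity signals."""
    return W_COLLAB * collab + W_CONTENT * content + W_POP * pop
```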
Miss falls to global data, generates candidates (hybrid for interacted users: collaborative + content + popularity; popularity-only for cold starts via popularity_candidates(top N)), scores, ranks, caches result with timestamp. Steady-state: most requests hit per-user layer immediately.
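A sketch of the hit/miss flow, assuming dict-shaped global data and an illustrative hybrid_fn standing in for the candidate generation and scoring:

```python
import time

def recommend(user_id, user_cache, global_data, hybrid_fn, ttl=300.0, limit=5):
    """Per-user layer first; on miss, cold users get popularity, others hybrid."""
    entry = user_cache.get(user_id)
    now = time.monotonic()
    if entry is not None and now - entry["ts"] < ttl:
        return entry["recs"][:limit]            # per-user hit: no scoring at all
    if not global_data["user_items"].get(user_id):
        pop = global_data["popularity"]         # cold start: rank by interaction counts
        recs = sorted(pop, key=pop.get, reverse=True)
    else:
        recs = hybrid_fn(user_id, global_data)  # collaborative + content + popularity
    user_cache[user_id] = {"recs": recs, "ts": now}
    return recs[:limit]
```

Once the entry is cached, later global changes are invisible to this user until the per-user TTL expires or the entry is evicted.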
Surgical Invalidation Matches Data Lifetimes
On record_feedback(user_id, content_id, rating), persist to DB then del self.cache[user_id] only—evicts one entry, forces recompute on next request using current (possibly stale) global data. Global ignores feedback until TTL expiry, accepting 5min staleness since new ratings needn't instantly reshape graph for all users.
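The eviction is a one-key pop. A synchronous sketch (the real code awaits an ORM write; `store` is a stand-in for the DB):

```python
class FeedbackRecorder:
    """Persist feedback, then evict exactly one user's cached recommendations."""

    def __init__(self):
        self.store = []   # stand-in for the DB table
        self.cache = {}   # user_id -> cached recommendation entry

    def record_feedback(self, user_id, content_id, rating):
        self.store.append((user_id, content_id, rating))  # persist first
        self.cache.pop(user_id, None)  # evict only this user; others stay warm
```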
Cold users (user_id not in user_items or empty) route to popularity candidates; first interaction populates user_items, next request (post-eviction) switches to hybrid organically, applying weights for personalized scores.
Production Fixes: Eviction, Bulk Queries, Config
Add LRU or size cap to unbounded per-user self.cache. Replace nested awaits in similarities with bulk build_content_genre_map()-style query to load all genres upfront, compute Jaccard in-memory. Use env vars for TTLs over hardcodes. Trade-offs: per-user stale on feedback until global refresh, but avoids global flushes; popularity fallback ensures viability sans history.
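The bulk-query fix can be sketched as two in-memory passes, assuming rows arrive as (content_id, genre_id) tuples from a single query (e.g. a values_list-style fetch):

```python
from collections import defaultdict

def build_content_genre_map(rows):
    """One bulk query replaces per-pair DB lookups: rows are (content_id, genre_id)."""
    genre_map = defaultdict(set)
    for content_id, genre_id in rows:
        genre_map[content_id].add(genre_id)
    return genre_map

def top_similar(genre_map, k=10):
    """Jaccard entirely in memory; keep only top-k positive scores per item."""
    sims = defaultdict(list)
    items = list(genre_map)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            inter = genre_map[a] & genre_map[b]
            if not inter:
                continue  # score > 0 only, matching the global cache contents
            score = len(inter) / len(genre_map[a] | genre_map[b])
            sims[a].append((b, score))
            sims[b].append((a, score))
    return {c: sorted(s, key=lambda x: -x[1])[:k] for c, s in sims.items()}
```

The pairwise loop is still O(n²), but with zero DB round-trips inside it; the n² cost is paid once per TTL window instead of per request.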