Context Engineering: AI's New Literacy Over Prompts
Replace prompt engineering with context engineering: build modular files (identity.md, voice.md, current-projects.md) plus a routing file that front-loads critical information, so the AI sidesteps U-shaped attention loss and attention sinks and delivers consistent, intelligent output every session.
AI's Context Limitations Demand Engineering Over Prompting
Language models suffer from a U-shaped performance curve on long inputs: they prioritize the start (primacy bias) and the end (recency bias) while ignoring the middle, as shown by Liu et al. (2023) and a 2025 study linking the effect to training data. Humans exhibit the same primacy-recency effect in memory. Attention is zero-sum: irrelevant tokens act as an 'attention sink,' diluting focus on key information (per 2023 research). Bigger windows, like Claude's 1M tokens, amplify the problem: too little context leaves the AI ignorant; too much drowns it in noise. Prompt engineering tweaks one interaction; context engineering structures the entire environment (files, rules, identity) for every interaction, turning random chats into a self-running 'lab.'
Andrej Karpathy (ex-OpenAI) calls it 'filling the context window with just the right information.' Anthropic's guide confirms the balance is non-trivial. Unstructured blobs force AI to infer relevance; modular setups ensure it starts from intelligence, not ignorance.
Dexter Protocol: 5 Rules to Bulletproof Context
Counter these flaws with five rules, inspired by Dexter's organized lab versus Dee Dee's chaos:
- Label buttons: Every file needs a header stating its purpose, when to load it, and how to use it (e.g., '# VOICE PROFILE — ROBOTS ATE MY HOMEWORK ## Purpose: Load for ALL writing tasks'). Front-load core rules in the first 10-20 lines, put details in the middle, and constraints last. This spares the AI from parsing unstructured streams like '800 words of stream-of-consciousness.'
- Lock doors: Modularize to contain damage: separate files for voice (300 lines max, cut from 1,200), brand, strategy, and projects. Load only the relevant files per task; since attention is zero-sum, less noise means higher quality.
- Front-load the formula: Place non-negotiable rules (e.g., 'Never use em dashes') in the first 10 lines, not buried at line 847. The U-shaped attention curve favors the start and end.
- Modules over monoliths: Keep each file under a few hundred lines: identity.md (always load, <200 lines, who/what/expertise); voice.md (writing tasks only); current-projects.md (work only: decisions and next actions). No 3,000-word prompts.
- Lab runs itself: Use a routing file (router.md or SKILL.md, <50 lines) as an index: always loaded, it directs loading by task ('writing → identity + voice'; 'strategy → identity + projects'; unclear → identity + a clarifying question). This enables progressive disclosure: a small token cost that prevents overload.
Together these rules cut per-session setup from 20 minutes to zero and eliminate drift (e.g., the AI nailing the first three paragraphs and then falling apart).
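Rules 1-3 can be sketched as a single file header. The profile name comes from the article's example; the individual rules below it are illustrative placeholders, not prescribed content:

```markdown
# VOICE PROFILE — ROBOTS ATE MY HOMEWORK
## Purpose: Load for ALL writing tasks

## Core rules (first 10-20 lines, non-negotiable)
- Never use em dashes.
- Short sentences. Active voice.

## Details (middle)
- Tone references, example passages, vocabulary notes...

## Constraints (last)
- Do NOT use corporate jargon.
- Do NOT exceed the requested word count.
```

Every section is scannable, the make-or-break rules sit in the primacy zone, and the 'do NOT' block sits in the recency zone.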
3-File Starter + Prompts: 80% Gains in One Afternoon
Audit first (Prompt 1): Have the AI analyze your chat history for repeated, missing, or wasted context and positioning issues; it outputs a prioritized file list.
Build modules (Prompt 2): Feed raw notes into AI to generate:
- identity.md (<200 lines, front-load top 20).
- voice.md (rules/examples/constraints).
- current-projects.md (decisions, next actions, deadlines). Each file gets a header, scannable sections, and a 'do NOT' constraints block at the end.
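One concrete shape for the first module (structure only; the section names are illustrative, not prescribed):

```markdown
# IDENTITY — ALWAYS LOAD
## Purpose: Who I am, what I do, my expertise (keep under 200 lines)

## Top 20 lines: the essentials
- Name, role, audience, core offer
- The 3-5 facts the AI must never get wrong

## Details
- Background, expertise areas, recurring collaborators...

## Do NOT
- Do NOT invent credentials or projects not listed above.
```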
Route it (Prompt 3): Generate router.md listing the files, the task-routing logic, and a context check to run before each task.
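A minimal router.md might look like this, following the routing logic above (the wording is a sketch, not a fixed template):

```markdown
# ROUTER — ALWAYS LOADED
## Purpose: Index of context files; directs what to load per task

- Writing task → load identity.md + voice.md
- Strategy task → load identity.md + current-projects.md
- Unclear task → load identity.md, then ask a clarifying question

## Context check
Before starting any task, confirm the loaded files match the task type.
```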
Paste all four files into Claude Projects, a custom GPT, or Cursor. Test by asking the AI 'What do you know about me, my voice, and my current project?' and fix any gaps it reveals. Maintenance is required: update the files as projects shift. This doesn't replace strategy or taste; it amplifies good thinking. Next: layer skills (task workflows) on top for repeatable jobs.
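If you drive a model via API rather than a chat UI, the router's progressive-disclosure logic can be mimicked in a few lines. This is a sketch, not part of the article's protocol; the file names follow the article, while the task keywords and the `build_context` helper are hypothetical:

```python
# Progressive disclosure: load only the context files relevant to the
# task type, mirroring router.md. Unknown tasks fall back to identity
# only, matching the 'unclear → identity + clarify' route.

ROUTES = {
    "writing": ["identity.md", "voice.md"],
    "strategy": ["identity.md", "current-projects.md"],
}
ALWAYS = ["router.md"]  # the index itself stays small (<50 lines)

def files_for(task_type: str) -> list[str]:
    """Return the context files to load for this task type."""
    return ALWAYS + ROUTES.get(task_type, ["identity.md"])

def build_context(task_type: str, read=lambda p: f"<contents of {p}>") -> str:
    """Concatenate routed files so they sit at the front of the prompt."""
    return "\n\n".join(read(p) for p in files_for(task_type))

print(files_for("writing"))  # ['router.md', 'identity.md', 'voice.md']
```

Keeping the routed set small is the point: fewer irrelevant tokens means less attention-sink dilution per the zero-sum rule above.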
Limits: it won't fix bad inputs, and files go stale without updates. Outcome: the AI executes your strategy faster, applies your taste to its outputs, and needs no re-teaching each session.