Agentic Engineering: AI as Junior Dev via Context & RPI Loop
Treat coding agents as fast junior developers who lack judgment: master context engineering and the research-plan-implement workflow to capture 30%+ time savings without sacrificing quality.
Mental Model: AI Agents as Enthusiastic Junior Developers
Brendan O'Leary reframes coding agents not as autocomplete tools but as collaborators akin to junior engineers. They have evolved from early-2020s line-finishers into 2026 executors that break down tasks, edit files, run tests, and create PRs. This shift demands treating them as "energetic enthusiastic extremely well-read often confidently wrong junior developers": fast, tireless, and ego-free, with vast knowledge across languages and frameworks, but lacking business judgment and architectural context.
Armin Ronacher, creator of Flask, gained more than 30% of his daily time by deliberately directing handoffs: "we're no longer just using machines, we're now working with them." O'Leary stresses articulating your workflow (what to hand off versus what to keep) to close the gap where roughly 90% of engineers use AI but few get full value from it. Blind acceptance yields code that is "technically correct and contextually wrong"; active direction amplifies human thinking.
Quote: "think about your AI agent as an energetic enthusiastic extremely well-read often confidently wrong junior developer" (O'Leary's core mental model, explaining why agents excel at speed/breadth but fail on nuance, urging judgment as the human edge).
Context Engineering: The Art of Selective, Isolated Inputs
Context is the linchpin: it is expensive (token costs compound), degradable (output quality drops sharply past roughly 50% window fill), and poisonable (bad, outdated, or mixed inputs corrupt outputs). MCP servers that auto-load context push models into the "dumb zone." Solutions: persist context externally (scratchpads, agents.md), select only relevant slices (file @mentions, disabling unneeded MCP servers), summarize and trim after deep dives, and isolate work via new sessions or parallel agents.
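The select/summarize/isolate pattern can be sketched as a tiny budget check. Everything here (the `fit_context` name, the word-count stand-in for a tokenizer, the stub summary string) is illustrative, not from the talk:

```python
# Hypothetical helper: keep an agent's context under a token budget by
# evicting the oldest messages and replacing them with a single summary stub.
def fit_context(messages, budget, n_tokens=lambda m: len(m.split())):
    """Drop oldest messages until the rest fit; mark what was summarized away."""
    kept, dropped = list(messages), []
    while kept and sum(n_tokens(m) for m in kept) > budget:
        dropped.append(kept.pop(0))          # evict oldest context first
    if dropped:
        # In a real workflow the agent would write this summary itself.
        kept.insert(0, f"[summary of {len(dropped)} earlier messages]")
    return kept
```

In practice the summary stub would be replaced by an agent-written recap, and `n_tokens` by the model's real tokenizer; the point is that trimming plus a summary beats letting the window fill past the degradation threshold.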
O'Leary's intern anecdote illustrates the point: he wireframed an iPad patient-history app in Balsamiq (Comic Sans, emoji placeholders) and handed it to interns, who built a literal copy of the mockup. The fault was poor context curation, not the juniors. The same holds for agents: "not giving the right context... what's important what's not."
Habits: one task per session; monitor the context meter; if the agent goes off the rails, restart with an agent-written summary as the opening prompt. Karpathy: "context engineering is a delicate art and science." This enables clean task separation, mirroring how you would manage a junior engineer.
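A restart-with-summary prompt might look like the following (an illustrative template, not O'Leary's wording):

```markdown
Summarize this session so I can restart cleanly. Include:
- The original task and what is already complete
- Files read or changed, and why
- Decisions made and their rationale
- The single next step
Keep it under 200 words; it will be the first message of a new session.
```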
Quote: "more context doesn't always mean better results... it can make the model actually dumber" (Highlights quality-cost tradeoffs, why selective isolation beats dumping everything).
Research-Plan-Implement Workflow: Leverage Human Thinking Upfront
Avoid the "help me implement X" pitfall: jumping straight to code bakes in wrong assumptions, wastes time, and breeds anti-AI sentiment. Instead, use the research-plan-implement (RPI) loop:
- Research (ask mode): A chat-only mode that does not execute changes (Kilo's "ask mode" can optionally read files). Understand the codebase, data flow, paradigms, and edge cases; brainstorm approaches. Output: a reviewable research doc that aligns human and AI understanding.
- Plan: Explicit steps: files to touch or create, verification tests, and what is in and out of scope. Output: a step-by-step plan.md (a common convention in repos). Cheaper models often suffice here.
- Implement: Start a new session seeded with the plan only. Keep context low and commit to Git frequently (O'Leary's GitLab bias: local Git as the "first PR review"). A human reviews each change.
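The Plan phase's output might look like the following plan.md skeleton (the feature, filenames, and steps are invented for illustration):

```markdown
# Plan: add rate limiting to the API

## Scope
- In: middleware, config, unit tests
- Out: dashboard UI, deployment changes

## Steps
1. Create `middleware/rate_limit.py` with a token-bucket limiter
2. Wire it into `app.py` behind a feature flag
3. Add unit tests in `tests/test_rate_limit.py`

## Verification
- `pytest tests/test_rate_limit.py` passes
- Manual check: bursts beyond the limit return HTTP 429
```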
Human leverage is highest in research and planning; per Dexory, "a bad line of research can potentially be hundreds of lines of bad code," and "AI can't replace thinking it can only amplify the thinking you've done." This skips demo-style code-spew; see path.lo.ai for patterns.
| Phase | Goal | Tools/Outputs | Human Role |
|---|---|---|---|
| Research | Understand system | Ask mode → research doc | Review/align |
| Plan | Outline changes | Plan.md w/ steps/tests/scope | High-leverage thinking |
| Implement | Execute | Code mode + Git commits | Approve/iterate |
Quote: "AI can't replace thinking it can only amplify the thinking you've done or the lack of thinking you haven't done" (Dexory via O'Leary; justifies RPI's upfront investment for reliable execution).
Agent Configuration: Modes, Rules, and Custom Playbooks
Tailor agents via modes (Kilo's ask/code/architect for role focus), workspace rules (build and test commands, testing requirements), and tunable autonomy (auto-approve reads and tests? parallel agents? worktrees?). Two configuration buckets:
- agents.md: The de facto standard; an always-loaded README for agents covering conventions, commands, and requirements.
- skills.md: On-demand playbooks (e.g., changelog generation, motion graphics); reusable workflows loaded only when needed.
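An agents.md might look like this sketch (commands and conventions are placeholders for your project's own):

```markdown
# agents.md

## Commands
- Build: npm run build
- Test: npm test
- Lint: npm run lint

## Conventions
- TypeScript strict mode; no `any`
- New code requires unit tests before a PR
- Conventional Commits for messages

## Boundaries
- Never edit files under vendor/ or dist/
- Ask before adding dependencies
```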
Power tips (Kilo/VS Code): @mention files and commits, use /commands (new task, condense context), and right-click selected code for agent actions. Tune as you learn; start conservative.
Grow comfort iteratively: begin with low autonomy and expand it as trust builds. Use Git as the safety net before anything reaches a PR.
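One way to get that safety net is a commit per agent step on a throwaway branch; a self-contained sketch of the idea (the repo, filenames, and messages are illustrative):

```shell
# Sketch: local Git as the "first PR review" while an agent implements a plan.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q && git config user.email a@example.com && git config user.name demo

git checkout -q -b agent/task          # isolate the agent's work on a branch
echo "step 1" > feature.txt            # ...agent applies plan step 1...
git add feature.txt
git commit -q -m "step 1: scaffold feature (per plan.md)"

echo "step 2" >> feature.txt           # ...agent applies plan step 2...
git commit -q -am "step 2: implement logic"

git log --oneline                      # review each step like a mini PR
```

Committing after every agent change keeps each step individually reviewable and revertible before a real PR is opened.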
Quote: "a bad line of research can potentially be hundreds of lines of bad code" (Dexory; underscores why specialized modes/rules prevent implementation disasters).
Key Takeaways
- Adopt junior dev mental model: Hand off grunt work, retain judgment/context.
- Keep the context window under ~50% full; persist, select, summarize, and isolate to cut cost and degradation.
- RPI loop: Spend human time on research/plan for 30%+ gains; implement in fresh, low-context sessions.
- One task/session; restart with agent summaries if derailed.
- Mandate agents.md (rules/conventions); use skills.md for repeats.
- Frequent local Git commits as agent "PR review."
- Modes limit scope: Ask for research, code for execution.
- Tune autonomy gradually; @mentions and /commands accelerate work.
- Check path.lo.ai for workflows; avoid code-first prompts.