SPDD: Scale LLM Coding to Teams via Structured Prompts

Structured Prompt-Driven Development (SPDD) treats prompts as versioned artifacts using a REASONS canvas and workflow to make AI-generated code governable, reviewable, and reusable across teams.

Prompts as First-Class Artifacts to Bridge Individual and Team Gains

AI coding assistants boost individual developer speed, but teams face friction: ambiguous requirements turn into scaled misunderstandings, harder reviews, integration issues, and production risks. Thoughtworks' internal IT teams developed Structured Prompt-Driven Development (SPDD) to make LLM-assisted changes governable at scale. Instead of ad hoc chats, SPDD elevates prompts to version-controlled assets alongside code, capturing requirements, domain models, design intent, constraints, and tasks. Reviewers then validate a single artifact rather than scattered chat logs or raw diffs.

The core problem: local speed (the "Ferrari engine") doesn't fix systemic bottlenecks like poor alignment. SPDD rejects freeform prompting in favor of a structured approach, drawing on spec-driven development but treating prompts as living specs that co-evolve with code. When code diverges, update the prompt first, enforcing a closed loop in which feedback refines intent before implementation.

"It's like buying a Ferrari and driving it on muddy roads: the engine is powerful, but your arrival time is determined by road conditions and traffic." This analogy from the authors highlights why individual AI wins fail organizationally without governance.

REASONS Canvas: Abstract Intent Before Concrete Execution

SPDD's foundation is the REASONS Canvas, a seven-part prompt structure forcing clarity on intent, design, execution, and governance before code generation.

| Section | Focus | Why It Matters |
| --- | --- | --- |
| R - Requirements | Problem, Definition of Done | Aligns on business value and success metrics. |
| E - Entities | Domain model, relationships | Grounds work in shared domain language. |
| A - Approach | High-level strategy | Sets solution direction with trade-offs. |
| S - Structure | System fit, components, dependencies | Ensures architectural consistency. |
| O - Operations | Task breakdown, testable steps | Makes execution concrete and verifiable. |
| N - Norms | Naming, observability, coding standards | Enforces team conventions. |
| S - Safeguards | Invariants, performance limits, security | Prevents regressions. |

Abstract sections (R, E, A, S) capture design before specifics; the execution section (O) follows; governance sections (N, S) bound the output. This shifts uncertainty left and compounds expertise across iterations into reusable prompt libraries. Reviewers validate one canvas, not code alone.

Decision chain: teams weighed ad hoc against structured prompts and chose REASONS because it balances expressiveness with consistency: too vague risks hallucination; too rigid stifles creativity. Trade-off: upfront canvas time (10-30 minutes) pays off in predictable generations and fewer review cycles.
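Because the canvas is a versioned artifact, its completeness can be gated mechanically before commit. A minimal sketch, assuming canvases are markdown files with one `##` heading per REASONS section (the heading convention and `missing_sections` helper are hypothetical, not part of openspdd):

```python
# Hypothetical sketch: check that a prompt file carries all seven
# REASONS sections before it is committed alongside the code.
REASONS_SECTIONS = [
    "Requirements", "Entities", "Approach", "Structure",
    "Operations", "Norms", "Safeguards",
]

def missing_sections(canvas_text: str) -> list[str]:
    """Return the REASONS sections absent from a canvas document."""
    return [s for s in REASONS_SECTIONS if f"## {s}" not in canvas_text]

canvas = """\
## Requirements
Support model-aware billing; DoD: four acceptance scenarios pass.
## Entities
Usage, Plan, ModelRate.
## Approach
Strategy pattern per plan.
## Structure
PlanStrategyFactory routes by plan type.
## Operations
1. Validate modelId. 2. Compute charge. 3. Return invoice line.
## Norms
Team naming conventions; structured logging.
"""
print(missing_sections(canvas))  # ['Safeguards'] — this canvas fails the gate
```

A check like this could run as a pre-commit hook, so an incomplete canvas never reaches review.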

Closed-Loop Workflow Powered by openspdd CLI

SPDD integrates prompts into git workflows via openspdd, a CLI tool with commands enforcing discipline:

| Command | Purpose | Key Benefit |
| --- | --- | --- |
| spdd-story | Split requirements into INVEST stories | Manages large epics. |
| spdd-analysis | Extract domain keywords, scan code, analyze risks | Contextualizes without a full codebase dump. |
| spdd-reasons-canvas | Build the full canvas from analysis | Generates an executable blueprint. |
| spdd-generate | Produce code task-by-task per canvas | Bounded, reproducible output. |
| spdd-api-test | Curl-based E2E tests | Verifies acceptance criteria (ACs). |
| spdd-prompt-update | Evolve the canvas on requirement changes | Requirement → prompt → code. |
| spdd-sync | Back-propagate code changes to the canvas | Code → prompt sync. |

Workflow: requirements → analysis → canvas → code → tests → review → commit. The rule: on divergence, fix the prompt first. This creates short feedback loops within an iteration and cumulative context across iterations, turning prompts into a library.
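The closed loop above can be sketched as a control flow. The step names mirror the openspdd commands from the table; the orchestration function itself and the `run` callback are illustrative assumptions, not part of the tool:

```python
# Hypothetical orchestration sketch of the SPDD closed loop.
# `run(command, input)` stands in for invoking an openspdd command.

def spdd_iteration(requirements, run):
    stories = run("spdd-story", requirements)       # split into INVEST stories
    analysis = run("spdd-analysis", stories)        # domain keywords, risks
    canvas = run("spdd-reasons-canvas", analysis)   # executable blueprint
    code = run("spdd-generate", canvas)             # task-by-task generation
    while not run("spdd-api-test", code):           # verify acceptance criteria
        # Rule: on divergence, evolve the prompt first, then regenerate.
        canvas = run("spdd-prompt-update", canvas)
        code = run("spdd-generate", canvas)
    return canvas, code                             # both are committed together
```

The key structural point is the loop body: failing tests never trigger a direct code patch; they route back through the canvas so intent and implementation stay in sync.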

Compared to spec-driven development, SPDD adds governance via versioned prompts and sync mechanisms. Trade-off: the tooling is overhead for small changes (skip it for trivial edits); it shines on enhancements where context matters.

"When reality diverges, fix the prompt first — then update the code." This rule from the workflow prevents intent drift, making SPDD a true closed loop.

Billing Engine Enhancement: From Static to Dynamic Pricing

Example: Enhance a token-based LLM billing engine (GitHub: token-billing, iteration-1 baseline) for model-aware, multi-plan billing.

Before: a single global rate and one quota for all users.

Opportunity: user feedback demands model-specific rates (e.g., fast-model at $0.01/1K tokens, reasoning-model at $0.03/1K), a Standard plan (quota plus overage), and a Premium plan (no quota, split prompt/completion billing).

Options considered: a monolithic if-else vs. extensible patterns. The team rejected the tight coupling and chose Strategy/Factory for plans, respecting the Interface Segregation and Single Responsibility principles (ISP/SRP).
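A minimal sketch of that Strategy/Factory shape, mirroring the article's `PlanStrategyFactory` and per-plan strategy names but not the repo's actual code. The model rates come from the article; the Premium rule that completion tokens bill at twice the model rate is an assumption chosen to reproduce the article's $1.50 worked example:

```python
# Illustrative Strategy/Factory sketch for plan-aware billing.
# ASSUMPTION: Premium bills completion tokens at 2x the model rate.
from dataclasses import dataclass

MODEL_RATES = {"fast-model": 0.01, "reasoning-model": 0.03}  # $ per 1K tokens

@dataclass
class Usage:
    model_id: str
    prompt_tokens: int
    completion_tokens: int

    @property
    def total(self) -> int:
        return self.prompt_tokens + self.completion_tokens

class StandardStrategy:
    """Quota plus overage: only tokens beyond the quota are charged."""
    def __init__(self, quota_tokens: int, used_tokens: int):
        self.quota, self.used = quota_tokens, used_tokens

    def charge(self, usage: Usage) -> float:
        rate = MODEL_RATES[usage.model_id]
        overage = max(0, self.used + usage.total - self.quota)
        return round(overage / 1000 * rate, 2)

class PremiumStrategy:
    """No quota; prompt and completion tokens are billed separately."""
    def charge(self, usage: Usage) -> float:
        rate = MODEL_RATES[usage.model_id]
        return round((usage.prompt_tokens * rate
                      + usage.completion_tokens * rate * 2) / 1000, 2)

def plan_strategy_factory(plan: str, **kwargs):
    """Route by plan name, echoing the article's PlanStrategyFactory."""
    if plan == "standard":
        return StandardStrategy(**kwargs)
    if plan == "premium":
        return PremiumStrategy()
    raise ValueError(f"unknown plan: {plan}")

# Standard AC: 100K quota, 90K already used, 30K fast-model -> $0.20 overage
std = plan_strategy_factory("standard", quota_tokens=100_000, used_tokens=90_000)
print(std.charge(Usage("fast-model", 15_000, 15_000)))        # 0.2

# Premium AC: 10K prompt + 20K completion on reasoning-model -> $1.50
prem = plan_strategy_factory("premium")
print(prem.charge(Usage("reasoning-model", 10_000, 20_000)))  # 1.5
```

Adding a new plan means adding one strategy class and one factory branch, which is the extensibility argument that won over the monolithic if-else.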

Step chain:

  1. /spdd-story on enhancement idea → Two stories (Standard + Premium), consolidated to one with Given/When/Then ACs (e.g., Standard overage: 100K quota, 90K used, 30K fast-model → $0.20 charge).
  2. Manual clarify: Core logic (routing by plan), scope (calc only, no CRUD), DoD (4 scenarios).
  3. /spdd-analysis → Domain concepts (new: modelId, plans), risks (edge cases like negative tokens), strategy (Strategy pattern).
    • Review: Aligned on OOP principles; surfaced extra edges (e.g., unknown models → 404).
  4. /spdd-reasons-canvas → Full prompt with REASONS.
  5. /spdd-generate → Code: Added modelId validation, ModelRateRepository, PlanStrategyFactory, Standard/PremiumStrategy impls.
  6. /spdd-api-test → Curl tests for ACs.
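One of the edge cases the analysis review surfaced, unknown models mapping to a 404, can be sketched as a validation step that runs before any charge is computed. The function and error shape here are illustrative assumptions, not the repo's actual API:

```python
# Illustrative sketch of the modelId validation edge case:
# unknown models are rejected with a 404 before billing logic runs.
KNOWN_MODELS = {"fast-model", "reasoning-model"}

def validate_model(model_id: str) -> tuple[int, str]:
    """Return an (http_status, message) pair for a billing request's model."""
    if model_id not in KNOWN_MODELS:
        return 404, f"unknown model: {model_id}"
    return 200, "ok"

print(validate_model("fast-model"))  # (200, 'ok')
print(validate_model("typo-model"))  # (404, 'unknown model: typo-model')
```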

Results: the API now handles modelId, dynamic rates, and plan-specific logic. Quota exhausts correctly, and Premium bills prompt and completion tokens separately (e.g., 10K prompt + 20K completion on the reasoning model → $1.50). The design is extensible for future plans.

Trade-offs: Factory adds indirection (minor perf hit, justified by extensibility); analysis review caught risks early.

Repo diffs show full artifacts: prompts, code, tests. Replicable in ~1 hour.

"The AI's analysis largely aligned with our architectural intent; in fact, its considerations were even more comprehensive than ours in certain areas." Review insight shows AI augmenting human foresight.

Three Core Skills: Alignment, Abstraction-First, Iterative Review

Effectiveness demands:

  • Alignment: Review analysis/canvas against human understanding; catch misalignments early.
  • Abstraction-first: Define intent/design before ops; prevents premature optimization.
  • Iterative review: Treat prompts as code—peer review, refine on divergence.

These counter LLM non-determinism, turning variability into strength via governance.

"Reviews move away from 'spot the bug' toward 'check the intent.'" Captures SPDD's review shift.

Fitness and Trade-offs: Not for Every Change

Assess fit: SPDD suits high-context enhancements, teams new to AI (it builds discipline), and domains with reusable patterns. Skip it for one-liners.

Trade-offs:

  • Pros: Consistency, reuse, safer scaling.
  • Cons: Prompt overhead (5-20% more upfront), learning curve, tool dependency.

Fits AI-First Software Delivery; breaks "expert-only" barrier by codifying expertise.

Key Takeaways

  • Treat prompts as git-tracked artifacts to scale AI beyond solo devs.
  • Use REASONS Canvas: Abstract (REAS) → Execute (O) → Govern (NS).
  • Enforce 'fix prompt first' on divergence for closed-loop evolution.
  • Leverage openspdd CLI for workflow: analysis → canvas → generate → sync.
  • Review at analysis/canvas stages; abstraction-first uncovers edges.
  • Ideal for enhancements: e.g., add model-aware billing via Strategy pattern.
  • Builds prompt libraries compounding team knowledge.
  • Trade-off: Upfront structure for downstream predictability.
  • Core skills: Align intents, abstract before code, iterate reviews.

Summarized by x-ai/grok-4.1-fast via openrouter

