Replace Dumb Loops with LLM-Judged Persistence
Cursor's /goal builds on rough-loop-style automation but replaces a fixed iteration count with an LLM judge that checks whether the goal is actually met after each agent run. Enable it with /features enable goal, then start with /goal "migrate JS to TS, verify visuals with Playwright". The agent works autonomously (e.g., nine hours overnight on a migration), can be paused or cleared with /goal pause or /goal clear, and between runs receives context like "Continuing toward goal: take next steps or explicitly state complete." This fixes agents declaring victory early on tasks like fixing all of a repo's tests, which they often abandon as "done" after 10-15 minutes. Hermes' persist goal mirrors the feature. Compared to rough-loop (fixed max iterations) or auto-research loops, /goal handles ambiguous goals like "cut Docker image size 60%" by exploring approaches incrementally. The key is the judge prompt, which demands "no proxy signals as completion—audit shows objective achieved, no work remains," so the agent can only mark itself complete when an audit confirms nothing is left to do.
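The pattern above can be sketched in a few lines of Python. This is not Cursor's implementation; `run_agent` and `judge_completion` are hypothetical stand-ins for the agent run and the LLM judge, with the judge stubbed to approve after three runs.

```python
# Sketch of the LLM-judged persistence pattern: run the agent, ask a
# judge whether the goal is truly met, and re-prompt until it is.
from dataclasses import dataclass, field

# The judge prompt from the article: reject proxy signals of completion.
JUDGE_PROMPT = (
    "Do not accept proxy signals as completion. Answer COMPLETE only if "
    "an audit shows the objective is achieved and no work remains."
)

@dataclass
class GoalRun:
    goal: str
    transcript: list = field(default_factory=list)

def run_agent(goal: str, context: str) -> str:
    # Placeholder: one autonomous agent run toward the goal.
    return f"worked on: {goal}"

def judge_completion(goal: str, transcript: list) -> bool:
    # Placeholder: an LLM judge applying JUDGE_PROMPT to the run so far.
    return len(transcript) >= 3  # stub: judge approves after three runs

def pursue_goal(goal: str, max_runs: int = 50) -> GoalRun:
    run = GoalRun(goal)
    context = goal
    for _ in range(max_runs):
        run.transcript.append(run_agent(goal, context))
        if judge_completion(goal, run.transcript):
            break
        # Re-prompt instead of stopping at the first "victory" claim.
        context = ("Continuing toward goal: take next steps "
                   "or explicitly state complete.")
    return run

result = pursue_goal("migrate JS to TS, verify visuals with Playwright")
```

The judge, not the agent, decides when to stop; swapping the stub for a real model call is the whole feature.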
Craft Prompts with Quantifiable 'Done' and Alignment
A good goal is bigger than one prompt but smaller than a backlog: specify the achievement, the constraints, how to validate, and when to stop. Examples: "Migrate the stack, keeping screens identical (verify with Playwright)"; "Optimize the prompts until the eval score hits the target, running evals after every change"; "Find 20 new issues: reproduce, fix, open a branch PR, and log each to the run/ folder." Avoid fuzzy goals like "fix everything"—agents either quit early or spiral. Before starting, chat with the agent for alignment (project context, known bad UX, past bugs); Vincent ran OpenClaw this way for three days, 30 rounds, and a gazillion tokens. For prototypes, reference a PRD.md, create milestone tests, and include reference screenshots. Quantify everything: 20 issues, a target score, visual matches.
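A hypothetical goal prompt following that structure (achievement, constraints, validation, stop conditions) might read like this; the file name and test details are illustrative, not prescribed:

```markdown
Goal: Migrate the web client from JavaScript to TypeScript.

Constraints:
- Every screen must look identical before and after (no UI regressions).
- No new runtime dependencies.

Validation:
- After each batch of files, run the Playwright visual tests and the
  type checker; both must pass before moving on.

Stop conditions:
- Complete when `tsc --noEmit` passes on the whole repo and all
  Playwright screenshots match the reference set.
- Stop and report if a screen cannot be matched after three attempts.
```

Note that every line gives the judge something checkable, which is what keeps the agent from declaring victory on vibes.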
Tools and Extensions for Reliable Execution
npx goal-buddy generates a goal.md (describing the request, constraints, and stop conditions) plus a state.yaml (tracking tasks); running /goal @goal.md has yielded complete games (e.g., a Rain-type game with image-generated assets). Side chats let you fork the conversation mid-goal. A workshop at aibuilderclub.com teaches more.
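As an illustration of the task-tracking side, a state.yaml for the game example might look like the sketch below; goal-buddy's actual schema is not documented here, so every field name is an assumption:

```yaml
# Hypothetical state.yaml sketch; goal-buddy's real schema may differ.
goal: "Build a Rain-type game with image-gen assets"
status: in_progress
tasks:
  - id: 1
    name: "Scaffold game loop and canvas"
    done: true
  - id: 2
    name: "Generate falling-object sprites with image-gen"
    done: true
  - id: 3
    name: "Add scoring and game-over screen"
    done: false
```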
Missions for Week+/Month+ Horizons
/goal tops out at hour-scale horizons; it fails on weeks-long work like SEO or ROAS optimization, where feedback is slow. For those, use /mission: a mission.md defines the metrics, and the agent hypothesizes and tests (e.g., to grow a Twitter account to 10k followers: post founder-voice threads, analyze performance, schedule the next batch—on cycles of hours to weeks). A human stays in the loop for big changes. Crewlet (crewlet.io) is in closed beta; it iterated tweets from average to high-engagement by doubling down on winners.
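A mission.md for the Twitter example might be structured as below. This is a hypothetical sketch—Crewlet's actual file format is not shown in this article—but it captures the pieces named above: metrics, a hypothesize-test loop, and a human-in-the-loop gate:

```markdown
Mission: Grow the Twitter account to 10k followers.

Metrics:
- Follower count (checked weekly)
- Engagement rate per thread

Loop:
1. Draft founder-voice threads and queue them.
2. Post on schedule; wait for performance data.
3. Analyze results; double down on winning formats, drop losers.

Human-in-loop:
- Any change to posting cadence or account settings needs approval.
```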