Caveman Plugin Saves No Tokens in Code Gen Tasks

Caveman shortens Claude's output text by ~75% in chat scenarios but delivered 0% token savings in a code-implementation benchmark: thinking (Opus, high effort) and code generation dominate costs, and both runs consumed 4% of the plan's usage.

Caveman Shortens Communication, Not Core Costs

Caveman is a Claude Code plugin, invoked via /caveman, that compresses model outputs into terse, comma-separated phrases (e.g., "Plan enum service form request" instead of full sentences). The README claims 65% token cuts; a Reddit post shows a 75% reduction on single phrases in chat scenarios. It installs with a single command in Claude Code, with no extra configuration. Outputs stay understandable while cutting wordiness, which could be useful in discussion-heavy sessions where back-and-forth eats tokens.

However, token bills stem mainly from internal reasoning (e.g., Opus 4.7 high-effort thinking) and code generation, not from terminal communication. As Reddit users put it: "It's not the prompts that cost the money, it's the thinking," and the plugin "optimizes the cheapest part of the bill."
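A back-of-envelope sketch makes the point concrete. The token splits below are assumed for illustration (not measured data): when thinking and generated code dominate, compressing only the visible chatter barely moves the total bill.

```python
# Assumed token splits for a hypothetical autonomous code-gen session;
# only the visible "chatter" share is compressible by Caveman.

def total_tokens(thinking, code, chatter, chatter_reduction=0.0):
    """Total billed output tokens; only chatter shrinks with compression."""
    return thinking + code + chatter * (1 - chatter_reduction)

baseline = total_tokens(thinking=40_000, code=15_000, chatter=2_000)
with_caveman = total_tokens(thinking=40_000, code=15_000, chatter=2_000,
                            chatter_reduction=0.75)

savings = 1 - with_caveman / baseline
print(f"{savings:.1%}")  # → 2.6% overall, despite a 75% cut to chatter
```

Under these assumptions, a 75% compression of the smallest cost bucket yields under 3% total savings, which matches the benchmark's null result.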

Benchmark Reveals Negligible Impact on Code Tasks

Tested identical prompts from project-description.md (an API implementation taking 3-10 minutes) on the Anthropic $100 plan:

| Metric | Without Caveman | With Caveman |
| --- | --- | --- |
| Start usage | 13% | 13% (new session) |
| End usage | 17% (4% delta) | 21% (4% delta) |
| Time | 3 min | 4 min |
| Result | Full green tests | Full green tests |

No savings despite shorter plans and status updates (e.g., "fix tests"). Communication is minimal in autonomous code generation, so the plugin adds no value there. Chat sessions might see ~30% cuts per some reports, which is useful but not transformative.
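The contrast between chat and autonomous code gen can be sketched the same way. The splits below are hypothetical, chosen to reflect the ~30% chat-side reports and the near-zero code-gen result above:

```python
# Hypothetical token splits: "compressible" = visible replies Caveman can
# shorten (by 75%), "fixed" = thinking + generated code it cannot touch.

def savings(compressible, fixed, reduction=0.75):
    before = compressible + fixed
    after = compressible * (1 - reduction) + fixed
    return 1 - after / before

# Chat-heavy session: visible replies are a large share (4k of 10k tokens).
chat = savings(compressible=4_000, fixed=6_000)
# Autonomous code gen: replies are a sliver (2k of 57k tokens).
codegen = savings(compressible=2_000, fixed=55_000)

print(f"chat ~{chat:.0%}, code gen ~{codegen:.0%}")  # → chat ~30%, code gen ~3%
```

Same compression ratio, order-of-magnitude different payoff: the value depends entirely on what fraction of the bill is visible text.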

Hype Over Reality: Skip Unless Chat-Heavy

40k GitHub stars fueled viral buzz, but expect no miracles for code workflows. Invoke it manually (/caveman) only in verbose discussions, and test your own scenarios; the author invites comments on edge cases where it shines. Skip the hype and look to tools that optimize the thinking and code-generation phases for real savings.

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge