Caveman Plugin Saves No Tokens in Code Gen Tasks
Caveman shortens Claude's output text by roughly 75% in chat, but delivered 0% token savings during code implementation: thinking (Opus at high effort) and code generation dominate the bill, and both runs consumed the same 4% of plan usage with and without the plugin.
Caveman Shortens Communication, Not Core Costs
Caveman is a Claude Code plugin, invoked via /caveman, that compresses the model's output into terse, comma-separated phrases (e.g., "Plan enum service form request" instead of verbose sentences). The README claims a 65% token cut; a Reddit post shows a 75% reduction on single phrases in chat scenarios. Installation is a single command in Claude Code, with no extra configuration. Outputs stay understandable while slashing wordiness, which could be useful in discussion-heavy sessions where back-and-forth eats tokens.
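The plugin itself works through a prompt instruction, not post-processing, so the following is only a rough sketch of the *style* of compression it produces: drop filler words, keep the load-bearing nouns and verbs. The stop-word list is hypothetical.

```python
# Hypothetical illustration of Caveman-style compression.
# NOT the plugin's implementation -- Caveman instructs the model
# to write tersely; this just mimics the visible effect.
FILLER = {
    "i", "will", "now", "the", "a", "an", "to", "for", "of",
    "we", "should", "going", "is", "are", "and", "let's",
}

def cavemanify(sentence: str) -> str:
    """Strip filler words, keeping only load-bearing terms."""
    words = sentence.rstrip(".").split()
    return " ".join(w for w in words if w.lower() not in FILLER)

print(cavemanify("I will now plan the enum service and the form request."))
# -> plan enum service form request
```

On a sentence like this, the word count drops from eleven to four, which is in the same ballpark as the ~75% reduction the Reddit post reports for single phrases.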
However, token bills stem mainly from internal reasoning (e.g., Opus 4.7 at high thinking effort) and from code generation, not from terminal communication. As Reddit users put it: "It's not the prompts that cost the money, it's the thinking," and the plugin "optimizes the cheapest part of the bill."
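A back-of-the-envelope calculation shows why. The cost split below is an assumption for illustration (the source gives no exact breakdown), skewed toward thinking and code-generation tokens as the Reddit comments suggest:

```python
# Assumed spend breakdown -- illustrative only, not measured.
spend = {"thinking": 0.55, "code_gen": 0.35, "chat_output": 0.10}
chat_cut = 0.75  # Caveman's ~75% reduction applies to chat text only

# Even a 75% cut of a 10% slice trims the total bill by ~7.5%.
savings = spend["chat_output"] * chat_cut
print(f"Total bill reduced by at most {savings:.1%}")
```

Under these assumptions, the headline 75% reduction translates into single-digit savings on the overall bill, and even less when chat output is a smaller slice, as it is in autonomous code generation.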
Benchmark Reveals Negligible Impact on Code Tasks
I tested identical prompts from project-description.md (implement an API, a 3-10 minute task) on the Anthropic $100 plan:
| Metric | Without Caveman | With Caveman |
|---|---|---|
| Plan usage at start | 13% | 13% (new session) |
| Plan usage at end | 17% (4% delta) | 21% (4% delta) |
| Time | 3 min | 4 min |
| Result | All tests green | All tests green |
No savings, despite visibly shorter plans and status updates (e.g., "fix tests"). Communication is minimal in autonomous code generation, so the plugin adds no value here. Chat-heavy sessions might see ~30% cuts per some reports, but that is not transformative.
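As a minimal sanity check, the table above reduces to a single comparison of usage deltas:

```python
# Plan-usage deltas in percentage points, copied from the table above.
delta_without = 4  # without Caveman: 13% -> 17%
delta_with = 4     # with Caveman: same 4-point delta

savings = 1 - delta_with / delta_without
print(f"Token savings on the code task: {savings:.0%}")
# -> Token savings on the code task: 0%
```

The one-minute time difference is likely run-to-run noise, not a plugin effect.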
Hype Over Reality: Skip Unless Chat-Heavy
40k GitHub stars fueled the viral buzz, but don't expect miracles in code workflows. Invoke it manually (/caveman) only in verbose discussions. Test your own scenarios; the author invites comments on edge cases where it shines. Ignore the hype and look for tools that optimize the thinking and code-generation phases, where the real savings are.