Dynamically Optimize Reasoning with Adaptive Mode
Adaptive thinking replaces deprecated manual budgets on Claude Opus 4.6, Sonnet 4.6, and default on Claude Mythos Preview. Set thinking: {type: "adaptive"} in API requests—Claude assesses request complexity to decide if/when to think, skipping for simple queries at low effort. This outperforms fixed budget_tokens on bimodal tasks and long agentic workflows by allocating reasoning precisely. It auto-enables interleaved thinking between tool calls, boosting agent performance without manual config.
Example curl:
curl https://api.anthropic.com/v1/messages \
--header "x-api-key: $ANTHROPIC_API_KEY" \
--header "anthropic-version: 2023-06-01" \
--data '{ "model": "claude-opus-4-6", "max_tokens": 16000, "thinking": {"type": "adaptive"}, "messages": [{"role": "user", "content": "Explain why the sum of two even numbers is always even."}] }'
Streaming works via thinking_delta events, matching manual mode.
Older models (Sonnet 4.5+) stick to thinking.type: "enabled" + budget_tokens.
Tune Depth with Effort Parameter
Pair adaptive with output_config: {effort: "level"} for soft guidance:
| Effort | Behavior |
|---|---|
max | Unconstrained deep thinking (Opus/Sonnet 4.6 only) |
high (default) | Always thinks deeply on complex tasks |
medium | Moderate; skips very simple queries |
low | Minimal; prioritizes speed, skips simple tasks |
Use medium/low for latency-sensitive apps. Prompt-tune via system instructions like: "Extended thinking adds latency—use only for multi-step reasoning."
max_tokens caps total (thinking + output); high/max effort risks stop_reason: "max_tokens"—increase limit or drop effort.
Control Output and Costs Effectively
Default display: "summarized" returns thinking summary (full intelligence, prevents misuse); Mythos Preview defaults to omitted—set explicitly for summary. Use display: "omitted" to skip streaming thinking entirely, speeding time-to-first-text-token (streams only signature for verification).
Example: thinking: {type: "adaptive", display: "omitted"}. Signature verifies thinking on tool-use callbacks—pass full blocks back unchanged.
Switching modes breaks prompt cache breakpoints (system/tools cache regardless). Billed for full thinking process, even if omitted/summarized—output tokens exceed visible count. Specialized system prompt auto-included.
| Mode | Use When | Config |
|---|---|---|
| Adaptive | Default for complex/agentic | {type: "adaptive"} + effort |
| Manual | Precise token control | {type: "enabled", budget_tokens: N} (deprecated on 4.6) |
| Disabled | Lowest latency | Omit or {type: "disabled"} |
Migrate from enabled/budget_tokens now—removed soon. ZDR eligible: no post-response storage.