Price Hike Drives Input-Dependent Cost Increases
GPT-5.5 lists input tokens at $5 per million (up from $2.50) and output at $30 per million (up from $15), doubling GPT-5.4's rates. OpenRouter's April 2026 usage logs show that effective costs per million tokens rose 49-92% in practice, varying with input length:
| Input length (tokens) | GPT-5.4 ($/M) | GPT-5.5 ($/M) | Change |
|---|---|---|---|
| < 2K | 4.89 | 9.37 | +92% |
| 2K-10K | 2.25 | 3.81 | +69% |
| 10K-25K | 1.42 | 2.15 | +51% |
| 25K-50K | 1.02 | 1.65 | +62% |
| 50K-128K | 0.74 | 1.10 | +49% |
| 128K+ | 0.71 | 1.31 | +85% |
OpenAI claims shorter responses offset the increases, but the data show limited relief: effective costs nearly double for inputs under 2,000 tokens, where response lengths barely change.
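As a rough sketch of the arithmetic behind these effective rates, assuming the blended figure is total spend divided by total tokens (input plus output) and using illustrative token counts, not numbers from the OpenRouter logs:

```python
def effective_cost_per_m(input_tokens, output_tokens,
                         input_rate, output_rate):
    """Blended $/M tokens for one request.

    Rates are dollars per million tokens; the blend weights them by
    how many input vs. output tokens the request actually consumed.
    """
    spend = (input_tokens * input_rate + output_tokens * output_rate) / 1e6
    total_tokens = input_tokens + output_tokens
    return spend / total_tokens * 1e6

# A short prompt (1,500 input tokens) with a 900-token reply at
# GPT-5.5 list rates ($5 input, $30 output per million):
print(round(effective_cost_per_m(1_500, 900, 5, 30), 2))
```

Because output is billed at 6x the input rate, a short prompt's blended $/M is dominated by the reply, which is why short-input buckets feel the hike almost in full.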
Response Length Shifts Fail to Fully Compensate
For inputs over 10,000 tokens, GPT-5.5 generates 19-34% shorter responses, which moderates cost growth to 49-62% in the mid ranges. Inputs of 2,000-10,000 tokens, however, draw responses 52% longer, amplifying the increase to 69%. Short inputs show negligible length changes, so effective costs rise the full 92%. Builders optimizing LLM calls should factor in these patterns: favor longer contexts to capture the brevity gains, but test against real workloads, since benchmarks can mislead.
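The interaction between doubled rates and shifted response lengths can be sketched as a per-request comparison. The token counts below are illustrative assumptions, not figures from the usage logs:

```python
def cost_change(input_tokens, old_out, new_out,
                old_rates=(2.5, 15.0), new_rates=(5.0, 30.0)):
    """Percent change in per-request cost when list rates double
    (GPT-5.4 -> GPT-5.5) but the response length also shifts.

    Rates are (input $/M, output $/M); token counts are per request.
    """
    old = input_tokens * old_rates[0] + old_out * old_rates[1]
    new = input_tokens * new_rates[0] + new_out * new_rates[1]
    return (new / old - 1) * 100

# A 20K-token input whose reply shrinks ~30% (2,000 -> 1,400 tokens):
# the shorter output pulls the increase below the +100% rate hike.
print(round(cost_change(20_000, 2_000, 1_400), 1))
```

Running the same comparison with a longer reply (e.g. a mid-range prompt whose output grows ~50%) pushes the change above what shorter-response buckets see, matching the pattern in the table.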
Benchmarks Mask Real-World Expenses, Signaling Broader Trends
Artificial Analysis reported only 20% higher costs on its benchmarks, understating the impact on production workloads with more diverse tasks. Anthropic shows the same pattern: Opus 4.7 effective costs jumped 30-40% from heavier token use even though list rates stayed flat. As OpenAI and Anthropic eye IPOs, expect prices to keep climbing; choose models based on your actual input-length distribution and monitor token efficiency to keep AI-pipeline budgets under control.