Price Hike Drives Input-Dependent Cost Increases
GPT-5.5 lists input tokens at $5 per million (up from $2.50) and output at $30 per million (up from $15), doubling GPT-5.4's rates. OpenRouter's April 2026 usage logs show that effective costs per million tokens rose 49-92% in practice, varying with input length:
| Input length (tokens) | GPT-5.4 ($/M) | GPT-5.5 ($/M) | Change |
|---|---|---|---|
| < 2K | 4.89 | 9.37 | +92% |
| 2K-10K | 2.25 | 3.81 | +69% |
| 10K-25K | 1.42 | 2.15 | +51% |
| 25K-50K | 1.02 | 1.65 | +62% |
| 50K-128K | 0.74 | 1.10 | +49% |
| 128K+ | 0.71 | 1.31 | +85% |
OpenAI claims shorter responses offset the increases, but the data show limited relief: effective costs nearly double for inputs under 2,000 tokens, where response lengths barely change.
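As a rough sketch of the arithmetic behind these effective rates, assuming the blended figure is total spend divided by total tokens (input plus output) and using illustrative token counts, not numbers from the OpenRouter logs:

```python
def effective_cost_per_m(input_tokens, output_tokens,
                         input_rate, output_rate):
    """Blended $/M tokens for one request.

    Rates are dollars per million tokens; the blend weights them by
    how many input vs. output tokens the request actually consumed.
    """
    spend = (input_tokens * input_rate + output_tokens * output_rate) / 1e6
    total_tokens = input_tokens + output_tokens
    return spend / total_tokens * 1e6

# A short prompt (1,500 input tokens) with a 900-token reply at
# GPT-5.5 list rates ($5 input, $30 output per million):
print(round(effective_cost_per_m(1_500, 900, 5, 30), 2))
```

Because output is billed at 6x the input rate, a short prompt's blended $/M is dominated by the reply, which is why short-input buckets feel the hike almost in full.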
Response Length Shifts Fail to Fully Compensate
For inputs over 10,000 tokens, GPT-5.5 generates 19-34% shorter responses, which moderates cost growth to 49-62% in the mid ranges. Inputs of 2,000-10,000 tokens, however, draw responses 52% longer, amplifying the increase to 69%. Short inputs show negligible length changes, so effective costs rise the full 92%. Builders optimizing LLM calls should factor in these patterns: favor longer contexts to capture the brevity gains, but test against real workloads, since benchmarks can mislead.
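The interaction between doubled rates and shifted response lengths can be sketched as a per-request comparison. The token counts below are illustrative assumptions, not figures from the usage logs:

```python
def cost_change(input_tokens, old_out, new_out,
                old_rates=(2.5, 15.0), new_rates=(5.0, 30.0)):
    """Percent change in per-request cost when list rates double
    (GPT-5.4 -> GPT-5.5) but the response length also shifts.

    Rates are (input $/M, output $/M); token counts are per request.
    """
    old = input_tokens * old_rates[0] + old_out * old_rates[1]
    new = input_tokens * new_rates[0] + new_out * new_rates[1]
    return (new / old - 1) * 100

# A 20K-token input whose reply shrinks ~30% (2,000 -> 1,400 tokens):
# the shorter output pulls the increase below the +100% rate hike.
print(round(cost_change(20_000, 2_000, 1_400), 1))
```

Running the same comparison with a longer reply (e.g. a mid-range prompt whose output grows ~50%) pushes the change above what shorter-response buckets see, matching the pattern in the table.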
Benchmarks Mask Real-World Expenses, Signaling Broader Trends
Artificial Analysis reported only 20% higher costs on its benchmarks, understating the impact on production workloads with more diverse tasks. Anthropic shows the same pattern: Opus 4.7 effective costs jumped 30-40% from heavier token use even though list rates stayed flat. As OpenAI and Anthropic eye IPOs, expect prices to keep climbing; choose models based on your actual input-length distribution and monitor token efficiency to keep AI-pipeline budgets under control.