Previewing GPT-5.6: Sol, Terra, and Luna Models

The GPT-5.6 Model Series

OpenAI has introduced three new models in the GPT-5.6 series, each optimized for different use cases:

Sol: The flagship model, featuring a new max reasoning effort mode for deep analysis and an ultra mode that uses subagents to coordinate complex, multi-step tasks.
Terra: A balanced model designed for everyday professional workflows, offering performance competitive with GPT-5.5 at half the cost.
Luna: An efficient, low-cost model optimized for speed and affordability.

Performance and Benchmarks

The series demonstrates significant gains in agentic workflows, particularly in technical domains:

Coding: GPT-5.6 Sol sets a new state of the art on Terminal-Bench 2.1, which evaluates command-line planning and tool coordination.
Biology: On GeneBench v1, the model shows improved performance in long-horizon genomics and quantitative analysis while consuming fewer tokens.
Cybersecurity: The models show improved efficiency in vulnerability research. On ExploitBench, Sol matches the performance of the Mythos Preview while using only 33% of the output tokens. All three models show improved cyber capabilities on the ExploitGym benchmark as reasoning effort increases.
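To make the token-efficiency claim concrete, the arithmetic behind "33% of the output tokens" can be sketched as follows. The baseline of 90,000 output tokens is a made-up figure chosen only for illustration, and the function names are hypothetical; neither comes from OpenAI's published results.

```python
def output_tokens_at_ratio(baseline_tokens: int, ratio: float) -> int:
    """Tokens used by a model that needs `ratio` of the baseline's output tokens."""
    return round(baseline_tokens * ratio)


def tokens_saved(baseline_tokens: int, ratio: float) -> int:
    """Output tokens avoided relative to the baseline."""
    return baseline_tokens - output_tokens_at_ratio(baseline_tokens, ratio)


# Hypothetical baseline: a run that emits 90,000 output tokens.
baseline = 90_000
ratio = 0.33  # the "33% of the output tokens" figure quoted above

used = output_tokens_at_ratio(baseline, ratio)
saved = tokens_saved(baseline, ratio)
print(f"used={used}, saved={saved}, reduction factor={1 / ratio:.2f}x")
```

At a ratio of 0.33, the same task would take roughly one third of the baseline's output tokens, which is about a threefold reduction in output volume for equal benchmark performance.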

Layered Safety and Deployment

GPT-5.6 launches with a multi-layered safety architecture designed to mitigate misuse while supporting defensive security work. Key components include:

Model-Level Safeguards: Training to refuse prohibited cyber assistance, even when users attempt jailbreaking or intent-masking.
Real-Time Classifiers: A secondary layer that monitors output generation. High-risk requests trigger a pause during which a larger reasoning model reviews the context and then either allows or withholds the output.
Phased Release: The models are currently in a limited preview with trusted partners, coordinated with the U.S. government. OpenAI explicitly states that this is a short-term measure and that it does not intend for government-gated access to become the long-term standard for model releases.
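The classifier-then-review flow described above can be sketched as a two-stage gate. This is a minimal illustration under stated assumptions, not OpenAI's implementation: the names `gate_output`, `risk_score`, and `reviewer_allows`, the 0.8 threshold, and the toy keyword checks are all hypothetical stand-ins for components the source does not detail.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    released: bool  # whether the output is returned to the user
    reviewed: bool  # whether the larger reviewer model was consulted


def gate_output(
    context: str,
    draft: str,
    risk_score: Callable[[str, str], float],
    reviewer_allows: Callable[[str, str], bool],
    threshold: float = 0.8,  # hypothetical cutoff for "high-risk"
) -> Decision:
    """Release low-risk drafts directly; pause and escalate high-risk ones."""
    if risk_score(context, draft) < threshold:
        return Decision(released=True, reviewed=False)
    # High risk: generation is paused while the reviewer inspects the context.
    return Decision(released=reviewer_allows(context, draft), reviewed=True)


# Toy stand-ins for the real-time classifier and the larger reasoning model.
def toy_risk(context: str, draft: str) -> float:
    return 0.9 if "exploit" in draft.lower() else 0.1


def toy_reviewer(context: str, draft: str) -> bool:
    return "defensive" in context.lower()


print(gate_output("summarize this report", "Here is a summary.", toy_risk, toy_reviewer))
print(gate_output("defensive audit of our own service", "Exploit analysis follows.", toy_risk, toy_reviewer))
print(gate_output("no stated purpose", "Exploit chain follows.", toy_risk, toy_reviewer))
```

The point of the split is that the cheap classifier runs on every response, while the more expensive reviewer is only invoked for the small share of requests that cross the risk threshold.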

Cyber Preparedness

According to OpenAI's internal Preparedness Framework, GPT-5.6 Sol remains below the 'Cyber Critical' threshold. While the model can identify exploitation primitives and bugs in browsers like Chromium and Firefox, it did not autonomously produce functional full-chain exploits during testing.
