GPT-5.6 Sol, Terra & Luna: OpenAI's Three-Tier Model Family Lands Today

July 10, 2026 · 8 min read · By Jorge Aguilar

In short

OpenAI launched GPT-5.6 Sol, Terra, and Luna on July 9, 2026. Full breakdown: pricing, ultra mode benchmarks, and which tier fits your SaaS workflow.

As of July 9, 2026, OpenAI's GPT-5.6 family is live across ChatGPT, Codex, ChatGPT Work, and the API. The flagship Sol tier posted a 53.6 on Agent's Last Exam, beating Claude Fable 5's 40.5 by 13.1 points at roughly one-quarter the estimated cost. If you're building agentic workflows, that's the number that matters.

Key takeaways:

GPT-5.6 launches as a three-tier family: Sol ($5/$30), Terra ($2.50/$15), and Luna ($1/$6) per 1M input/output tokens
Ultra mode on Sol spawns parallel subagents and scores 91.9% on Terminal-Bench 2.1, ahead of Claude Mythos 5 (84.3%) and Claude Fable 5 (83.4%)
Sol set a new Agent's Last Exam record at 53.6, beating Claude Fable 5 by 13.1 points
Terra and Luna outperform Claude Fable 5 at roughly one-sixteenth the estimated cost, making them strong defaults for production SaaS workloads
GPT-5.6 Sol is the first major frontier model to publicly receive US government review before launch

What changed: three models instead of one

The naming is deliberate. GPT-5.6 is not one model — it's a generation. The number (5.6) signals the release cycle; the names (Sol, Terra, Luna) signal the capability tier. This is OpenAI's structural answer to Google's tiered Gemini lineup, and it fundamentally changes how you should think about model routing.

Previous GPT releases gave you one flagship and a mini for lighter work. GPT-5.6 gives you three distinct models with meaningfully different price-performance profiles — each genuinely built for different workload types rather than just being a cheaper cut of the flagship.

What Sol, Terra, and Luna each do

Sol is the one most headlines are about — and for good reason. It's the only tier that unlocks max reasoning effort and ultra mode. If you're running complex long-horizon coding tasks, agentic computer use, scientific analysis, or cybersecurity workflows, Sol is what you test first.

Terra sits at $2.50/$15 per 1M tokens and is OpenAI's recommendation for general production agent workloads. It doesn't have ultra mode, but it handles the tasks most SaaS teams actually run: document parsing, customer support pipelines, code review, and RAG-backed workflows. Benchmarks show Terra outperforming Claude Fable 5 at roughly one-sixteenth the estimated cost, a number that takes a moment to process.

Luna is the speed and cost play at $1/$6 per 1M tokens. It targets classification, routing, tagging, and extraction: tasks where per-call cost is your primary variable. Luna is fast enough to handle real-time request classification in production without burning through budget.

[Figure: GPT-5.6 Sol vs Terra vs Luna pricing and features comparison table]
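
To make the per-tier rates concrete, here is a minimal Python sketch that turns the prices quoted above into a per-call cost. The tier keys ("sol", "terra", "luna"), the `TierRate` class, and the `call_cost` helper are names made up for this illustration, not API model identifiers, and the token counts are an arbitrary example. Check the rates against OpenAI's pricing page before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRate:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Rates as quoted in this article; the dictionary keys are illustrative labels only.
RATES = {
    "sol": TierRate(5.00, 30.00),
    "terra": TierRate(2.50, 15.00),
    "luna": TierRate(1.00, 6.00),
}


def call_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call at the given tier's token rates."""
    rate = RATES[tier]
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Example call: 20,000 input tokens and 2,000 output tokens.
    for tier in RATES:
        print(tier, round(call_cost(tier, 20_000, 2_000), 4))
    # sol 0.16
    # terra 0.08
    # luna 0.032
```

Because every tier uses the same 1:6 input-to-output price ratio, the relative cost between tiers stays the same whatever your token mix: Terra is half of Sol, and Luna is one-fifth of Sol.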

Which GPT-5.6 tier should you actually use?

Here's how I'd frame the decision for most SaaS teams:

If your work involves multi-step reasoning, long-horizon planning, or agent pipelines that need to solve genuinely complex problems, start testing Sol. The benchmark lead over competitors is real, and the ultra mode multiplier is meaningful for tasks where parallel execution helps.

For the majority of production workloads (content generation, document analysis, API-backed workflows, customer-facing AI features), Terra is probably the right default. Its token rates are exactly half of Sol's ($2.50/$15 vs $5/$30) and less than half of Claude Fable 5's.

Luna makes sense when you're making thousands of quick classification or extraction calls and cost-per-call is the primary constraint. Think intake routing, tag generation, or content moderation passes at volume.
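
The decision framing above can be sketched as a small routing function. This is an illustration under stated assumptions: the `Task` fields and the returned tier labels are hypothetical names invented here, and a real router would likely weigh latency, context size, and measured quality as well.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical descriptors for a workload; not part of any OpenAI API.
    multi_step_reasoning: bool  # long-horizon planning, complex agent pipelines
    high_volume_simple: bool    # classification, extraction, tagging at volume


def pick_tier(task: Task) -> str:
    """Map a workload to a tier label following the article's rule of thumb."""
    if task.multi_step_reasoning:
        return "sol"    # complex agentic work: test the flagship first
    if task.high_volume_simple:
        return "luna"   # cost per call is the primary constraint
    return "terra"      # default for most production workloads


if __name__ == "__main__":
    print(pick_tier(Task(multi_step_reasoning=True, high_volume_simple=False)))   # sol
    print(pick_tier(Task(multi_step_reasoning=False, high_volume_simple=True)))   # luna
    print(pick_tier(Task(multi_step_reasoning=False, high_volume_simple=False)))  # terra
```

Checking the reasoning flag first is a deliberate choice in this sketch: a task that is both high-volume and genuinely complex goes to Sol, on the assumption that quality matters more than cost there.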

How ultra mode actually works

Ultra is the genuinely new capability in GPT-5.6 Sol. Instead of a single agent working through a complex task sequentially, ultra decomposes the request and spawns multiple subagents that work on different components in parallel. Their results are synthesized at the end.
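
OpenAI has not published how ultra is implemented, so the following is only a generic Python sketch of the decompose, run-in-parallel, synthesize pattern described above. Every function here (`decompose`, `run_subagent`, `synthesize`, `ultra_style`) is a hypothetical stand-in; the subagent is a stub rather than a real model call.

```python
import asyncio


def decompose(request: str) -> list[str]:
    # Hypothetical decomposition: treat semicolon-separated parts as subtasks.
    return [part.strip() for part in request.split(";") if part.strip()]


async def run_subagent(subtask: str) -> str:
    # Stub for a model call; a real subagent would call an LLM here.
    await asyncio.sleep(0)
    return f"done: {subtask}"


def synthesize(results: list[str]) -> str:
    # Combine the subagent outputs into one response.
    return "\n".join(results)


async def ultra_style(request: str) -> str:
    subtasks = decompose(request)
    # asyncio.gather runs the subagents concurrently and keeps input order.
    results = await asyncio.gather(*(run_subagent(s) for s in subtasks))
    return synthesize(list(results))


if __name__ == "__main__":
    print(asyncio.run(ultra_style("write tests; fix lint errors; update docs")))
```

The sketch shows why the pattern helps only when a task splits into independent pieces: subtasks that depend on each other's output would have to run sequentially anyway.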

On Terminal-Bench 2.1 — a real-world command-line and coding workflow benchmark — Sol scores 88.8% in standard mode. With ultra enabled, that jumps to 91.9%, a 3.1-point lift from parallel execution on complex tasks.

The trade-off is token cost. Each subagent burns tokens independently, so a single ultra call can consume several times the tokens of a standard Sol call. OpenAI also offers Sol via Cerebras at up to 750 tokens per second for latency-sensitive use cases, separately priced at $12.50/$75 per 1M tokens.
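
A rough way to reason about that trade-off is to sum the cost of each subagent at Sol's standard rates and compare it with a single standard call. The sketch below assumes made-up token counts and a made-up subagent count, since the article gives neither; only the rates come from the text. The function names are hypothetical.

```python
# Rates quoted in this article, in USD per 1M tokens.
SOL_IN, SOL_OUT = 5.00, 30.00
CEREBRAS_IN, CEREBRAS_OUT = 12.50, 75.00


def call_cost(input_tokens: int, output_tokens: int,
              in_rate: float = SOL_IN, out_rate: float = SOL_OUT) -> float:
    """USD cost of one call at the given per-1M-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def ultra_cost(subagent_usage: list[tuple[int, int]]) -> float:
    """Sum the cost of every subagent; each one is billed independently."""
    return sum(call_cost(inp, out) for inp, out in subagent_usage)


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    standard = call_cost(30_000, 4_000)           # one standard Sol call
    ultra = ultra_cost([(20_000, 3_000)] * 4)     # four subagents
    fast = call_cost(30_000, 4_000, CEREBRAS_IN, CEREBRAS_OUT)
    print(f"standard Sol: ${standard:.2f}")       # $0.27
    print(f"ultra (4 subagents): ${ultra:.2f}")   # $0.76
    print(f"Sol via Cerebras: ${fast:.3f}")       # $0.675
```

With these assumed numbers the ultra call costs a little under three times the standard call, which is in line with the "several times" estimate above. The Cerebras rates work out to 2.5 times Sol's standard rates on both input and output.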

What the benchmarks actually show

The Agent's Last Exam score is the one I keep coming back to. This is a 55-field professional workflow evaluation — widely regarded as the hardest publicly available benchmark for long-running agentic tasks. GPT-5.6 Sol scored 53.6. Claude Fable 5 scored 40.5. That's a 13.1-point gap.

On Terminal-Bench 2.1, Sol Ultra leads at 91.9%, with Claude Mythos 5 at 84.3% and Claude Fable 5 at 83.4%. GPT-5.5 was at 88.0% for reference — Sol Ultra is a clear step ahead of its predecessor and well ahead of the Anthropic frontier on this benchmark.
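
For readers who want to check the gaps themselves, this short Python sketch recomputes them from the scores quoted in this article. The dictionary layout and the `gap` helper are just one convenient way to hold the numbers, not an official results format.

```python
# Scores as quoted in this article.
AGENTS_LAST_EXAM = {"GPT-5.6 Sol": 53.6, "Claude Fable 5": 40.5}
TERMINAL_BENCH_2_1 = {
    "GPT-5.6 Sol Ultra": 91.9,
    "GPT-5.6 Sol": 88.8,
    "GPT-5.5": 88.0,
    "Claude Mythos 5": 84.3,
    "Claude Fable 5": 83.4,
}


def gap(scores: dict[str, float], leader: str, other: str) -> float:
    """Point difference between two entries, rounded to one decimal place."""
    return round(scores[leader] - scores[other], 1)


if __name__ == "__main__":
    print(gap(AGENTS_LAST_EXAM, "GPT-5.6 Sol", "Claude Fable 5"))          # 13.1
    print(gap(TERMINAL_BENCH_2_1, "GPT-5.6 Sol Ultra", "GPT-5.6 Sol"))     # 3.1
    print(gap(TERMINAL_BENCH_2_1, "GPT-5.6 Sol Ultra", "GPT-5.5"))         # 3.9
    print(gap(TERMINAL_BENCH_2_1, "GPT-5.6 Sol Ultra", "Claude Mythos 5"))  # 7.6
    print(gap(TERMINAL_BENCH_2_1, "GPT-5.6 Sol Ultra", "Claude Fable 5"))   # 8.5
```

Note that these are percentage-point differences, not relative improvements.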

It's worth noting that GPT-5.6 Sol received US government review before launch — the first major frontier model to do so publicly. OpenAI described the safety stack as its "most robust to date," with specific attention to high-risk requests and repeated misuse patterns.

What this means if you're building on AI right now

For SaaS teams making model routing decisions today, GPT-5.6 reshapes the calculus. Sol's lead on complex agentic benchmarks is large enough to matter for teams doing real autonomous work. Terra's price-performance — outperforming Fable 5 at a fraction of the cost — makes it a serious default for most production workloads.

If you're currently paying for Claude Fable 5 or planning to, run a cost comparison against Terra before committing. And if you're doing any agentic work where task quality directly affects product outcomes, Sol's numbers are worth a careful evaluation.

I cover how to demonstrate these kinds of model differences through AI tool video production — a three-tier family launch like this is exactly the kind of technical story that lands differently in video than in a spec sheet.

For more model comparisons and AI tool deep dives, see the MiniMax M3 review, which covers another model that disrupted the cost-performance curve this year.

Frequently asked questions

Is GPT-5.6 Sol available right now?

Yes. GPT-5.6 Sol, Terra, and Luna went to general availability on July 9, 2026 across ChatGPT, Codex, ChatGPT Work, and the OpenAI API. In Codex, Sol and Terra are available for agentic coding tasks.

What is ultra mode and how much does it cost?

Ultra mode is exclusive to GPT-5.6 Sol and works by spawning parallel subagents to handle complex tasks simultaneously. It uses Sol's standard token rates, but each subagent burns tokens independently, so a complex ultra call can cost several times more than a standard Sol call. High-speed Sol via Cerebras is separately priced at $12.50/$75 per 1M tokens.

How does GPT-5.6 compare to Claude Fable 5?

On Agent's Last Exam, Sol leads by 13.1 points (53.6 vs 40.5). On Terminal-Bench 2.1, Sol Ultra leads at 91.9% vs Fable 5's 83.4%. Terra and Luna both outperform Claude Fable 5 at roughly one-sixteenth the estimated cost — making the entire GPT-5.6 family competitive at every price tier.

Jorge Aguilar

Founder & Creator, SaaS Master

Producing SaaS and AI product videos since 2019: 800+ videos for 200+ brands, covering tutorials, demos, walkthroughs, and explainers. Writing here about the tools, trends, and tactics that actually move the needle.
