Real head-to-head · same prompt, one shot

Claude Fable 5 vs Claude Sonnet 5

Claude Fable 5: the newest Anthropic model and the first Mythos-class model made generally available. Claude Sonnet 5: the agentic SWE frontier, with 82% SWE-bench Verified and Dev Team mode.

Head-to-head verdict: Claude Fable 5 wins 22–15 with 5 ties.

Claude Fable 5 · context: 200K tokens

Claude Sonnet 5 · context: 1M tokens

Claude Fable 5 · price: $10 / $50 per M tokens

Claude Sonnet 5 · price: $3 / $15 per M ($2/$10 intro)

Claude Fable 5 · vendor: Anthropic

Claude Sonnet 5 · vendor: Anthropic

What I tested — same prompt, two models

I run the same fixed prompt set through every new model the day it drops (same string, one shot, single HTML file out) and I score the result 0–10 on whether it ran, how closely it hit the brief, and how good it looked. Below is what came out when I gave the exact same prompts to Claude Fable 5 and Claude Sonnet 5, side by side, on 42 shared tasks inside the Agent Operating System.

Both models got identical prompts with no help, no iteration, and no "best of N" tricks. The scoring is mine. The verdicts below are pulled from my source comparison guides at agentos.guide, where I publish every score and the reasoning behind it.

Claude Fable 5 · Selected from Agent OS for the highest-stakes one-shot work — it replaced Opus 4.8 as the safety net on hard prompts, and the full 42-task bench run now backs that call.

Claude Sonnet 5 · Reach for it in Agent OS when the job is iterative, tool-using software engineering. For one-shot visual builds, GLM 5.2 (free) beat it 4-1 here.

Side-by-side on 42 shared tasks

Click any cell to play that model's actual one-shot attempt. Medals are derived from my 0–10 scores per task (highest = 🥇, second = 🥈, third = 🥉).

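Task table (interactive grid; the per-model columns and most task names did not survive extraction). Named rows, with the medals shown beside them in page order:

Game · Neonblaster
Game · Neoncity
Game · Neonracer: 🥉 🥈
Game · Nordiccrypt
Game · Outrun: 🥇 🥈
Game · Pool: 🥉 🥈
Sim (task name not recovered): 🥉
Sim · Boids: 🥉
Sim · Cloth: 🥉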

Where Claude Fable 5 beat Claude Sonnet 5

The tasks where I gave Claude Fable 5 a higher 0–10 score on the same prompt — with the actual commentary from my source guides.

Aurora Visual

Claude Fable 5 8.6 · Claude Sonnet 5 2.5 (+6.1) · shader aurora curtains

What I saw: Beautiful WebGL fragment shader with layered green-blue aurora ribbons, twinkling starfield, snow-rimmed mountain silhouette, and elegant title typography — genuinely convincing northern lights. Interactive mouse sway and click surge plus the reflected glow push it to task-topping polish.

Solar Sim

Claude Fable 5 8.0 · Claude Sonnet 5 2.5 (+5.5)

What I saw: Renders cleanly with all 8 planets, elliptical orbit rings, glowing sun, Saturn's ring, and a polished title/legend UI; log-scaled distances plus interactive orbit/zoom controls are a thoughtful touch. Weaknesses are flat-colored planets (no textures) and inner planets crowding t…

Orbit Sim

Claude Fable 5 7.6 · Claude Sonnet 5 3.5 (+4.1)

What I saw: Renders cleanly with a glowing sun, colored planets, starfield and solid 3D orbit/zoom/spawn controls plus real N-body physics with softening and substeps; however the screenshot shows no visible orbital trails and a somewhat sparse, static-looking layout, keeping it strong-but-g…

Blackhole Sim

Claude Fable 5 8.7 · Claude Sonnet 5 5.0 (+3.7) · Interstellar-grade lensing

What I saw: Strong Interstellar-style render with a clean event-horizon shadow, tilted accretion disk wrapping over/under the black hole, visible Doppler beaming brightening one side, and a subtle lensed arc — all polished with tasteful nebula/starfield and typography. Minor nit is the sligh…

Wormhole Sim

Claude Fable 5 6.5 · Claude Sonnet 5 3.0 (+3.5)

What I saw: The tunnel core with depth-fading rings and colorful particles reads convincingly as a wormhole and the UI/title are clean, but the flat teal fog fills most of the frame instead of an immersive tunnel, and the oversized foreground particle squares plus the misplaced solid-tube ge…

Where Claude Sonnet 5 beat Claude Fable 5

The tasks where I gave Claude Sonnet 5 a higher 0–10 score on the same prompt — with the actual commentary from my source guides.

Crypt Game

Claude Sonnet 5 6.5 · Claude Fable 5 2.0 (+4.5)

What I saw: Renders with atmospheric HUD, functional minimap showing the maze, and a working torch-lit dungeon architecture in code, but the screenshot is too dark/muddy with no visible walls, torches, or dungeon geometry — the dim ambient/fog balance undersells the crawler and looks flat ra…

Doom Game

Claude Sonnet 5 8.0 · Claude Fable 5 6.2 (+1.8)

What I saw: Renders a clean raycaster maze with atmospheric red-lit walls, working minimap, HUD health/kills bar, crosshair and a monster visible at screen edge; solid feature set (hitscan shooting, chasing AI, damage flash, touch controls) makes it strong and shippable, though the wall shad…

Fireworks Visual

Claude Sonnet 5 8.3 · Claude Fable 5 7.2 (+1.1)

What I saw: Strong 3D scene with starfield, skyline silhouette, additive-blended particle bursts and a polished shimmering title—clearly on-brief and shippable. Particles read slightly blocky/square rather than glowing sparks, and the depth composition feels a touch flat, keeping it just shy…

Twilightvale Game

Claude Sonnet 5 3.0 · Claude Fable 5 2.5 (+0.5)

What I saw: UI overlays (title, kills, weather, HP bar, hint) render correctly but the 3D scene is completely black — no terrain, trees, player, or enemies visible, meaning the WebGL world failed to render despite solid source code. A non-rendering core makes this effectively broken for a 3D RPG task.

Game Game

Claude Sonnet 5 8.2 · Claude Fable 5 7.8 (+0.4)

What I saw: Strong, polished 3D Three.js build with clean neon aesthetic, glowing player orb, colorful octahedron collectibles, spinning obstacles, shadows, starfield, and full HUD/lives/timer loop — clearly shippable. Falls just short of the field's best due to being a fairly familiar colle…

Strengths & weaknesses I logged

Claude Fable 5

Strengths

Best solo Anthropic model on this bench — 7.72 avg beats Opus 4.8 (7.49) and Sonnet 5 (7.18)
Wins most head-to-heads vs every solo rival: beats Opus 4.8 on 26/42 tasks and GLM-5.2 on 27/42 — two one-shot crashes, not the craft, cost it the average
10 task-winner tags in a single one-shot run — shader/GPU physics is its superpower (Cornell-box path tracer 8.7, black-hole lensing 8.7, synthwave outrun 8.7)
Tops external SWE-bench Verified at 95.0% in Julian's three-dragons writeup

Trade-offs

Two one-shot black-screens (crypt, twilightvale) from three.js r128 API drift — called THREE.Geometry / CapsuleGeometry, which the pinned CDN doesn't have
Free GLM-5.2 still edges it on creative one-shots (7.77 vs 7.72) at $0 — the $10/$50 premium buys agentic depth, not one-shot visuals
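The API-drift failure above is the kind of thing a small pre-flight check can surface before a scene goes black. The sketch below is illustrative only: `missingConstructors` and `fakeThree` are hypothetical names, and the stand-in namespace simply mimics a pinned build that lacks the two classes named in the trade-off. It is not the check used on the bench.

```typescript
// Returns the names in `required` that are NOT constructors (functions) on `ns`.
function missingConstructors(ns: Record<string, unknown>, required: string[]): string[] {
  return required.filter((name) => typeof ns[name] !== "function");
}

// Hypothetical stand-in for the global THREE namespace of an older pinned build.
// In a real page you would pass the actual namespace instead, e.g.
//   missingConstructors(THREE as unknown as Record<string, unknown>, [...])
const fakeThree: Record<string, unknown> = {
  BufferGeometry: function BufferGeometry() {},
  CylinderGeometry: function CylinderGeometry() {},
  SphereGeometry: function SphereGeometry() {},
};

// The two classes the trade-off note says the pinned CDN build does not have,
// plus one that the stand-in does provide.
const missing = missingConstructors(fakeThree, [
  "BufferGeometry",
  "Geometry",
  "CapsuleGeometry",
]);

if (missing.length > 0) {
  // Fail loudly instead of rendering a black canvas.
  console.warn("Pinned build is missing: " + missing.join(", "));
}
```

Running a guard like this at the top of the generated file turns a silent black screen into a readable warning, which is easier to score and easier to fix on a second pass.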

Claude Sonnet 5

Strengths

82.1% SWE-bench Verified — first model past 80% on real GitHub-issue repair
Dev Team multi-agent mode + 1M context for repo-level agentic work
Precision on hard logic — won the raycaster the open-weight field kept botching

Trade-offs

One-shot creative-visual builds trail GLM 5.2 here (lost 4 of 5) — no iteration to catch its own bugs
A temporal-dead-zone bug blanked its N-body orbit sim on the first shot
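For readers who haven't hit one, a temporal-dead-zone bug looks roughly like the sketch below. This is a hypothetical minimal reproduction, not the code Sonnet 5 actually produced: `brokenInit`, `fixedInit`, `step` and `bodies` are made-up names. A hoisted function reads a `const` before the line that initializes it has run.

```typescript
// Broken: step() is hoisted and callable, but `bodies` is still in its
// temporal dead zone when step() runs, so the read throws.
function brokenInit(): string {
  let outcome = "ok";
  try {
    step();
  } catch (err) {
    // On a runtime with native let/const this is a ReferenceError.
    outcome = err instanceof Error ? err.name : "unknown";
  }
  const bodies: number[] = [1, 2, 3];
  function step(): number {
    return bodies.length;
  }
  return outcome;
}

// Fixed: initialize the state before anything that reads it is called.
function fixedInit(): string {
  const bodies: number[] = [1, 2, 3];
  function step(): number {
    return bodies.length;
  }
  return "ok: " + step();
}

console.log(brokenInit());
console.log(fixedInit());
```

In a one-shot HTML build there is no try/catch around the first frame, so an error like this stops the script and leaves the canvas blank.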

Pricing & context — the spec sheet

Spec	Claude Fable 5	Claude Sonnet 5
Vendor	Anthropic	Anthropic
Context window	200,000 tokens (1M with extended thinking)	1,000,000 tokens
Price	$10 / $50 per M tokens	$3 / $15 per M ($2/$10 intro)
Pricing detail	Released alongside Mythos 5 on June 9, 2026 as the publicly-available member of the new Mythos class. Premium per-token pricing on the Anthropic API; available everywhere Opus 4.8 ships.	$3.00 input / $15.00 output per million tokens; introductory $2.00/$10.00 through 2026-08-31.
Release	2026-06-09	2026-06-30
Bench coverage	42/42 scored · avg 7.72/10	42/42 scored · avg 7.18/10
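To make the per-token prices concrete, here is a rough cost sketch. The prices come from the spec sheet; reading Fable's "$10 / $50" as input / output is an assumption (the page only spells out input / output for Sonnet 5), and the 20,000-in / 8,000-out job size is a made-up example, not a measured run.

```typescript
type Price = { inputPerM: number; outputPerM: number };

// Per-million-token prices from the spec sheet.
// Assumption: "$10 / $50" for Fable 5 means input / output, as it does for Sonnet 5.
const fable: Price = { inputPerM: 10, outputPerM: 50 };
const sonnet: Price = { inputPerM: 3, outputPerM: 15 };
const sonnetIntro: Price = { inputPerM: 2, outputPerM: 10 };

// Dollar cost of one request given its input and output token counts.
function runCost(price: Price, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * price.inputPerM + (outputTokens / 1e6) * price.outputPerM;
}

// Hypothetical one-shot build: 20,000 tokens in, 8,000 tokens out.
const inputTokens = 20000;
const outputTokens = 8000;

console.log("Fable 5:        $" + runCost(fable, inputTokens, outputTokens).toFixed(2));
console.log("Sonnet 5:       $" + runCost(sonnet, inputTokens, outputTokens).toFixed(2));
console.log("Sonnet 5 intro: $" + runCost(sonnetIntro, inputTokens, outputTokens).toFixed(2));
```

Under those assumptions the same job costs about $0.60 on Fable 5, $0.18 on Sonnet 5, and $0.12 at Sonnet 5's intro rate, so the ratio stays at roughly 3.3x list price regardless of job size as long as the input/output mix is the same.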

The verdict — which should you pick?

Across 42 scored shared tasks, Claude Fable 5 averaged 7.72/10, beating Claude Sonnet 5's 7.18/10 by 0.54 points. Pick Claude Fable 5 when the build has to ship on the first prompt and you can afford the trade-offs logged above.
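The win/loss record and the averages are simple arithmetic over per-task score pairs. As a sketch of that calculation, the snippet below runs it over the ten score pairs quoted on this page; the full 42-task sheet is not reproduced here, so it will not yield 22–15–5 or the 7.72 / 7.18 averages. `Pair`, `Tally` and `tally` are illustrative names.

```typescript
type Pair = { task: string; fable: number; sonnet: number };
type Tally = { fableWins: number; sonnetWins: number; ties: number; fableAvg: number; sonnetAvg: number };

// The ten per-task scores quoted in the two "beat" sections of this page.
const published: Pair[] = [
  { task: "Aurora Visual", fable: 8.6, sonnet: 2.5 },
  { task: "Solar Sim", fable: 8.0, sonnet: 2.5 },
  { task: "Orbit Sim", fable: 7.6, sonnet: 3.5 },
  { task: "Blackhole Sim", fable: 8.7, sonnet: 5.0 },
  { task: "Wormhole Sim", fable: 6.5, sonnet: 3.0 },
  { task: "Crypt Game", fable: 2.0, sonnet: 6.5 },
  { task: "Doom Game", fable: 6.2, sonnet: 8.0 },
  { task: "Fireworks Visual", fable: 7.2, sonnet: 8.3 },
  { task: "Twilightvale Game", fable: 2.5, sonnet: 3.0 },
  { task: "Game Game", fable: 7.8, sonnet: 8.2 },
];

// Count wins and ties, and average each model's scores over the given pairs.
function tally(pairs: Pair[]): Tally {
  let fableWins = 0;
  let sonnetWins = 0;
  let ties = 0;
  let fableSum = 0;
  let sonnetSum = 0;
  for (const p of pairs) {
    if (p.fable > p.sonnet) fableWins++;
    else if (p.sonnet > p.fable) sonnetWins++;
    else ties++;
    fableSum += p.fable;
    sonnetSum += p.sonnet;
  }
  const n = pairs.length;
  return { fableWins, sonnetWins, ties, fableAvg: fableSum / n, sonnetAvg: sonnetSum / n };
}

const result = tally(published);
console.log(result.fableWins + "-" + result.sonnetWins + " with " + result.ties + " ties");
```

On these ten rows the record is five wins each with no ties, which is expected since the page quotes five tasks from each side. Feeding in all 42 pairs is what produces the headline numbers.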

If you only run one of these inside your stack, the head-to-head average above is the call. If you can run both, my honest play is to wire Claude Fable 5 and Claude Sonnet 5 into the Agent Operating System and dispatch each from the kanban by task type: mission-critical one-shot builds where you want Anthropic's newest reasoning go to Claude Fable 5; agentic software engineering (write / run / test / fix loops on real repos) goes to Claude Sonnet 5. That's the same setup I run for the 3,600+ founders inside the AI Profit Boardroom.

FAQ — Claude Fable 5 vs Claude Sonnet 5

Which is better, Claude Fable 5 or Claude Sonnet 5?

On Goldie Bench, Claude Fable 5 averages 7.72/10 across the shared tasks, with 3 gold, 1 silver, 7 bronze overall. Claude Sonnet 5 averages 7.18/10, with 1 gold, 5 silver, 2 bronze. Claude Fable 5 wins the head-to-head 22–15, with 5 ties.

How much does Claude Fable 5 cost vs Claude Sonnet 5?

Claude Fable 5: $10 / $50 per M tokens. It was released alongside Mythos 5 on June 9, 2026 as the publicly available member of the new Mythos class, with premium per-token pricing on the Anthropic API; it is available everywhere Opus 4.8 ships. Claude Sonnet 5: $3.00 input / $15.00 output per million tokens; introductory $2.00/$10.00 through 2026-08-31.

What's the context window for Claude Fable 5 vs Claude Sonnet 5?

Claude Fable 5 has a 200,000-token context window (1M with extended thinking). Claude Sonnet 5 has a 1,000,000-token context window.

When should I pick Claude Fable 5 over Claude Sonnet 5?

Pick Claude Fable 5 for: mission-critical one-shot builds where you want Anthropic's newest reasoning; long-context work using extended thinking up to 1M tokens; plan-heavy multi-step tasks where intelligence in the plan matters more than the build. The trade-offs are the weaknesses I logged on the bench: two one-shot black-screens (crypt, twilightvale) from three.js r128 API drift, where it called THREE.Geometry / CapsuleGeometry, which the pinned CDN doesn't have; and free GLM-5.2 still edging it on creative one-shots (7.77 vs 7.72) at $0, so the $10/$50 premium buys agentic depth, not one-shot visuals.

When should I pick Claude Sonnet 5 over Claude Fable 5?

Pick Claude Sonnet 5 for: agentic software engineering (write / run / test / fix loops on real repos); repo-level reasoning across a 1M-token context (Dev Team multi-agent mode); precise logic, such as raycasters and physics, where one-shot open models slip. The trade-offs are the weaknesses I logged on the bench: one-shot creative-visual builds trail GLM 5.2 here (lost 4 of 5), with no iteration to catch its own bugs; and a temporal-dead-zone bug blanked its N-body orbit sim on the first shot.

How does Goldie Bench score Claude Fable 5 vs Claude Sonnet 5?

Every demo on this page was built by Julian Goldie inside the Agent Operating System: same fixed prompt for both models, one shot, single HTML file out. Each result gets a 0–10 score on whether it ran, how closely it hit the brief, and how good it looked. The highest score on each task gets gold; second gets silver; third gets bronze. See methodology for full provenance.
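The gold / silver / bronze rule is easy to state in code. The sketch below is one possible reading of it, with made-up model labels and scores; how the bench actually breaks ties is not stated on this page, so the shared-medal behavior here is an assumption.

```typescript
type Score = { model: string; score: number };

const MEDALS: string[] = ["🥇", "🥈", "🥉"];

// Rank one task's scores from high to low and give the top three ranks a medal.
// Assumption: equal scores share a rank (and a medal), and the next rank is skipped.
function assignMedals(scores: Score[]): Record<string, string> {
  const sorted = scores.slice().sort((a, b) => b.score - a.score);
  const medals: Record<string, string> = {};
  let rank = 0;
  let prevScore: number | undefined = undefined;
  sorted.forEach((entry, i) => {
    if (prevScore !== undefined && entry.score < prevScore) rank = i;
    prevScore = entry.score;
    const medal = MEDALS[rank];
    if (medal !== undefined) medals[entry.model] = medal;
  });
  return medals;
}

// Hypothetical scores for a single task across four models.
const example = assignMedals([
  { model: "model-a", score: 8.7 },
  { model: "model-b", score: 5.0 },
  { model: "model-c", score: 7.0 },
  { model: "model-d", score: 2.0 },
]);

console.log(example);
```

Because medals are ranked across every model on the bench, not just the two compared here, a model can win a head-to-head task and still carry no medal on it.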

Related comparisons

Other head-to-heads using the same scoring system:

Claude Fable 5 vs Fusion · Claude Sonnet 5 vs Fusion · Claude Fable 5 vs Hermes MoA · Claude Sonnet 5 vs Hermes MoA · Claude Fable 5 vs Grok · Claude Sonnet 5 vs Grok · Claude Fable 5 vs MiniMax M3 · Claude Sonnet 5 vs MiniMax M3

Full model pages: Claude Fable 5 · Claude Sonnet 5 · back to the leaderboard

The same stack Julian uses

Run this stack yourself.

Every demo on this bench was built inside the Agent Operating System — one prompt, one shot, single HTML file out. The Agent OS, the prompts, the templates, the weekly walkthroughs and 3,600+ founders shipping with it every day all live inside the AI Profit Boardroom.

3,600+ founders

258 documented wins

38 countries

$59/mo monthly

Join AIPB · $59/mo → Read the Agent OS guides →