Real head-to-head · same prompt, one shot

Claude Fable 5 vs Claude Sonnet 5

Claude Fable 5: the newest Anthropic model, and the first Mythos-class model made generally available. Claude Sonnet 5: the agentic SWE frontier, with 82% SWE-bench Verified and Dev Team mode.

Head-to-head verdict: Claude Fable 5 wins 22–15 with 5 ties.

Claude Fable 5 · context: 200K tokens
Claude Sonnet 5 · context: 1M tokens
Claude Fable 5 · price: $10 / $50 per M tokens
Claude Sonnet 5 · price: $3 / $15 per M ($2/$10 intro)
Claude Fable 5 · vendor: Anthropic
Claude Sonnet 5 · vendor: Anthropic

What I tested — same prompt, two models

I run the same fixed prompt set through every new model the day it drops — same string, one shot, single HTML file out — and I score the result 0–10 on whether it ran, how close it hit the brief, and how good it looked. Below is what came out when I gave the exact same prompts to Claude Fable 5 and Claude Sonnet 5, side by side, on 42 shared tasks inside the Agent Operating System.

Both models were given identical prompts inside the Agent Operating System, with no help, no iteration, and no "best of N" tricks. The scoring is mine. The verdicts below are pulled from my source comparison guides at agentos.guide, where I publish every score and the reasoning behind it.

Claude Fable 5 · Selected from Agent OS for the highest-stakes one-shot work — it replaced Opus 4.8 as the safety net on hard prompts, and the full 42-task bench run now backs that call.

Claude Sonnet 5 · Reach for it in Agent OS when the job is iterative, tool-using software engineering. For one-shot visual builds, GLM 5.2 (free) beat it 4-1 here.

Side-by-side on 42 shared tasks

Medals are derived from my 0–10 scores per task (highest = 🥇, second = 🥈, third = 🥉).

Task (type) | Claude Fable 5 | Claude Sonnet 5
Arcade (Game) | no medal | no medal
Crypt (Game) | no medal | no medal
Dogfight (Game) | no medal | no medal
Doom (Game) | no medal | no medal
Dragonflight | no medal | no medal
Dragonrealm | no medal | no medal
Game (Game) | no medal | no medal
Neonblaster | no medal | no medal
Neoncity (Game) | no medal | no medal
Neonracer (Game) | 🥉 | 🥈
Nordiccrypt | no medal | no medal
Outrun (Game) | 🥇 | 🥈
Pool (Game) | 🥉 | 🥈
Racing (Game) | no medal | no medal
Raycaster (Game) | no medal | no medal
Rpg (Game) | no medal | no medal
Skyrim (Game) | no medal | no medal
Twilightvale | no medal | no medal
Voxelcraft (Game) | no medal | no medal
Landing (Page) | no medal | no medal
Webos (Page) | no medal | no medal
Blackhole (Sim) | 🥉 | no medal
Boids (Sim) | 🥉 | no medal
Cloth (Sim) | 🥉 | no medal
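
The medal rule stated above the table (highest score gets gold, second silver, third bronze) can be sketched in a few lines. This is a minimal illustration, not the bench's actual tooling: the function name `assign_medals`, the rival entries "Model A" and "Model B", and their scores are hypothetical, and breaking ties by input order is an assumption the page does not specify. The 8.7 and 5.0 figures are the Blackhole scores quoted on this page.

```python
MEDALS = ["gold", "silver", "bronze"]


def assign_medals(scores: dict[str, float]) -> dict[str, str]:
    """Give gold/silver/bronze to the three highest-scoring models on one task."""
    # sorted() is stable, so equal scores keep their input order (assumed tie rule).
    ranked = sorted(scores, key=lambda model: scores[model], reverse=True)
    # zip() stops after three medals, so lower-ranked models get no entry.
    return dict(zip(ranked, MEDALS))


blackhole = {
    "Claude Fable 5": 8.7,   # score quoted on this page
    "Claude Sonnet 5": 5.0,  # score quoted on this page
    "Model A": 9.1,          # hypothetical rival, for illustration only
    "Model B": 8.9,          # hypothetical rival, for illustration only
}
medals = assign_medals(blackhole)
print(medals)
```

With these made-up rival scores, Claude Fable 5 lands in third place and Claude Sonnet 5 gets no medal, which is how a model can score 8.7 on a task and still show only a bronze in a two-column view.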

Where Claude Fable 5 beat Claude Sonnet 5

The tasks where I gave Claude Fable 5 a higher 0–10 score on the same prompt — with the actual commentary from my source guides.

Aurora Visual
Claude Fable 5 8.6 · Claude Sonnet 5 2.5 (+6.1) · shader aurora curtains

What I saw: Beautiful WebGL fragment shader with layered green-blue aurora ribbons, twinkling starfield, snow-rimmed mountain silhouette, and elegant title typography — genuinely convincing northern lights. Interactive mouse sway and click surge plus the reflected glow push it to task-topping polish.

Solar Sim
Claude Fable 5 8.0 · Claude Sonnet 5 2.5 (+5.5)

What I saw: Renders cleanly with all 8 planets, elliptical orbit rings, glowing sun, Saturn's ring, and a polished title/legend UI; log-scaled distances plus interactive orbit/zoom controls are a thoughtful touch. Weaknesses are flat-colored planets (no textures) and inner planets crowding t…

Orbit Sim
Claude Fable 5 7.6 · Claude Sonnet 5 3.5 (+4.1)

What I saw: Renders cleanly with a glowing sun, colored planets, starfield and solid 3D orbit/zoom/spawn controls plus real N-body physics with softening and substeps; however the screenshot shows no visible orbital trails and a somewhat sparse, static-looking layout, keeping it strong-but-g…

Blackhole Sim
Claude Fable 5 8.7 · Claude Sonnet 5 5.0 (+3.7) · Interstellar-grade lensing

What I saw: Strong Interstellar-style render with a clean event-horizon shadow, tilted accretion disk wrapping over/under the black hole, visible Doppler beaming brightening one side, and a subtle lensed arc — all polished with tasteful nebula/starfield and typography. Minor nit is the sligh…

Wormhole Sim
Claude Fable 5 6.5 · Claude Sonnet 5 3.0 (+3.5)

What I saw: The tunnel core with depth-fading rings and colorful particles reads convincingly as a wormhole and the UI/title are clean, but the flat teal fog fills most of the frame instead of an immersive tunnel, and the oversized foreground particle squares plus the misplaced solid-tube ge…

Where Claude Sonnet 5 beat Claude Fable 5

The tasks where I gave Claude Sonnet 5 a higher 0–10 score on the same prompt — with the actual commentary from my source guides.

Crypt Game
Claude Sonnet 5 6.5 · Claude Fable 5 2.0 (+4.5)

What I saw: Renders with atmospheric HUD, functional minimap showing the maze, and a working torch-lit dungeon architecture in code, but the screenshot is too dark/muddy with no visible walls, torches, or dungeon geometry — the dim ambient/fog balance undersells the crawler and looks flat ra…

Doom Game
Claude Sonnet 5 8.0 · Claude Fable 5 6.2 (+1.8)

What I saw: Renders a clean raycaster maze with atmospheric red-lit walls, working minimap, HUD health/kills bar, crosshair and a monster visible at screen edge; solid feature set (hitscan shooting, chasing AI, damage flash, touch controls) makes it strong and shippable, though the wall shad…

Fireworks Visual
Claude Sonnet 5 8.3 · Claude Fable 5 7.2 (+1.1)

What I saw: Strong 3D scene with starfield, skyline silhouette, additive-blended particle bursts and a polished shimmering title—clearly on-brief and shippable. Particles read slightly blocky/square rather than glowing sparks, and the depth composition feels a touch flat, keeping it just shy…

Claude Sonnet 5 3.0 · Claude Fable 5 2.5 (+0.5)

What I saw: UI overlays (title, kills, weather, HP bar, hint) render correctly but the 3D scene is completely black — no terrain, trees, player, or enemies visible, meaning the WebGL world failed to render despite solid source code. A non-rendering core makes this effectively broken for a 3D RPG task.

Game Game
Claude Sonnet 5 8.2 · Claude Fable 5 7.8 (+0.4)

What I saw: Strong, polished 3D Three.js build with clean neon aesthetic, glowing player orb, colorful octahedron collectibles, spinning obstacles, shadows, starfield, and full HUD/lives/timer loop — clearly shippable. Falls just short of the field's best due to being a fairly familiar colle…

Strengths & weaknesses I logged

Claude Fable 5

Strengths

  • Best solo Anthropic model on this bench — 7.72 avg beats Opus 4.8 (7.49) and Sonnet 5 (7.18)
  • Wins most head-to-heads vs every solo rival: beats Opus 4.8 on 26/42 tasks and GLM-5.2 on 27/42 — two one-shot crashes, not the craft, cost it the average
  • 10 task-winner tags in a single one-shot run — shader/GPU physics is its superpower (Cornell-box path tracer 8.7, black-hole lensing 8.7, synthwave outrun 8.7)
  • Tops external SWE-bench Verified at 95.0% in Julian's three-dragons writeup

Trade-offs

  • Two one-shot black-screens (crypt, twilightvale) from three.js r128 API drift — called THREE.Geometry / CapsuleGeometry, which the pinned CDN doesn't have
  • Free GLM-5.2 still edges it on creative one-shots (7.77 vs 7.72) at $0 — the $10/$50 premium buys agentic depth, not one-shot visuals

Claude Sonnet 5

Strengths

  • 82.1% SWE-bench Verified — first model past 80% on real GitHub-issue repair
  • Dev Team multi-agent mode + 1M context for repo-level agentic work
  • Precision on hard logic — won the raycaster the open-weight field kept botching

Trade-offs

  • One-shot creative-visual builds trail GLM 5.2 here (lost 4 of 5) — no iteration to catch its own bugs
  • A temporal-dead-zone bug blanked its N-body orbit sim on the first shot

Pricing & context — the spec sheet

Spec | Claude Fable 5 | Claude Sonnet 5
Vendor | Anthropic | Anthropic
Context window | 200,000 tokens (1M with extended thinking) | 1,000,000 tokens
Price | $10 / $50 per M tokens | $3 / $15 per M ($2/$10 intro)
Pricing detail | Released alongside Mythos 5 on June 9, 2026 as the publicly available member of the new Mythos class. Premium per-token pricing on the Anthropic API; available everywhere Opus 4.8 ships. | $3.00 input / $15.00 output per million tokens; introductory $2.00/$10.00 through 2026-08-31.
Release | 2026-06-09 | 2026-06-30
Bench coverage | 42/42 scored · avg 7.72/10 | 42/42 scored · avg 7.18/10
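
To make the per-million-token prices concrete, here is a rough cost sketch for a single one-shot build. The prices come from the spec sheet above; reading Claude Fable 5's "$10 / $50" as input / output is an assumption (the page only spells that split out for Claude Sonnet 5), and the token counts (2,000 in, 20,000 out) are hypothetical placeholders, not measurements from the bench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens


PRICES = {
    "Claude Fable 5": Price(10.0, 50.0),          # input/output split assumed
    "Claude Sonnet 5": Price(3.0, 15.0),
    "Claude Sonnet 5 (intro)": Price(2.0, 10.0),  # through 2026-08-31 per the page
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million prices."""
    price = PRICES[model]
    total = input_tokens * price.input_per_m + output_tokens * price.output_per_m
    return total / 1_000_000


# Hypothetical one-shot build: short prompt in, one large HTML file out.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 2_000, 20_000):.3f}")
```

Under these assumed token counts the output side dominates the bill, so the gap between the two models tracks the output price ratio far more than the input one.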

The verdict — which should you pick?

Across 42 scored shared tasks, Claude Fable 5 averaged 7.72/10, beating Claude Sonnet 5's 7.18/10 by 0.54 points. Pick Claude Fable 5 when the build has to ship on the first prompt and you can afford the trade-offs I logged above.
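
The arithmetic behind a head-to-head record is simple enough to sketch. The snippet below tallies wins, losses and ties from per-task score pairs and checks the 0.54-point average gap. It uses only the nine titled score pairs quoted earlier on this page, so its tally covers that excerpt rather than the full 42-task run; the names `PAIRS` and `tally` are illustrative, not part of any published tooling.

```python
# (Claude Fable 5 score, Claude Sonnet 5 score) for the titled tasks quoted on this page.
PAIRS = {
    "Aurora": (8.6, 2.5),
    "Solar": (8.0, 2.5),
    "Orbit": (7.6, 3.5),
    "Blackhole": (8.7, 5.0),
    "Wormhole": (6.5, 3.0),
    "Crypt": (2.0, 6.5),
    "Doom": (6.2, 8.0),
    "Fireworks": (7.2, 8.3),
    "Game": (7.8, 8.2),
}


def tally(pairs: dict[str, tuple[float, float]]) -> tuple[int, int, int]:
    """Return (first-model wins, second-model wins, ties) across all tasks."""
    wins_first = sum(a > b for a, b in pairs.values())
    wins_second = sum(b > a for a, b in pairs.values())
    ties = len(pairs) - wins_first - wins_second
    return wins_first, wins_second, ties


fable_wins, sonnet_wins, ties = tally(PAIRS)
print(f"Excerpted tasks: Fable {fable_wins}, Sonnet {sonnet_wins}, ties {ties}")

# Gap between the two bench averages reported on this page.
average_gap = round(7.72 - 7.18, 2)
print(f"Average gap: {average_gap}")
```

Running the same function over all 42 score pairs is what would produce the full record; the excerpt here is only the largest gaps in each direction.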

If you only run one of these inside your stack, the head-to-head average above is the call. If you can run both, my honest play is to wire both Claude Fable 5 and Claude Sonnet 5 into the Agent Operating System and dispatch each from the kanban by task type: mission-critical one-shot builds where you want Anthropic's newest reasoning go to Claude Fable 5, and agentic software engineering (write / run / test / fix loops on real repos) goes to Claude Sonnet 5. That's the same setup I run for the 3,600+ founders inside the AI Profit Boardroom.
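
The dispatch-by-task-type idea can be sketched as a small lookup. This is a hypothetical illustration only: the task-type labels `one_shot_build` and `agentic_swe` and the `pick_model` function are invented here, and nothing below reflects how the Agent Operating System's kanban is actually implemented.

```python
# Hypothetical task-type labels mapped to the model recommended for each on this page.
ROUTES = {
    "one_shot_build": "Claude Fable 5",  # mission-critical one-shot builds
    "agentic_swe": "Claude Sonnet 5",    # write / run / test / fix loops on real repos
}


def pick_model(task_type: str) -> str:
    """Return the model name for a task type, failing loudly on unknown types."""
    try:
        return ROUTES[task_type]
    except KeyError:
        raise ValueError(f"no route for task type {task_type!r}") from None


print(pick_model("one_shot_build"))
print(pick_model("agentic_swe"))
```

Raising on an unknown task type, rather than silently falling back to one model, keeps a mislabeled card from quietly running on the more expensive option.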

FAQ — Claude Fable 5 vs Claude Sonnet 5

Which is better, Claude Fable 5 or Claude Sonnet 5?

On Goldie Bench, Claude Fable 5 averages 7.72/10 across the shared tasks, with 3 gold, 1 silver, 7 bronze overall. Claude Sonnet 5 averages 7.18/10, with 1 gold, 5 silver, 2 bronze. Claude Fable 5 wins the head-to-head 22–15, with 5 ties.

How much does Claude Fable 5 cost vs Claude Sonnet 5?

Claude Fable 5: $10 / $50 per million tokens, premium per-token pricing on the Anthropic API. It was released alongside Mythos 5 on June 9, 2026 as the publicly available member of the new Mythos class and is available everywhere Opus 4.8 ships. Claude Sonnet 5: $3.00 input / $15.00 output per million tokens; introductory $2.00/$10.00 through 2026-08-31.

What's the context window for Claude Fable 5 vs Claude Sonnet 5?

Claude Fable 5 has a 200,000-token context window (1M with extended thinking). Claude Sonnet 5 has a 1,000,000-token context window.

When should I pick Claude Fable 5 over Claude Sonnet 5?

Pick Claude Fable 5 for:

  • Mission-critical one-shot builds where you want Anthropic's newest reasoning
  • Long-context work using extended thinking up to 1M tokens
  • Plan-heavy multi-step tasks where intelligence in the plan matters more than the build

The trade-offs are the weaknesses I logged on the bench:

  • Two one-shot black-screens (crypt, twilightvale) from three.js r128 API drift: it called THREE.Geometry / CapsuleGeometry, which the pinned CDN doesn't have
  • Free GLM-5.2 still edges it on creative one-shots (7.77 vs 7.72) at $0; the $10/$50 premium buys agentic depth, not one-shot visuals

When should I pick Claude Sonnet 5 over Claude Fable 5?

Pick Claude Sonnet 5 for:

  • Agentic software engineering: write / run / test / fix loops on real repos
  • Repo-level reasoning across a 1M-token context (Dev Team multi-agent mode)
  • Precise logic (raycasters, physics) where one-shot open models slip

The trade-offs are the weaknesses I logged on the bench:

  • One-shot creative-visual builds trail GLM 5.2 here (lost 4 of 5), with no iteration to catch its own bugs
  • A temporal-dead-zone bug blanked its N-body orbit sim on the first shot

How does Goldie Bench score Claude Fable 5 vs Claude Sonnet 5?

Every demo on this page was built by Julian Goldie inside the Agent Operating System — same fixed prompt for both models, one shot, single HTML file out. Each result gets a 0–10 score on whether it ran, how close it hit the brief, and how good it looked. The highest score on each task gets gold; second gets silver; third gets bronze. See methodology for full provenance.

The same stack Julian uses

Run this stack yourself.

Every demo on this bench was built inside the Agent Operating System — one prompt, one shot, single HTML file out. The Agent OS, the prompts, the templates, the weekly walkthroughs and 3,600+ founders shipping with it every day all live inside the AI Profit Boardroom.

3,600+ founders · 258 documented wins · 38 countries · $59/mo