Q: What's the prompt for the Landing test?

Landing Page — modern marketing landing page (one-shot). Every model receives this exact prompt, one shot, single HTML file out.

Q: What's the best AI model for Landing?

Fugu Ultra. Sakana Fugu Ultra shipped a 32KB Apple-keynote landing, bigger than Fusion's 20KB attempt at the same prompt. Animated mesh gradient, multi-section, polished. It cost $0.32 vs Fusion's $1.30 for the same prompt: about 4× cheaper, with a denser result.

Q: How many AI models attempted Landing?

23 models on Goldie Bench have attempted Landing: Claude Fable 5, Fugu Ultra, Fugu Mini, Fusion, Gemini 3.6 Flash, GLM-5.2, GPT-5.6 Sol, Grok, Inkling, Kimi K2.7, Kimi K3, MiniMax M3, Hermes MoA, Opus 4.8, Claude Opus 5, Qwen 3.8, Qwen 3.7, Claude Sonnet 5, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality, DeepSeek V4 Pro, DeepSeek V4 Flash.

Category: Page

Models tested: 23

Scored: 18/23

Avg score: 8.17/10

Winner: Fugu Ultra

What I asked each model — the Landing prompt

Every model on this page got this exact prompt inside the Agent Operating System: Landing Page — modern marketing landing page (one-shot).

Single HTML file out. No iteration. No examples in the system prompt. Whatever each model produced on the first run is what's on this page. 23 frontier models have attempted it so far: Claude Fable 5, Fugu Ultra, Fugu Mini, Fusion, Gemini 3.6 Flash, GLM-5.2, GPT-5.6 Sol, Grok, Inkling, Kimi K2.7, Kimi K3, MiniMax M3, Hermes MoA, Opus 4.8, Claude Opus 5, Qwen 3.8, Qwen 3.7, Claude Sonnet 5, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality, DeepSeek V4 Pro, DeepSeek V4 Flash.

Why this task matters. Landing is a textbook test of page-class capability — the kind of build that exposes whether a model is doing pattern-matching or actual reasoning. Shipping this cleanly is the floor for what I expect from a frontier model — every model on the leaderboard should at least attempt it.

How each model handled Landing

Ranked by my 0–10 score from the source comparison guides on agentos.guide. Click any to play the actual one-shot HTML the model produced.

Claude Fable 5 Anthropic

• 7.2/10

What I saw: Renders with a slick animated 3D starfield, floating shapes, and clean glassmorphic feature cards, but the screenshot captures the features section with the torus knot visually intruding over card text — a composition/legibility issue that undercuts polish. Strong tech and theme cohesion, but the overlapping 3D object bleeding into content keeps it below the field's best.

▶ Play Claude Fable 5's attempt →

Fugu Ultra Sakana AI

🥇 9.0/10 · winner · denser build

What I saw: Sakana Fugu Ultra shipped a 32KB Apple-keynote landing, bigger than Fusion's 20KB attempt at the same prompt. Animated mesh gradient, multi-section, polished. It cost $0.32 vs Fusion's $1.30 for the same prompt: about 4× cheaper, with a denser result.

▶ Play Fugu Ultra's attempt →
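The "4× cheaper, denser" claim can be checked with a little arithmetic. This is a minimal sketch using only the four figures quoted above; the variable names are illustrative and not part of the benchmark itself.

```python
# Figures quoted in the Fugu Ultra write-up; names are illustrative only.
fugu_cost_usd = 0.32
fusion_cost_usd = 1.30
fugu_size_kb = 32
fusion_size_kb = 20

# How many times more Fusion's run cost than Fugu Ultra's.
cost_ratio = fusion_cost_usd / fugu_cost_usd

# How much larger Fugu Ultra's single HTML file is than Fusion's.
size_ratio = fugu_size_kb / fusion_size_kb

print(f"Fusion cost {cost_ratio:.2f}x as much as Fugu Ultra")
print(f"Fugu Ultra's file is {size_ratio:.1f}x the size of Fusion's")
```

The cost ratio works out to roughly 4.06, which is what the rounded "4×" refers to, and the file is 1.6 times larger. File size is only a rough proxy for density, so the second number supports the claim rather than proving it.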

Fugu Mini Sakana AI

• 7.5/10

What I saw: The landing page has no input by design, so a MAYBE-STATIC smoke-test result is expected for a static-design prompt.

▶ Play Fugu Mini's attempt →

Fusion OpenRouter

🥇 9.0/10 · winner · Apple keynote aesthetic

What I saw: Animated mesh-gradient background (4 drifting blobs in screen blend), noise grain overlay, vignette, glass nav with smooth scroll. Reads like an actual Apple product page. Better composed than any other landing attempt.

▶ Play Fusion's attempt →

Gemini 3.6 Flash Google

• 8.6/10 · Interactive 3D Hero

What I saw: Stunning interactive Three.js WebGL background with glowing neural sphere, crisp gradient headline, polished glass CTAs, and thoughtful interactive controls (drag/scroll, spectrum shift), all visually elevated above a generic marketing page. The minor weak point is the somewhat filler AI-jargon copy, but the execution and depth are clearly top-tier.

▶ Play Gemini 3.6 Flash's attempt →

GLM-5.2 Zhipu / Z.ai

🥇 9.0/10 · tie · top

What I saw: Funniest result of the lot: GLM and Opus independently produced near-identical premium 'Introducing Nova 1 — Intelligence, reimagined / distilled' keynote pages — gradient hero, full nav, pricing tiers. A dead heat. Kimi's was a plainer set of feature cards.

▶ Play GLM-5.2's attempt →

GPT-5.6 Sol OpenAI

• 8.7/10 · polished SaaS hero

What I saw: Strong: cohesive dark cosmic theme, gradient-text headline, and a genuinely convincing dashboard mockup with charts, metrics, floating insight tooltip and glowing orbs that reads as premium SaaS. Weak: standard hero-left/visual-right layout and slight metric label overlap, but overall polish and detail keep it close to the field's best.

▶ Play GPT-5.6 Sol's attempt →

Grok xAI

🥇 9.0/10

What I saw: A genuinely premium keynote page: clean nav, a gradient headline, dual buttons, tasteful type. From one sentence. Grok Build's best work of the lot.

▶ Play Grok's attempt →

Inkling Thinking Machines

• 7.2/10

What I saw: Clean, polished dark landing with nice gradient mesh, glass cards with per-card glow accents, and a strong CTA button — but the hero H1 is clipped by the parallax offset so only 'Brand' shows (the 'Illuminate Your Brand' headline is cut off at top), plus the header/nav is scrolled out of frame, which hurts first impression. Solid execution, generic-modern copy, and the truncated headline keeps it below the top of the field.

▶ Play Inkling's attempt →

Kimi K2.7 Moonshot AI

• 6.5/10

▶ Play Kimi K2.7's attempt →
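To see the ranking and the tie at the top at a glance, the scores listed above can be tallied in a few lines. This is a sketch over the ten models shown in this section only; the dictionary and variable names are illustrative, and the page does not say which scored models its own average covers.

```python
from statistics import mean

# Scores exactly as listed for the ten models shown in this section.
scores = {
    "Claude Fable 5": 7.2,
    "Fugu Ultra": 9.0,
    "Fugu Mini": 7.5,
    "Fusion": 9.0,
    "Gemini 3.6 Flash": 8.6,
    "GLM-5.2": 9.0,
    "GPT-5.6 Sol": 8.7,
    "Grok": 9.0,
    "Inkling": 7.2,
    "Kimi K2.7": 6.5,
}

# Sort highest first; sorted() is stable, so ties keep their listing order.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

top_score = ranked[0][1]
tied_at_top = [name for name, score in ranked if score == top_score]
average = round(mean(scores.values()), 2)

for name, score in ranked:
    print(f"{score:>4.1f}  {name}")
print(f"Tied at {top_score}: {', '.join(tied_at_top)}")
print(f"Average of the ten listed: {average}")
```

Four models share the top score of 9.0 (Fugu Ultra, Fusion, GLM-5.2 and Grok), and the ten listed scores average 8.17, which matches the average quoted in the summary.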

The winner on Landing

Fugu Ultra took gold on this task, winning on the denser build.

See Fugu Ultra's full model card: /models/fugu.

Every attempt — live, playable

Side by side. Click any tile to run that model's actual one-shot HTML in a new tab.

How I scored Landing — methodology

Related

Run this stack yourself.