Q: What's the prompt for the Neoncity test?

Neon City — cyberpunk neon-lit city you drive through. Every model receives this exact prompt, one shot, single HTML file out.

Q: What's the best AI model for Neoncity?

GLM-5.2 — GLM's is the most cinematic — neon towers, a setting sun, Japanese signage and a flight HUD, like a frame from a film. Opus's is a clean canyon of lit skyscrapers racing to a vanishing point. Kimi leaned into the synthwave sun and grid more than the city itself. GLM wins the skyline.

Q: How many AI models attempted Neoncity?

23 models on Goldie Bench have attempted Neoncity: Claude Fable 5, Fugu Ultra, Fugu Ultra 1.1, Fugu Mini, Fusion, Gemini 3.6 Flash, GLM-5.2, GPT-5.6 Sol, Grok, Inkling, Kimi K2.7, Kimi K3, MiniMax M3, Hermes MoA, Opus 4.8, Claude Opus 5, Qwen 3.7, Claude Sonnet 5, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality, DeepSeek V4 Pro, DeepSeek V4 Flash.

Category: Game

Models tested: 23

Scored: 18/23

Avg score: 7.66/10

Winner: GLM-5.2
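As a rough illustration of how summary figures like these can be derived, here is a minimal sketch that counts tested and scored attempts, averages the scored ones, and picks the top score. It uses only the ten scores listed in the per-model entries on this page, plus one hypothetical unscored placeholder, so its average will not match the 7.66/10 reported across all 18 scored attempts. The `summarize` helper and the placeholder entry are assumptions made for illustration, not the page's actual tooling.

```python
from statistics import mean

# Scores copied from the per-model entries on this page. None marks an
# unscored attempt; the placeholder entry is hypothetical, not a real model.
scores = {
    "Claude Fable 5": 8.1,
    "Fugu Ultra": 8.5,
    "Fugu Ultra 1.1": 2.3,
    "Fugu Mini": 8.5,
    "Fusion": 8.5,
    "Gemini 3.6 Flash": 6.3,
    "GLM-5.2": 9.0,
    "GPT-5.6 Sol": 8.6,
    "Grok": 8.0,
    "Inkling": 6.3,
    "Example Unscored Model": None,
}


def summarize(scores):
    """Return tested/scored counts, the mean of scored entries, and the top scorer."""
    scored = {name: s for name, s in scores.items() if s is not None}
    return {
        "tested": len(scores),
        "scored": len(scored),
        "average": round(mean(scored.values()), 2),
        "winner": max(scored, key=scored.get),
    }


summary = summarize(scores)
print(summary)
```

Unscored attempts are excluded from the mean rather than counted as zero, which is one plausible reading of "Scored 18/23" sitting next to a separate average.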

What I asked each model — the Neoncity prompt

Every model on this page got this exact prompt inside the Agent Operating System: Neon City — cyberpunk neon-lit city you drive through.

Single HTML file out. No iteration. No examples in the system prompt. Whatever each model produced on the first run is what's on this page. 23 frontier models have attempted it so far: Claude Fable 5, Fugu Ultra, Fugu Ultra 1.1, Fugu Mini, Fusion, Gemini 3.6 Flash, GLM-5.2, GPT-5.6 Sol, Grok, Inkling, Kimi K2.7, Kimi K3, MiniMax M3, Hermes MoA, Opus 4.8, Claude Opus 5, Qwen 3.7, Claude Sonnet 5, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality, DeepSeek V4 Pro, DeepSeek V4 Flash.

Why this task matters. Neoncity is a textbook test of game-class capability — the kind of build that exposes whether a model is doing pattern-matching or actual reasoning. A model that ships this in one shot is usually safe to wire into your agent loop for harder tasks of the same shape.

How each model handled Neoncity

Each model is scored on my 0–10 scale, taken from the source comparison guides on agentos.guide. Click any entry to play the actual one-shot HTML the model produced.

Claude Fable 5 Anthropic

• 8.1/10

What I saw: Strong on-brief cyberpunk drive: clean neon road edges, glowing buildings with colored strips, magenta hood reflection and a proper first-person perspective all render well. The near hood reads as a solid black block cutting off the lower third and city density is a bit sparse, keeping it just below the top tier.

▶ Play Claude Fable 5's attempt →

Fugu Ultra Sakana AI

• 8.5/10

What I saw: Ultra v2 — cyberpunk neon-city flythrough. Smoke-test PASS (9.1% pixel diff).

▶ Play Fugu Ultra's attempt →
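The "pixel diff" figure in the smoke-test notes is not defined on this page. One plausible reading is that two screenshots are captured a moment apart and the share of changed pixels is measured, so a scene that never animates or never renders would score near zero. The sketch below works under that assumption; `pixel_diff_percent`, `smoke_test`, and the `min_percent` threshold are hypothetical names and values, not the benchmark's actual implementation.

```python
def pixel_diff_percent(frame_a, frame_b, tolerance=0):
    """Percentage of pixels whose RGB values differ between two frames.

    Each frame is a flat list of (r, g, b) tuples of equal length.
    A pixel counts as changed if any channel differs by more than `tolerance`.
    """
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and the same size")
    changed = sum(
        1
        for pa, pb in zip(frame_a, frame_b)
        if any(abs(ca - cb) > tolerance for ca, cb in zip(pa, pb))
    )
    return 100.0 * changed / len(frame_a)


def smoke_test(frame_a, frame_b, min_percent=1.0):
    """Pass if enough pixels changed between frames (threshold is an assumption)."""
    return pixel_diff_percent(frame_a, frame_b) >= min_percent


# Toy 10x10 frames: all black, then 9 pixels lit magenta in the second frame.
first = [(0, 0, 0)] * 100
second = [(255, 0, 255)] * 9 + [(0, 0, 0)] * 91

diff = pixel_diff_percent(first, second)
passed = smoke_test(first, second)
print(diff, passed)
```

A diff check like this only confirms that something on screen is moving. It says nothing about whether the scene looks like a city, which is why the scores above also rest on visual review.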

Fugu Ultra 1.1 Sakana AI

• 2.3/10

What I saw: HUD panels (health/boost/kills/crosshair) render nicely but the entire 3D scene is black — no city, road, car, or enemies visible, indicating the Three.js world failed to render. A non-functional walking/driving sim regardless of the ambitious source.

▶ Play Fugu Ultra 1.1's attempt →

Fugu Mini Sakana AI

• 8.5/10

What I saw: Cyberpunk neon city flythrough. Smoke-test PASS (9.6% pixel diff — strong motion).

▶ Play Fugu Mini's attempt →

Fusion OpenRouter

• 8.5/10

What I saw: Cyberpunk neon-city flythrough on three.js with WebGL + UnrealBloomPass. Towers, light trails, foggy depth. Drag to look around.

▶ Play Fusion's attempt →

Gemini 3.6 Flash Google

• 6.3/10

What I saw: Polished cyberpunk HUD (hull, nitro, kill counter, crosshair) and clean neon vignette look strong, but the rendered scene is mostly a dark flat plane with faint magenta glows and sparse enemy hovers — it reads more as an empty arena than a vibrant neon city you drive through, and the visible environment is thin compared to the field's best.

▶ Play Gemini 3.6 Flash's attempt →

GLM-5.2 Zhipu / Z.ai

🥇 9.0/10 · winner · cinematic

What I saw: GLM's is the most cinematic — neon towers, a setting sun, Japanese signage and a flight HUD, like a frame from a film. Opus's is a clean canyon of lit skyscrapers racing to a vanishing point. Kimi leaned into the synthwave sun and grid more than the city itself. GLM wins the skyline.

▶ Play GLM-5.2's attempt →

GPT-5.6 Sol OpenAI

🥉 8.6/10 · immersive neon drive

What I saw: Strong on-brief cyberpunk drive with lit facades in varied neon hues, receding lane markers, street lamps, and a polished HUD/title that sells the midnight-run vibe; only minor weakness is the flat road texture and slightly bare distant horizon, but overall it reads as a genuine city you drive through.

▶ Play GPT-5.6 Sol's attempt →

Grok xAI

• 8.0/10

What I saw: Cyberpunk neon city flythrough with light trails, holograms, fog. Drag to look around. 19KB.

▶ Play Grok's attempt →

Inkling Thinking Machines

• 6.3/10

What I saw: Strong neon typography and clean UI, and it does render a driveable 3D city, but the buildings read as flat matte blocks (emissive materials don't self-glow without bloom) and the road looks like a plain blue plane, giving a flat, unpolished cyberpunk feel well short of the field's best.

▶ Play Inkling's attempt →
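The remark about emissive materials and bloom can be illustrated with a toy example. An emissive surface only makes its own pixels bright; the halo around a neon strip comes from a post-processing step that extracts bright pixels, blurs them, and adds the result back onto the image. The sketch below applies that idea to a single row of brightness values. It is a simplified 1D stand-in, not Three.js's `UnrealBloomPass`, and the `bloom_1d` function and its parameters are assumptions made for illustration.

```python
def bloom_1d(row, threshold=0.8, radius=2, strength=0.5):
    """Apply a toy bloom to a row of brightness values in [0, 1]."""
    # 1. Bright pass: keep only pixels at or above the threshold.
    bright = [v if v >= threshold else 0.0 for v in row]

    # 2. Box blur: spread each bright pixel over its neighbours.
    width = 2 * radius + 1
    blurred = []
    for i in range(len(row)):
        lo = max(0, i - radius)
        hi = min(len(row), i + radius + 1)
        blurred.append(sum(bright[lo:hi]) / width)

    # 3. Additive blend: add the blurred glow to the original, clamped to 1.0.
    return [min(1.0, v + strength * b) for v, b in zip(row, blurred)]


# One fully bright "emissive" pixel in a dim row.
row = [0.1, 0.1, 0.1, 1.0, 0.1, 0.1, 0.1]
glow = bloom_1d(row)
print(glow)
```

Without the bloom step the neighbours of the bright pixel stay at their original dim value, which matches the flat, matte look described in the Inkling entry. With it, pixels within the blur radius pick up some of the light.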

The winner on Neoncity

GLM-5.2 took gold on this task with a 9.0/10, tagged "cinematic".

See GLM-5.2's full model card: /models/glm. Direct head-to-head against the runner-up: GLM-5.2 vs Kimi K3.

Every attempt — live, playable

Side by side. Click any tile to run that model's actual one-shot HTML in a new tab.
