Q: What's the prompt for the Raycaster test?

Raycaster Maze — build a Wolfenstein-style 3D maze you can walk through. Every model receives this exact prompt, one shot, single HTML file out.

Q: What's the best AI model for Raycaster?

Kimi K3 — Strong render: multiple distinct textured wall types (brick, blue stone, glowing tech strips, mossy hedge), textured floor/ceiling, minimap, ceiling glow lights and vignette all working with clean UI and controls. Very polished and clearly on-brief; edges out the field with texture variety and emissive detail, though sprites/enemies aren't visible in this frame.

Q: How many AI models attempted Raycaster?

23 models on Goldie Bench have attempted Raycaster: Claude Fable 5, Fugu Ultra, Fugu Ultra 1.1, Fugu Mini, Fusion, Gemini 3.6 Flash, GLM-5.2, GPT-5.6 Sol, Grok, Inkling, Kimi K2.7, Kimi K3, MiniMax M3, Hermes MoA, Opus 4.8, Claude Opus 5, Qwen 3.7, Claude Sonnet 5, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality, DeepSeek V4 Pro, DeepSeek V4 Flash.

Category: Game

Models tested: 23

Scored: 18/23

Avg score: 7.19/10

Winner: Kimi K3

What I asked each model — the Raycaster prompt

Every model on this page got this exact prompt inside the Agent Operating System: Raycaster Maze — build a Wolfenstein-style 3D maze you can walk through.

Single HTML file out. No iteration. No examples in the system prompt. Whatever each model produced on the first run is what's on this page. The same 23 models listed above have attempted it so far.

Why this task matters. Raycaster is a textbook test of game-class capability — the kind of build that exposes whether a model is doing pattern-matching or actual reasoning. A model that ships this in one shot is usually safe to wire into your agent loop for harder tasks of the same shape.

How each model handled Raycaster

Ranked by my 0–10 score from the source comparison guides on agentos.guide. Click any entry to play the actual one-shot HTML the model produced.

Claude Fable 5 Anthropic

• 8.0/10

What I saw: Iterated rebuild renders a clean DDA raycaster maze with colored walls, correct fisheye correction, crosshair, hearts and a zap meter, plus a detailed minimap showing player heading and colored entities. Move/turn/strafe/zap respond (verified). Still-untextured flat walls keep it just under the top tier.

▶ Play Claude Fable 5's attempt →
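For readers who haven't met the terms in that write-up: DDA (digital differential analysis) steps a ray through the grid one cell boundary at a time, and fisheye correction means using the distance along the camera direction instead of the raw ray length. The sketch below is my own illustration, not code from any model's output. The `MAP` grid, the `Hit` type, and `castRay` are hypothetical names, and it assumes the player starts in an open cell of a map with a solid border.

```typescript
// Hypothetical grid: 1 = wall, 0 = open floor. Not taken from any model's output.
const MAP: number[][] = [
  [1, 1, 1, 1, 1],
  [1, 0, 0, 0, 1],
  [1, 0, 0, 0, 1],
  [1, 0, 0, 0, 1],
  [1, 1, 1, 1, 1],
];

interface Hit {
  perpDist: number; // distance measured along the camera direction (fisheye-free)
  side: 0 | 1;      // 0 = hit a wall face on an x grid line, 1 = on a y grid line
}

// Cast one ray from (px, py) at rayAngle; camAngle is the direction the player faces.
// Assumes (px, py) is inside an open cell and the map border is solid.
function castRay(px: number, py: number, rayAngle: number, camAngle: number): Hit {
  const dx = Math.cos(rayAngle);
  const dy = Math.sin(rayAngle);
  let mapX = Math.floor(px);
  let mapY = Math.floor(py);

  // Ray length needed to cross one full cell in x or in y.
  const deltaX = dx === 0 ? Infinity : Math.abs(1 / dx);
  const deltaY = dy === 0 ? Infinity : Math.abs(1 / dy);
  const stepX = dx < 0 ? -1 : 1;
  const stepY = dy < 0 ? -1 : 1;

  // Ray length from the start position to the first x and y grid lines.
  let sideX = (dx < 0 ? px - mapX : mapX + 1 - px) * deltaX;
  let sideY = (dy < 0 ? py - mapY : mapY + 1 - py) * deltaY;

  let side: 0 | 1 = 0;
  // Always advance to whichever grid line is nearer along the ray.
  while (MAP[mapY][mapX] === 0) {
    if (sideX < sideY) {
      sideX += deltaX;
      mapX += stepX;
      side = 0;
    } else {
      sideY += deltaY;
      mapY += stepY;
      side = 1;
    }
  }

  // Length along the ray to the wall face that was hit.
  const rayDist = side === 0 ? sideX - deltaX : sideY - deltaY;
  // Fisheye correction: project the ray length onto the view direction.
  return { perpDist: rayDist * Math.cos(rayAngle - camAngle), side };
}
```

A renderer would typically call this once per screen column and draw a wall slice whose height is proportional to `1 / perpDist`. Using the raw `rayDist` there instead is what makes flat walls look curved.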

Fugu Ultra Sakana AI

🥈 8.5/10

What I saw: 26KB canvas raycaster with WASD + mouse-look + distance fog + weapon bob. Clean implementation, comparable to Fusion's 17KB on the same prompt. ~$0.35 per call — roughly 1/4 the cost of Fusion.

▶ Play Fugu Ultra's attempt →
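Distance fog, mentioned in several of these write-ups, usually amounts to blending each wall colour toward a fog colour as the hit distance grows. Here is a minimal sketch of one linear variant. `applyFog`, `fogStart`, and `fogEnd` are hypothetical names and values of my own, and I have not checked which formula any of the models actually used.

```typescript
type RGB = [number, number, number];

// Blend a wall colour toward the fog colour as distance grows.
// fogStart and fogEnd are hypothetical tuning values in map units.
function applyFog(color: RGB, fog: RGB, dist: number, fogStart = 1, fogEnd = 10): RGB {
  // t = 0 means no fog, t = 1 means fully fogged; clamped to [0, 1].
  const t = Math.min(1, Math.max(0, (dist - fogStart) / (fogEnd - fogStart)));
  return [
    Math.round(color[0] + (fog[0] - color[0]) * t),
    Math.round(color[1] + (fog[1] - color[1]) * t),
    Math.round(color[2] + (fog[2] - color[2]) * t),
  ];
}
```

Walls closer than `fogStart` keep their full colour, and anything past `fogEnd` is drawn in the fog colour, which also hides the far end of long corridors.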

Fugu Ultra 1.1 Sakana AI

• 2.0/10

What I saw: The HUD (health, charge, minimap frame, controls) renders but the entire 3D scene is black with no visible maze, walls, enemies, or floor — the raycaster world failed to render. HOSTILES 0/0 and empty minimap confirm the core build is broken/non-rendering.

▶ Play Fugu Ultra 1.1's attempt →

Fugu Mini Sakana AI

• 7.0/10

What I saw: Gap-fill run for Fugu Mini on the raycaster maze. The smoke test came back MAYBE (0.2% diff): this is a pointer-lock FPS that the auto-test can't fully drive, so it is flagged for manual verification.

▶ Play Fugu Mini's attempt →

Fusion OpenRouter

🥈 8.5/10

What I saw: Pure canvas-2D raycaster with pointer-lock mouse look, WASD movement, shift-to-run, M-map toggle. Internal render resolution scales by aspect for speed. Polished HUD with kbd-styled key hints, FPS counter, click-to-capture overlay. Strong technical implementation.

▶ Play Fusion's attempt →

Gemini 3.6 Flash Google

• 7.8/10

What I saw: Strong polished 3D maze with a detailed plasma weapon model, clean sci-fi HUD, functional circular minimap, and HP/kill tracking; but the screenshot shows plain untextured walls (no Wolfenstein-style texturing) and no enemies visible in view despite the '0/8 hostiles' combat framing, leaving it strong-but-not-top.

▶ Play Gemini 3.6 Flash's attempt →

GLM-5.2 Zhipu / Z.ai

• 6.5/10

What I saw: Kimi nailed it — brick walls, a checkered floor, a clean minimap, textbook Wolfenstein, runs clean out of the box. Opus's is close and more atmospheric: warm fog and a vignette down a stone corridor (A/D to turn, W/S to move). GLM's engine is genuinely good — brick and mossy-stone walls, fog, a minimap — but its one-shot spawned the player buried inside a wall, dead on arrival; I nudged the start one cell so you can actually walk it. That spawn bug is why it scores lowest here, even though the e

▶ Play GLM-5.2's attempt →
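A spawn-inside-a-wall bug like the one described above can be guarded against with a check at startup. The sketch below is one possible guard, not GLM's code and not the one-cell nudge I applied. `findOpenSpawn` and `safeSpawn` are hypothetical helpers, and the map is assumed to use 0 for open floor and any other value for a wall.

```typescript
interface Point {
  x: number;
  y: number;
}

// Return the centre of the first open cell (value 0), scanning row by row,
// or null if the map has no open cell at all.
function findOpenSpawn(map: number[][]): Point | null {
  for (let y = 0; y < map.length; y++) {
    for (let x = 0; x < map[y].length; x++) {
      if (map[y][x] === 0) return { x: x + 0.5, y: y + 0.5 };
    }
  }
  return null;
}

// Keep the requested spawn if its cell is open; otherwise fall back to
// the first open cell found by findOpenSpawn.
function safeSpawn(map: number[][], x: number, y: number): Point | null {
  const row = map[Math.floor(y)];
  if (row !== undefined && row[Math.floor(x)] === 0) return { x, y };
  return findOpenSpawn(map);
}
```

Spawning at the cell centre (the `+ 0.5`) keeps the player away from the cell edges, so the first frame does not start pressed against a wall face.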

GPT-5.6 Sol OpenAI

• 8.4/10 · polished neon raycaster

What I saw: Strong textured raycaster with clean perspective, distinct colored walls, a working live minimap, HUD weapon, shard/level system and full mobile+mouse controls; polished neon aesthetic just shy of topping the field but clearly shippable.

▶ Play GPT-5.6 Sol's attempt →

Grok xAI

• 8.0/10

What I saw: Canvas 2D raycaster maze with WASD + mouse-look, floor/ceiling, distance fog, weapon bob. 20KB.

▶ Play Grok's attempt →

Inkling Thinking Machines

• 4.5/10

What I saw: Polished gradient title/HUD and clean colorful walls, but the initial render looks flat and disorienting — walls float in a black void with no visible floor/ceiling, and the camera appears to be facing into a wall rather than presenting a readable corridor. The build uses full 3D meshes with collision (functional, walkable) but the framing/lighting fails to sell a convincing Wolfenstein-style maze on first view.

▶ Play Inkling's attempt →

The winner on Raycaster

Kimi K3 took gold on this task for its textured raycaster polish.

What I saw: Strong render: multiple distinct textured wall types (brick, blue stone, glowing tech strips, mossy hedge), textured floor/ceiling, minimap, ceiling glow lights and vignette all working with clean UI and controls. Very polished and clearly on-brief; edges out the field with texture variety and emissive detail, though sprites/enemies aren't visible in this frame.

See Kimi K3's full model card: /models/kimik3. Direct head-to-head against the runner-up: Kimi K3 vs Fugu Ultra.

Every attempt — live, playable

Side by side. Click any tile to run that model's actual one-shot HTML in a new tab.
