Solar — accurate planetary solar system.
What I asked each model — the Solar prompt
Every model on this page got this exact prompt inside the Agent Operating System: Solar — accurate planetary solar system.
Single HTML file out. No iteration. No examples in the system prompt. Whatever each model produced on the first run is what's on this page. 6 frontier models have attempted it so far: GLM-5.2, Kimi K2.7, Opus 4.8, Kimi K2.7 · Fast, Kimi K2.7 · No-Think, Kimi K2.7 · Quality.
Why this task matters. Solar is a textbook test of sim-class capability — the kind of build that exposes whether a model is doing pattern-matching or actual reasoning. Shipping this cleanly is the floor for what I expect from a frontier model — every model on the leaderboard should at least attempt it.
How each model handled Solar
Ranked by my 0–10 score from the source comparison guides on agentos.guide. Click any entry to play the actual one-shot HTML the model produced.
What I saw: Three genuinely good space sims. Opus tilts the orbits into real 3D with a bloom-heavy sun and Saturn's rings. GLM's is the most product-like — labelled planets, orbit and label toggles, a clean HUD. Kimi's is a tidy tilted-orbit system with rings and a deep starfield. Opus and GLM are neck-and-neck; Opus takes it on the 3D feel.
Demo on the bench. Not scored yet — play it and form your own opinion.
The winner on Solar
GLM-5.2 took gold on this task.
See GLM-5.2's full model card: /models/glm.
Every attempt — live, playable
Side by side. Click any tile to run that model's actual one-shot HTML in a new tab.
How I scored Solar — methodology
Three axes, 0–10 each, averaged.
Runs: drop the .html in a browser; if it opens to a broken page, it scores zero.
Hits the brief: did the model ship the thing the prompt asked for, or a different thing it found easier?
Looks good: visual polish, motion, interactivity. This is where most of the gap between gold and silver lives.
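To make the arithmetic concrete, here is a minimal sketch of the three-axis average in Python. The names `AxisScores` and `overall_score` are hypothetical, and the numbers in the usage lines are made-up inputs, not scores from this page. It also assumes that a broken page zeroes only the Runs axis; if a broken page is meant to zero the whole entry, the function would return 0 whenever `runs` is 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisScores:
    """Hypothetical container for the three axes, each on a 0-10 scale."""
    runs: float        # 0 if the .html opens to a broken page
    hits_brief: float  # did the model ship what the prompt asked for
    looks_good: float  # visual polish, motion, interactivity


def overall_score(scores: AxisScores) -> float:
    """Return the plain mean of the three axes.

    Assumption: a broken page zeroes only the Runs axis,
    so the other two axes still count toward the average.
    """
    axes = (scores.runs, scores.hits_brief, scores.looks_good)
    for value in axes:
        if not 0 <= value <= 10:
            raise ValueError(f"axis score {value} is outside 0-10")
    return sum(axes) / len(axes)


# Illustrative inputs only; these are not scores from the bench.
example = AxisScores(runs=9, hits_brief=8, looks_good=7)
print(overall_score(example))
```

Under this reading, an entry that fails to open can still earn points on the other two axes, which is why the interpretation of "scores zero" matters for entries near the bottom of the ranking.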
My scores trace back to the source comparison guides on agentos.guide. See the full methodology page for data provenance, including which source guide each cell's score came from.
Run this stack yourself.
Every demo on this bench was built inside the Agent Operating System — one prompt, one shot, single HTML file out. The Agent OS, the prompts, the templates, the weekly walkthroughs and 3,600+ founders shipping with it every day all live inside the AI Profit Boardroom.