Hermes · Mixture of Agents

Hermes MoA

A panel of frontier models, merged by a chair. The model doesn't matter — the system does.

Context: Varies (the sum of the panel models' contexts: Opus 4.8 + GPT-5.5)
Pricing: Panel + aggregator calls (via OpenRouter)
Tasks tested: 42
Avg score: 8.38/10
Medals: 🥇 12 · 🥈 8 · 🥉 4
Release: 2026-06-28
Official site: Hermes Agent OS
Official vendor source
Hermes MoA is built by Hermes · Mixture of Agents — see the vendor's own product page, pricing, and docs at Hermes Agent OS.

What is Hermes MoA?

Hermes MoA is the Mixture of Agents system from Hermes, released 2026-06-28. Its context window varies: it is the sum of the panel models' contexts (Opus 4.8 + GPT-5.5 on the default panel). Official source: Hermes Agent OS.

Pricing detail. Hermes Mixture of Agents dispatches one prompt to a configurable panel of frontier models in parallel, then a named aggregator reads every draft and writes one better final answer. Default panel: Claude Opus 4.8 + GPT-5.5, aggregated by Opus 4.8 — all via the OpenRouter key. Unlike a black-box ensemble, every slot is yours to swap from the Mixture tab in the Agent OS.
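The fan-out-then-merge flow described above can be sketched in a few lines. This is a minimal illustration, not the Hermes implementation: the `Model` type, `make_stub`, and `mixture_of_agents` are hypothetical names, and the stub models only echo text instead of calling OpenRouter.

```python
import asyncio
from typing import Awaitable, Callable

# A "model" here is any async function from prompt to text. These are
# hypothetical stand-ins, not real OpenRouter or Hermes calls.
Model = Callable[[str], Awaitable[str]]


def make_stub(name: str) -> Model:
    """Build a fake model that labels its output with its own name."""
    async def call(prompt: str) -> str:
        await asyncio.sleep(0)  # yield control, as a network call would
        return f"[{name}] draft for: {prompt}"
    return call


async def mixture_of_agents(prompt: str, panel: list[Model], aggregator: Model) -> str:
    # 1. Fan the same prompt out to every panel slot in parallel.
    drafts = await asyncio.gather(*(model(prompt) for model in panel))
    # 2. Hand every draft to the aggregator, which writes the final answer.
    numbered = "\n".join(f"Draft {i + 1}: {d}" for i, d in enumerate(drafts))
    merge_prompt = f"Task: {prompt}\n{numbered}\nWrite one better final answer."
    return await aggregator(merge_prompt)


def run_demo() -> str:
    # Assumed default line-up from the text: two panel slots plus one aggregator.
    panel = [make_stub("opus-4.8"), make_stub("gpt-5.5")]
    aggregator = make_stub("opus-4.8-aggregator")
    return asyncio.run(mixture_of_agents("build a galaxy demo", panel, aggregator))
```

Swapping a slot in this sketch means passing a different callable in `panel` or as `aggregator`; the orchestration function itself does not change.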

How I use it inside the Agent OS. Run from the Mixture tab in the Hermes Agent OS. On this bench the panel built each demo and the aggregator merged the best of every draft.

What I built with Hermes MoA

Every model on Goldie Bench gets the same fixed prompt set (one shot, single HTML file out), and I score the result 0–10 inside the Agent Operating System. Here's what Hermes MoA shipped on the bench: 42 one-shot demos. All 42 are scored against the field with my honest 0–10 from the source guides at agentos.guide.

Strengths

  • On GoldieBench, the MoA panel's galaxy edged solo Opus 4.8 — 8.6 vs 8.5 — with a denser 24k-particle spiral (the system beats the model)
  • Two gold + one silver across its first three one-shot builds (galaxy, fireworks, arcade)
  • Vendor-agnostic — swap any OpenRouter model into a panel or aggregator slot without touching the workflow

Trade-offs

  • Latency is the panel's slowest draft plus the aggregator pass — ~110–140s per single-file build vs a solo model's one call
  • Costs more per task than any single model (every panel slot + the aggregator are separate calls)
  • Only 3 of 42 bench tasks run so far — a representative slice, not the full board
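The latency and cost trade-offs above reduce to simple arithmetic: the panel runs in parallel, so wall-clock time is the slowest draft plus the aggregator pass, while cost is the sum of every call. A rough sketch follows; the `Call` type and all per-call numbers are made up for illustration and are not measured Hermes or OpenRouter figures.

```python
from dataclasses import dataclass


@dataclass
class Call:
    name: str
    seconds: float  # hypothetical wall-clock time for this call
    dollars: float  # hypothetical price for this call


def moa_latency(panel: list[Call], aggregator: Call) -> float:
    # Panel drafts run in parallel, so only the slowest one counts,
    # then the aggregator runs after all drafts are in.
    return max(c.seconds for c in panel) + aggregator.seconds


def moa_cost(panel: list[Call], aggregator: Call) -> float:
    # Every panel slot and the aggregator are billed as separate calls.
    return sum(c.dollars for c in panel) + aggregator.dollars


# Illustrative numbers only.
panel = [Call("opus-4.8", 70.0, 0.30), Call("gpt-5.5", 85.0, 0.25)]
aggregator = Call("opus-4.8-aggregator", 40.0, 0.20)

total_seconds = moa_latency(panel, aggregator)  # 85 + 40
total_dollars = moa_cost(panel, aggregator)     # 0.30 + 0.25 + 0.20
```

With these assumed inputs the build lands at 125 seconds, inside the ~110–140s range quoted above, and adding a third panel slot would raise cost but only raise latency if that slot were the slowest.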

Best for

  • High-stakes single prompts where ensemble quality beats single-model speed
  • Squeezing frontier-plus output from models you already have while Fable 5 / GPT-5.6 are still in preview
  • Production agents that want a configurable panel + vendor-redundancy on every call

Every demo by Hermes MoA

42 live demos, sorted by category. Click any tile to play the actual one-shot result. Verdicts and 0–10 scores are pulled from the source guides where I posted them publicly.

Arcade 🥇
Game
A polished Neon Breakout with HP bricks, multi-type capsule power-ups (WIDE/SLOW/LIFE), level progression with speed-up, screen shake, flash, lighter-blend particles, starfield, perspective grid, best-score persistence, and full mouse/touch/keyboard control with pause/restart — this edges past Opus 4.8 and Fusion (both 8.5) by combining their juice with deeper systems (HP tiers, power-up variety, level waves) in clean single-file code.
Crypt
Game
A clean, self-contained 2.5D raycaster (Wolfenstein-style) crypt with solid mortar/flicker shading, held torch, embers, minimap, and dual touch+keyboard controls — it plays well and is genuinely atmospheric, but it's a flat maze-escape with no enemies or combat, undershooting the true-3D three.js dungeon crawlers from Fusion (9.0) and MiniMax (8.5). Above generic and clearly beats SOLO Opus (6.0), but the lack of skeletons/combat and the simpler rendering keep it below the WebGL field leaders.
Dogfight 🥈
Game
Polished 2D-canvas dogfight with strong feel — adaptive aim-assist, heat/overheat gun mechanic, dual input (drag+WASD), shake, particles, and a clean HUD that play noticeably better than SOLO Opus (7.5); the catch is it's top-down 2D canvas, not 3D/WebGL like Fusion's 36KB three.js build (9.0) and Grok/MiniMax, so it edges most of the field on craft but doesn't quite out-class the genuine 3D winners.
Doom 🥇
Game
A complete, polished raycaster that nails the Doom screenshot framing — corridor with imps dead ahead, detailed canvas-drawn imp sprites with bob/flash/death states, muzzle flash, screen-shake kick, hit-scan with line-of-sight checks, and a clean DOOM-branded HUD with health/ammo/kills. The DDA casting, z-buffered sprite sorting, brick shading, and damage vignette edge it slightly past SOLO Opus 4.8 and Fusion (8.5) on visual cohesion and code quality, though it lacks pointer-lock mouse-look (drag-turn only).
Dragonflight 🥈
Game
Strong three.js dragon-flight build with a polished neon HUD (rings/speed/streak/fury), a genuinely articulated multi-segment dragon with flapping wings, additive fire-breath particles, fury mode, and three input paths (pointer/keyboard/touch) — visibly richer and more cohesive than SOLO Opus 4.8's leaner 12KB entry. Edges past Grok/MiniMax/Fugu on dragon detail and effect layering, landing just above Fusion's complete retry as the strongest in the field.
Dragonrealm
Game
Rich frozen-realm build with aurora, ruins, frozen lake, FP sword, snow particles and a flying dragon — more atmospheric detail than SOLO Opus (7.0), but the source is visibly truncated mid-mousemove handler, leaving control logic and the render loop unverified, so it can't be credited as a clean-running top finisher against Fusion/MiniMax (9.0).
Game
Game
A genuinely juicy single-file arcade build: physics-driven movement with both mouse/touch and WASD, a shockwave mechanic with cooldown ring, combo/lives/level system, localStorage best score, screen shake, particle bursts, parallax stars, scrolling grid, and ambient glow. It is denser and more polished than SOLO Opus 4.8 (7.0) and trails the 9.0 Fusion/Grok/Fugu tier only on polish-vs-reactivity, landing just under them. Clean, complete, and clearly winning against Opus.
Neonblaster 🥈
Game
Excellent neon shooter with polished synth music (proper scale-based arp + noise drums), boss bullet-hell patterns, screen-shake/flash juice, power-up system, bombs, and robust dual input (pointer + WASD + touch); cleaner and more cohesive than SOLO Opus 4.8 (7.5) and edges out Fusion (9.0) only narrowly — comparable feature depth but slightly less verified breadth, so it lands just below the top of the field.
Neoncity
Game
A polished pure-canvas (no three.js) cyberpunk night-drive with pseudo-3D projection, neon road edges, lit windows, animated signage, rain, a car hood/HUD speedo and steerable boost — genuinely interactive and atmospheric, edging past Opus 4.8/Fusion's flythroughs on playability but lacking GLM's cinematic sun-and-skyline composition that wins this task.
Neonracer
Game
A polished pseudo-3D first-person neon dodger with strong synthwave aesthetics (sun, grid, vapor trails, glow), but it's a lane-dodge endless runner rather than the lap-timer/drift-physics/procedural-track top-down racer that Fusion (8.5) and the rest of the field delivered. Clean and shippable, but it interpreted the brief more loosely and lacks the genuine racing/lap mechanics, so it lands just below Fusion and roughly tied with the 8.0 cluster minus a notch for the off-genre take.
Nordiccrypt
Game
Polished Nordic crypt with strong texture/torch/ember/rune atmosphere, instanced walls, mobile joysticks and pointer-lock — clearly beats SOLO Opus 4.8 (6.0), but the displayed source is cut mid-mousemove handler so the input/movement/collision/animation loop can't be confirmed, and it shows no enemies or boss room. Falls short of Fusion's complete 9.5 and MiniMax/Fugu's verified chasing-enemy builds; scored on visible strength with risk that the loop may be incomplete.
Outrun
Game
Polished pseudo-3D OutRun with curving/cresting road, layered synthwave horizon (sun with scanlines, parallax mountains, skyline, perspective grid), plus gem/traffic gameplay with collision, shake, flash, scoring, and mobile drag — more complete and arcade-rich than Fusion/Grok and on par with Opus 4.8/GLM. Falls just short of clearly beating the 8.5 leaders since the playfield lacks the full arcade dash (RPM/gear dials, named title screen) that GLM/Opus shipped.
Pool 🥇
Game
Polished Canvas2D billiards with full 16-ball physics, substepped collision resolution, pocket-suction zones, scratch respotting, auto-break, particle effects and a clean drag-power cue with predictive line — clearly edges Fusion/Grok (8.0) on physics fidelity and presentation, and far ahead of SOLO Opus 4.8 (5.5) which had no input response. Just shy of a flawless top tier (simple normal-impulse collisions, no friend/foe ball rules), but easily the strongest on this task.
Racing
Game
Polished three.js arcade racer with winding procedural road, banking/heading-aligned obstacles, boost gates, trees, sparks, glassmorphic HUD and touch controls — visually ahead of SOLO Opus (7.0), but it lacks the lap timer and drift mechanic the top field (Fusion 9.0, MiniMax 9.0, Grok 8.5) delivered, and the source is cut off mid-restart handler so completeness is unverified.
Raycaster
Game
Polished neon raycaster with recursive-backtracker maze gen, DDA casting, distance fog + edge shading, animated exit beacon, regenerating mazes, full mobile touch joystick, and an auto-tour idle mode — more feature-complete than SOLO Opus 4.8 (8.0) and edges close to Fusion/Kimi (8.5) but the procedural-texture walls look thinner than Kimi's textbook brick textures, keeping it just shy of the top.
Rpg 🥈
Game
Polished top-down RPG with procedural tilemap, collision, wandering/chasing enemies (slime/bat), chests, loot, leveling with HP scaling, potions, particle bursts, floating combat text, full inventory UI, and proper mobile joystick+buttons — denser and more game-feel-complete than Fusion (8.5) and clearly above SOLO Opus (7.0). Edges out the field's polish though Grok's heavier 35KB density keeps it close at the top.
Skyrim 🥉
Game
Polished single-file three.js Skyrim-lite with vertex-colored terrain, lake basin, instanced pines/rocks, ruins, watchtower, standing-stone ring, custom sky/water shaders, clouds, birds and motes — visually richer than SOLO Opus 4.8 (7.0) and edges out MiniMax (8.5), but the truncated player-control/animation loop leaves it just shy of Fusion's complete 9.0.
Twilightvale
Game
Polished single-file three.js RPG with instanced trees/rocks, procedural terrain+river, weather/day-night, enemies with health bars, slash combat, and mobile controls — clearly beats SOLO Opus 4.8 (7.5) and is shippable. But it's lighter on open-world density than the leaders (Grok 9.5, Fusion 9.5, Fugu 9.0), so it strongly ties/slightly beats Opus rather than topping the field.
Voxelcraft
Game
Strong, polished voxel sandbox with procedural textures, day/night cycle with sun/moon/stars, trees, water, and solid mobile touch controls — clearly beats SOLO Opus 4.8 (8.0) and edges near Fusion (9.0), but the collision section is truncated in the source so I can't fully verify physics; build is flagged COMPLETE (ends with </html>) so I keep it above 5.5 but trim slightly for the unverifiable tail.
Landing
Page
A polished SaaS-style landing (Orbitly) with animated mesh/conic-spin background, glass nav, a genuinely impressive 3D dashboard mockup with animated chart/floating cards, marquee, feature grid, and CTA form — more functional surface area than the Apple-keynote heroes from Opus 4.8/Fusion/GLM. It's denser and more interactive than the field, but the busier maximalist aesthetic is slightly less restrained/premium than the near-tie keynote pages, so it lands just shy of the 9.0 cluster rather than clearly above it.
Webos 🥉
Page
Polished webOS shell with animated starfield wallpaper, topbar+clock, dock and desktop launchers, draggable/resizable/min/max windows with traffic-light controls, autosaving Notes, a DPR-aware resizable Paint with rainbow brush, and a Terminal — the careful pointer-capture and ResizeObserver canvas handling edge it slightly past Fusion/Grok (9.0) in craft, though the source is truncated mid-Paint so full Terminal/Calculator verification isn't possible; rated on the COMPLETE flag.
Blackhole
Sim
Solid geodesic ray-marcher with real per-step bending, a thin disk crossed via plane-intersection, doppler-ish beaming, photon-ring glow, and polished orbit/zoom/slider controls — cleaner and more complete than Grok/GLM/Qwen, but the disk lensing doesn't visibly fold up-and-over the shadow the way Fusion and Opus 4.8 (both 9.0) achieve, so it lands just shy of the top tier.
Boids 🥇
Sim
Spatial-grid boids with clean separation/alignment/cohesion plus predator/beacon pointer modes, scatter burst, live count/speed/vision/separation sliders, and a polished glassmorphic HUD — more interactive and feature-complete than Fugu Ultra (8.5) and well past plain SOLO Opus 4.8 (7.0). The grid optimization, flapping triangle birds, and dual-mode pointer earn it a narrow top spot in this field.
Cloth 🥇
Sim
Richest cloth in the field: full Verlet sim with structural+shear+bend constraints, swappable sphere/box colliders with proper collision response, wind toggle, gust-from-pointer interaction, and grab-on-mesh drag via raycast plane — clearly beats SOLO Opus (3KB) and edges Fusion's solid drape by adding collider shapes and richer lighting/material polish. The tail is cut but the build is flagged COMPLETE (ends in </html>) and the physics/interaction core is fully present, so it earns the top of the pack.
Fluid
Sim
A polished WebGL flow-field with layered vortices, fbm-warped background and additive glowing particles that stir convincingly on drag/tap/Space — visually richer than Opus's clumping particles and well above the generic field, but it's an artistic flow visualizer rather than a real fluid sim like Fusion's stable-fluids build, and it doesn't deliver the convincing 'liquid in a bowl' sloshing that won GLM-5.2 the task. Strong and shippable, just not best-in-field.
Fractal 🥈
Sim
Polished WebGL Mandelbrot+Julia explorer with drag-pan, wheel/pinch zoom, double-tap, autopilot flight to curated seahorse targets, orbit-trap filaments/rings, live coordinate readout, and iteration/palette controls — a more complete feature set than Fusion/Opus 4.8 and rivals Kimi's visual depth thanks to smooth coloring and the trap-based detail. Loses a hair to Kimi only because the auto-flight depth is capped by single-precision (no perturbation), but it clearly beats the rest of the field and edges past SOLO Opus.
Galaxy
Sim
Solid interactive 3D galaxy: 22k-star 5-arm spiral with per-particle swirl animation, glowing core, bg stars, drag/zoom/pinch and a clever mouse-disturbance field that the field's static-frame entries lack. Falls just short of Fusion/Opus 4.8 (8.5) since it relies on additive blending rather than real UnrealBloomPass and the per-frame JS loop over 22k particles is less polished than Fusion's filmic-tonemapped GPU approach.
Orbit
Sim
A genuinely well-crafted live N-body gravity sandbox — spiral-arm seeding, momentum-zeroed COM, softening, sub-stepping, drag-to-launch and a center-of-mass camera all work, with polished glassmorphic UI and trails that read beautifully. It interprets 'orbit' as emergent chaos rather than the labelled-solar-system brief most of the field nailed, so like GLM's nebula it's gorgeous but slightly off-target versus Opus 4.8's clean planet+NEO+sim-clock take; the superior physics rigor edges it past Fusion and ties the strong mid-pack.
Particleforge 🥈
Sim
Polished particle sculptor with smooth gravity/swirl physics, mode toggle, bursts, idle drift, and a nice glowing aesthetic with reticle — clearly beats SOLO Opus 4.8's plain build and edges past Grok/Fugu, but it lacks the multiple preset modes (vortex/attractor/repulsor/magnet) and FPS counter that give Fusion (8.5) its breadth, so it lands just shy of the field leader.
Pathtracer
Sim
A real progressive Monte-Carlo path tracer with diffuse/metal/glass/emissive materials, Russian-roulette termination, ACES tonemapping and ping-pong accumulation — genuinely on par with the field's WebGL renderers (Fusion/Fugu/MiniMax at 8.5), but the cosine-hemisphere sampling for metal fuzz and the slightly hacky diffuse-reflection path keep it a notch below the cleanest 8.5 builds, and it clearly beats SOLO Opus 4.8's 6KB version.
Reactiondiff 🥇
Sim
A polished, correct Gray-Scott implementation with ping-pong FBOs, drag-to-paint, 5 presets, feed/kill/speed/brush sliders, 4 palettes, and pause/reseed shortcuts — a noticeably richer feature set than SOLO Opus 4.8's bare 5KB version and edging past Fusion/MiniMax via the preset+palette polish. The only nit is the UNSIGNED_BYTE precision (vs float textures) and clamped initial seeding, but patterns evolve cleanly and inputs all respond.
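For readers unfamiliar with the technique named in this verdict, here is a minimal CPU sketch of one Gray-Scott update. It is not the demo's shader code: the function names, the diffusion rates, and the wrap-around square grid are all assumptions for illustration. Reading from the old grids and writing into fresh ones plays the same role as the ping-pong FBOs mentioned above.

```python
def laplacian(grid: list[list[float]], x: int, y: int) -> float:
    """4-neighbour Laplacian on a square grid with wrap-around edges."""
    n = len(grid)
    return (grid[(y - 1) % n][x] + grid[(y + 1) % n][x]
            + grid[y][(x - 1) % n] + grid[y][(x + 1) % n]
            - 4.0 * grid[y][x])


def gray_scott_step(u, v, feed, kill, du=0.16, dv=0.08, dt=1.0):
    """Advance both chemical fields by one time step.

    du/dt = Du * lap(u) - u*v^2 + feed * (1 - u)
    dv/dt = Dv * lap(v) + u*v^2 - (feed + kill) * v
    The default du, dv and dt are assumed values, not taken from the demo.
    """
    n = len(u)
    new_u = [row[:] for row in u]  # write targets, separate from the inputs
    new_v = [row[:] for row in v]
    for y in range(n):
        for x in range(n):
            uvv = u[y][x] * v[y][x] * v[y][x]
            new_u[y][x] = u[y][x] + dt * (du * laplacian(u, x, y) - uvv + feed * (1.0 - u[y][x]))
            new_v[y][x] = v[y][x] + dt * (dv * laplacian(v, x, y) + uvv - (feed + kill) * v[y][x])
    return new_u, new_v
```

In this sketch the feed and kill sliders from the demo correspond to the `feed` and `kill` arguments, and a drag-to-paint brush would simply write nonzero values into `v` before the next step.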
Solar 🥉
Sim
Genuinely the most astronomically accurate solar attempt on the bench — real J2000 Keplerian elements with eccentricity/inclination, √AU compression, proper orbit solving, banded gas-giant textures, dual rings, asteroid belt, and a polished glass UI with focus/info cards; edges out Opus 4.8 and GLM on physics rigor but I can't verify the truncated pointer/render loop closes cleanly, so it lands just shy of Fusion/Fugu's confirmed-complete 9.0.
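"Proper orbit solving" with eccentric orbits usually means solving Kepler's equation, M = E − e·sin(E), for the eccentric anomaly E. The sketch below shows one common way to do that with Newton's method. It is an illustration under assumptions, not the demo's source: the function names are hypothetical, and `compress_radius` is only one plausible reading of the "√AU compression" mentioned above.

```python
import math


def solve_kepler(mean_anomaly: float, eccentricity: float, tol: float = 1e-12) -> float:
    """Solve M = E - e*sin(E) for E (elliptical orbits, 0 <= e < 1)."""
    # Common starting guess: M itself, or pi for very eccentric orbits.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(50):
        f = E - eccentricity * math.sin(E) - mean_anomaly
        f_prime = 1.0 - eccentricity * math.cos(E)
        delta = f / f_prime
        E -= delta
        if abs(delta) < tol:
            break
    return E


def orbit_position(semi_major_au: float, eccentricity: float, mean_anomaly: float):
    """Position in the orbital plane (AU), with the Sun at the focus."""
    E = solve_kepler(mean_anomaly, eccentricity)
    x = semi_major_au * (math.cos(E) - eccentricity)
    y = semi_major_au * math.sqrt(1.0 - eccentricity ** 2) * math.sin(E)
    return x, y


def compress_radius(distance_au: float) -> float:
    """Assumed display mapping: square-root scaling so outer planets fit on screen."""
    return math.sqrt(distance_au)
```

Under the assumed square-root mapping, a planet at 30 AU would be drawn about 5.5 units from the Sun instead of 30, which keeps the inner planets visible in the same view.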
Wormhole 🥇
Sim
A genuinely polished Three.js wormhole: curved spline tunnel path (not just a straight tube), additive wireframe rings, particles, speed streaks, glow sprites, FOV-warp on boost, and clean pointer-steer + wheel-speed + hold-to-boost controls with HSL color cycling and a vignette. Edges out Fusion and Grok (8.5) on the curving tunnel path and multi-layer depth detail, clearly above SOLO Opus 4.8 (7.5).
Aurora 🥇
Visual
The richest aurora build in the field: layered ribbons with composite-lit gradients, vertical light rays, twinkling stars, a lake reflection (mirrored aurora + ripple shimmer), layered mountain silhouettes, occasional meteors, and smooth pointer-steering with color-shift on click. Clearly beats SOLO Opus 4.8 (6.0, no input) and edges past Fugu Ultra (8.0) on detail and interactivity — the reflection and ray work are the differentiators.
Fireworks 🥉
Visual
Exceptionally polished single-file fireworks: multiple burst shapes (peony/ring/willow/palm/heart), rocket trails with twinkle physics, parallax skyline + moon + stars, auto/finale/SPACE volley/drag-barrage controls, and DPR-aware canvas. Visually richer than SOLO Opus (7.0) and edges out Grok; only misses Fusion's synthesized audio (whoosh+boom), keeping it just shy of the 9.0 leaders.
Lavalamp 🥇
Visual
Polished, complete metaball lava lamp with a real lamp-shaped vessel (caps, rounded glass, vignette), interactive pointer-stir physics that displace blobs, and click-to-shift palette via cosine gradient — clearly more finished and interactive than Opus 4.8's bare 3KB shader and Fusion's simpler warm-gradient blobs. Falls just short of a decisive win since the visual is on-brief but not breathtaking, but it edges the field on interactivity and presentation.
Matrix 🥇
Visual
Goes well beyond the field's classic rain with mouse-bending displacement fields, palette-cycling color schemes, expanding glyph ring bursts, glitch-animated title, scanline/vignette overlays, and pause control — a genuinely richer, polished build that edges out Fusion (8.0) and clearly beats SOLO Opus (7.0). Falls just short of a top tier because the interactive bend/ring gimmicks slightly muddy the iconic pure-rain aesthetic.
Plasma 🥇
Visual
Clean WebGL plasma with 5 cosine palettes, click/drag ripples (px→GL flip done right), keyboard cycling, auto-demo ripples, and a polished glassmorphic UI with vignette — edges out Fusion by combining its palette/ripple feature set with tighter shader work and lower weight, and clearly beats SOLO Opus's barebones 5KB build.
Synthwave 🥈
Visual
A polished pure-canvas synthwave scene with a proper banded scanline sun, layered mountains, neon perspective grid with hyperdrive boost, bezier palm silhouettes, twinkling parallax stars, and CRT scanline/vignette post — richer and more atmospheric than Opus 4.8/Fusion's three.js flythroughs, and edges ahead on completeness though it doesn't quite top GLM-5.2's single best frame.
Terrain 🥈
Visual
This MoA build breaks from the Tron-grid pack with a naturalistic biome approach — seeded fbm noise, height-based color zones, instanced trees with slope-aware placement, animated water/clouds, and a polished HUD with both auto-pilot and full manual flight. It's clearly more complete and feature-rich than Fusion (8.0) or SOLO Opus (7.0), edging close to Fugu Mini (8.5) but landing just shy of leading since the naturalistic look is less striking than the strong-motion Tron entries that won the field's smoke tests.
Voxel
Visual
Solid, clean voxel-art landscape generator with proper greedy face culling, perlin/fbm island terrain, water, trees, clouds and orbit controls — genuinely well-engineered, but it's a static scenic diorama rather than the interactive Temple-Run runner that Fusion/Fugu/GLM (9.0) and Opus 4.8 (8.5) shipped, so it loses on ambition against the field. Polished and shippable but generic for the task.
Waves 🥇
Visual
Solid Gerstner ocean on three.js with real CPU displacement + recomputed normals, plus extras the field mostly lacks: live wave/wind/choppiness sliders, three switchable moods (sky/fog/sun recolor), and an interactive pointer-ripple that raycasts onto the surface. Edges out Opus 4.8 (6KB, no UI) and matches/slightly beats Fusion and the Fugu tier on feature depth, though CPU-side per-vertex waves are less crisp than a GPU vertex-shader ocean.
The same stack Julian uses

Run this stack yourself.

Every demo on this bench was built inside the Agent Operating System — one prompt, one shot, single HTML file out. The Agent OS, the prompts, the templates, the weekly walkthroughs and 3,600+ founders shipping with it every day all live inside the AI Profit Boardroom.

3,600+ founders
258 documented wins
38 countries
$59/mo (monthly)