Hermes · Mixture of Agents

Hermes MoA

A panel of frontier models, merged by a chair. The model doesn't matter — the system does.

Context: Varies (the sum of the panel models' contexts, Opus 4.8 + GPT-5.5)

Pricing: Panel + aggregator calls (via OpenRouter)

Tasks tested: 42

Avg score: 8.38/10

Medals: 🥇 12 · 🥈 8 · 🥉 4

Release: 2026-06-28

Official site: Hermes Agent OS

Official vendor source

Hermes MoA is built by Hermes · Mixture of Agents. See the vendor's own product page, pricing, and docs at Hermes Agent OS.


What is Hermes MoA?

Hermes MoA is the Mixture of Agents system from Hermes · Mixture of Agents, released 2026-06-28. It is a panel of frontier models rather than a single model, so its context window varies: it is listed as the sum of the panel models' contexts (Opus 4.8 + GPT-5.5 by default). Tagline: "A panel of frontier models, merged by a chair. The model doesn't matter — the system does." Official source: Hermes Agent OS.

Pricing detail. Hermes Mixture of Agents dispatches one prompt to a configurable panel of frontier models in parallel, then a named aggregator reads every draft and writes one better final answer. Default panel: Claude Opus 4.8 + GPT-5.5, aggregated by Opus 4.8 — all via the OpenRouter key. Unlike a black-box ensemble, every slot is yours to swap from the Mixture tab in the Agent OS.
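The fan-out-then-merge flow described above can be sketched in a few lines. This is a minimal illustration, not the Agent OS implementation: `MixtureConfig`, `run_mixture`, `fake_call`, the model name strings, and the merge prompt wording are all hypothetical, and `fake_call` stands in for whatever client actually talks to OpenRouter.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable

# Hypothetical signature: (model_name, prompt) -> completion text.
ModelCall = Callable[[str, str], Awaitable[str]]


@dataclass
class MixtureConfig:
    # Placeholder names mirroring the default panel; not real model IDs.
    panel: list[str] = field(default_factory=lambda: ["opus-4.8", "gpt-5.5"])
    aggregator: str = "opus-4.8"


async def run_mixture(prompt: str, config: MixtureConfig, call_model: ModelCall) -> str:
    # Send the same prompt to every panel slot in parallel.
    drafts = await asyncio.gather(*(call_model(m, prompt) for m in config.panel))
    # Label each draft so the aggregator can tell them apart.
    labelled = "\n\n".join(
        f"Draft {i + 1} ({model}):\n{draft}"
        for i, (model, draft) in enumerate(zip(config.panel, drafts))
    )
    merge_prompt = (
        f"Task:\n{prompt}\n\n{labelled}\n\n"
        "Write one final answer that merges the best of every draft."
    )
    # One extra call: the aggregator reads all drafts and writes the final answer.
    return await call_model(config.aggregator, merge_prompt)


async def fake_call(model: str, prompt: str) -> str:
    # Stand-in for a real API call (assumption: replace with an actual client).
    await asyncio.sleep(0)
    return f"[{model}] answered a {len(prompt)}-character prompt"


if __name__ == "__main__":
    print(asyncio.run(run_mixture("Build a galaxy demo", MixtureConfig(), fake_call)))
```

Swapping a slot in this sketch means changing a string in `MixtureConfig`; the dispatch and merge logic stays the same, which is the property the Mixture tab is described as exposing.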

How I use it inside the Agent OS. I run it from the Mixture tab in the Hermes Agent OS. On this bench the panel built each demo and the aggregator merged the best of every draft.

What I built with Hermes MoA

Every model on Goldie Bench gets the same fixed prompt set (one shot, single HTML file out), and I score the result 0–10 inside the Agent Operating System. Here's what Hermes MoA shipped on the bench: 42 one-shot demos, with a context window that varies by panel (the sum of the panel models' contexts, Opus 4.8 + GPT-5.5). Of those, 42 are scored against the field with my honest 0–10 from the source guides at agentos.guide.

Strengths

On Goldie Bench, the MoA panel's galaxy edged solo Opus 4.8, 8.6 vs 8.5, with a denser 24k-particle spiral (the system beats the model)
Two gold + one silver across its first three one-shot builds (galaxy, fireworks, arcade)
Vendor-agnostic — swap any OpenRouter model into a panel or aggregator slot without touching the workflow

Trade-offs

Latency is the panel's slowest draft plus the aggregator pass — ~110–140s per single-file build vs a solo model's one call
Costs more per task than any single model (every panel slot + the aggregator are separate calls)
Only 3 of 42 bench tasks run so far — a representative slice, not the full board
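The first two trade-offs reduce to simple arithmetic: panel drafts run in parallel, so latency is the slowest draft plus the aggregator pass, while cost is the sum of every call. The sketch below works through that with made-up numbers; `CallStats`, both helper functions, and every timing and price are illustrative assumptions, not measurements from the bench.

```python
from dataclasses import dataclass


@dataclass
class CallStats:
    seconds: float   # wall-clock time for one model call
    cost_cents: int  # price of that call, in cents


def mixture_latency(panel: list[CallStats], aggregator: CallStats) -> float:
    # Drafts run in parallel, so only the slowest one counts; the aggregator runs after.
    return max(call.seconds for call in panel) + aggregator.seconds


def mixture_cost_cents(panel: list[CallStats], aggregator: CallStats) -> int:
    # Every panel slot and the aggregator are billed as separate calls.
    return sum(call.cost_cents for call in panel) + aggregator.cost_cents


# Illustrative values only (assumed, not measured).
panel = [CallStats(seconds=70.0, cost_cents=40), CallStats(seconds=55.0, cost_cents=30)]
aggregator = CallStats(seconds=50.0, cost_cents=35)

print(mixture_latency(panel, aggregator))     # 120.0
print(mixture_cost_cents(panel, aggregator))  # 105
```

Adding a third panel slot in this model raises cost by that slot's price, but it raises latency only if the new slot is slower than the current slowest draft.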

Best for

High-stakes single prompts where ensemble quality beats single-model speed
Squeezing frontier-plus output from models you already have while Fable 5 / GPT-5.6 are still in preview
Production agents that want a configurable panel + vendor-redundancy on every call

Every demo by Hermes MoA

42 live demos, sorted by category. Click any tile to play the actual one-shot result. Verdicts and 0–10 scores are pulled from the source guides where I posted them publicly.