InternScience (open-weights, runs local)

Agents-A1

The agent-tuned open MoE — built for long-horizon tool work, running free on your Mac.

Context: 262,144 tokens

Pricing: Free · runs locally

Tasks tested: 45

Avg score: 4.83/10

Medals: 🥇 3 · 🥈 0 · 🥉 0

Release: 2026-07

Official site: internscience.github.io/Agents-A1

Official vendor source

Agents-A1 is built by InternScience (open-weights, runs local) — see the vendor's own product page, pricing, and docs at internscience.github.io/Agents-A1.

What is Agents-A1?

Agents-A1 is InternScience's open-weights frontier model that runs locally, with a 262,144-token context window, released 2026-07. Tagline: the agent-tuned open MoE, built for long-horizon tool work, running free on your Mac. Official source: internscience.github.io/Agents-A1.

Pricing detail. Free to run locally. Agents-A1 is InternScience's agent-tuned Qwen3.5-MoE: 35B total parameters with ~3B active per token, trained via three-stage multi-teacher distillation for long-horizon search, engineering, and tool calling. The official Q4_K_M GGUF (21GB) runs fully offline on a 36GB Mac via Ollama.
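The sizing figures above can be sanity-checked with a little arithmetic. This is a rough sketch, not a measurement: it assumes "GB" means 10^9 bytes and that the GGUF file is almost entirely weights, and it ignores KV cache and OS overhead, which take additional memory at run time.

```python
# Back-of-envelope sizing from the numbers quoted in the text.
# Assumptions (not from the vendor): GB = 10**9 bytes, file is nearly all weights.

TOTAL_PARAMS = 35e9    # 35B total parameters
ACTIVE_PARAMS = 3e9    # ~3B active per token
GGUF_BYTES = 21e9      # 21GB Q4_K_M file
MAC_RAM_BYTES = 36e9   # 36GB Mac


def bits_per_weight(file_bytes: float, params: float) -> float:
    """Average storage cost per parameter, in bits."""
    return file_bytes * 8 / params


def fraction(part: float, whole: float) -> float:
    """Plain ratio, used for RAM share and active-parameter share."""
    return part / whole


if __name__ == "__main__":
    print(f"{bits_per_weight(GGUF_BYTES, TOTAL_PARAMS):.2f} bits/weight")    # 4.80
    print(f"{fraction(GGUF_BYTES, MAC_RAM_BYTES):.0%} of RAM for weights")   # 58%
    print(f"{fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.0%} of params active")   # 9%
```

Roughly 4.8 bits per weight is in the range expected of a 4-bit K-quant, and the weights alone take a bit over half of a 36GB machine before any context is loaded.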

How I use it inside the Agent OS. I benched it one-shot on all 42 Goldie Bench builds the week it dropped; it is a candidate brain for local Hermes agent loops.

What I built with Agents-A1

Every model on Goldie Bench gets the same fixed prompt set (one shot, single HTML file out), and I score the result 0–10 inside the Agent Operating System. Here's what Agents-A1 shipped on the bench: 45 one-shot demos from a model with a 262,144-token context window. All 45 are scored against the field with my honest 0–10 from the source guides at agentos.guide.
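For readers who want to try a similar one-shot run locally, here is a minimal sketch. It is not the harness used for this bench, which isn't shown here. It assumes Ollama is serving on its default local port and uses its `/api/generate` endpoint with streaming off; the model tag `agents-a1:q4_k_m` and the helper names are hypothetical placeholders.

```python
import json
import re
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "agents-a1:q4_k_m"  # hypothetical tag; use the name `ollama list` reports

# Matches the first complete HTML document in a reply, with or without a doctype.
_HTML_RE = re.compile(
    r"(<!DOCTYPE html.*?</html>|<html.*?</html>)",
    re.IGNORECASE | re.DOTALL,
)


def build_payload(prompt: str, model: str = MODEL_TAG) -> dict:
    """One prompt, one non-streamed completion."""
    return {"model": model, "prompt": prompt, "stream": False}


def extract_html(reply: str) -> Optional[str]:
    """Return the single HTML file from the model's reply, or None if absent."""
    match = _HTML_RE.search(reply)
    return match.group(1) if match else None


def run_one_shot(prompt: str) -> Optional[str]:
    """Send the prompt once and keep only the HTML document (needs a running Ollama)."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())["response"]
    return extract_html(reply)
```

Saving the returned string to a `.html` file and opening it in a browser gives the same kind of artifact the verdicts below describe; a `None` result means the reply contained no complete HTML document.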

Strengths

Agent-tuned: claims SOTA on Seal-0 long-horizon search (56.36), IFBench instruction following (80.61) and BrowseComp in its class (75.51)
35B-class knowledge at ~3B-active speed (MoE) — 256K context, official GGUF on day one
Runs fully local + free on a consumer Mac via Ollama

Trade-offs

Agent tuning is aimed at search/tools/science — one-shot visual builds are not its headline lane
21GB Q4 build wants most of a 36GB Mac to itself

Best for

Local agentic loops + tool calling
Long-horizon research tasks
Free offline agent work

Every demo by Agents-A1

45 live demos, sorted by category. Click any tile to play the actual one-shot result. Verdicts and 0–10 scores are pulled from the source guides where I posted them publicly.

Clean, on-theme menu with good red/dark crypt vibe and functional 3D dungeon logic underneath, but the screenshot only shows the start overlay so the actual torch-lit gameplay isn't visible; the enemy respawn-to-one-spot bug and unremarkable geometry keep it from competing with the best.

Renders a clean, atmospheric icy scene with good title/UI polish and low-poly rocks, but the 'trees' are just tiny floating trunks (leaves misplaced), terrain is flat, and it lacks the dragon/depth and immersion of the top builds. Solid but generic Skyrim-lite.

HUD, health bar, and minimap render, but the main 3D raycast view is completely empty — just flat ceiling/floor colors with no walls or monsters visible, so the core Doom experience is broken. The buggy raycasting math (mixing simplified projection with DDA) clearly fails to render any geometry.

Renders cleanly with a polished neon synthwave aesthetic, a well-modeled glowing car, HUD, and an approaching obstacle — clearly on-brief and functional. But the environment is thin: a flat grid floor with no track walls/edges and only a single lane-less obstacle, making it feel more generic than the field's best 3D tracks.

Renders a genuine 3D scene with a title screen, HUD, compass and controls hint, and the source shows real terrain/tree/rock/ruin generation with first-person controls — but the screenshot is very dark, dimly lit and shows only a couple of trees, reading as underwhelming and generic rather than a polished open-world explorer.

Screenshot shows only the black loading screen with 'LOADING TWILIGHT VALE...' — the game world, HUD, and 3D scene never rendered, so it reads as non-rendering despite reasonable-looking source. The loader either failed to hide or the scene crashed before display.

Renders a clean, on-brief voxel world with proper terrain, trees, hotbar, crosshair, day/night clock and full control scheme: a solid, shippable Minecraft-style sandbox. There is a typo (MeshLambantMaterial) in unused code, and the visuals are competent but slightly less polished than the field's best, keeping it just below winner tier.

Only the HUD and controls hint rendered — the entire menu screen (title, START button) and Three.js canvas are missing, likely a CDN/init failure, leaving a near-blank black page that never shows the game or menu.

Renders a polished retro HUD with scanlines, HP/shield bars, and a stylish start screen, and the 3D player ship shows real geometry, but the ship overlaps awkwardly with the ENGAGE button/controls text and the title screen looks cluttered rather than clean. Solid three.js foundation but the visual composition and lack of visible combat action keep it middle-of-field.

Renders a polished neon start screen with clear branding, fury meter, score and full control hints, but the screenshot only shows the menu—the dragon and gameplay aren't visible, and the Three.js scene relies on simple primitives that fall short of the field's best. Solid, shippable-looking shell but not a task winner.

Clean, polished neon start screen with a coherent 3D endless-runner (starfield, grid floor, glowing player cube, collisions, HP/score, mobile support) that clearly renders and plays. Solid and shippable but mechanically generic and thin—no depth progression, sound, or standout twist to top the field's 9.0.

Clean, on-brief neon start screen with solid chromatic title, HUD (score/wave/health/enemy count), synth audio engine and full 3D game loop in source, but the screenshot only shows the menu so gameplay juice, screen-shake and bosses can't be visually verified — competent and shippable rather than a task-topper.

Renders but looks broken and near-empty: only one wireframe building visible, a dark blob for the car, and no visible neon grid or dense cityscape — the shared buildingGeo.scale bug compounds sizes and the scene reads as mostly black. UI/HUD is clean but the core neon-city drive experience isn't delivered.

Clean vaporwave grid, glowing neon HUD and title render well with a solid start screen and cyan ship visible. It's polished and on-brief but generic — the ship overlaps the button awkwardly and vapor-trail particles aren't showcased in the static view.

The 'Nordic Crypt' title heading is missing from the overlay; only the intro text and button render. A fatal JS error is evident: the source contains `const torchLight;` and `const arch...` reassignments that throw, so the game never initializes and clicking Enter shows nothing. Only a static, largely blank menu displays; the actual 3D dungeon never runs.

The screenshot shows an almost entirely black scene with no visible synthwave sun, grid road, or horizon — just a barely-visible dark car shape and a 'CRASHED!' overlay after 0.6s, indicating the game renders essentially nothing of the promised pseudo-3D road and dies instantly. The HUD/overlay styling is clean but the core visual is broken and off-brief.

Renders a recognizable 3D pool table with racked triangle, cue ball, six pockets, and clean HUD, but the felt looks dim/muddy with odd shadow banding and the shot direction mapping is admittedly hacky, leaving it functional-but-generic rather than a task winner.

Screenshot shows only a start-screen 'ENTER MAZE' button with no visible maze, and the source reveals a confused, likely-broken raycaster: the GLSL uniform int mapData is never populated from JS, the while-loop bounds are risky, and reasoning about a mixed 2D/GPU approach is left in comments. Even behind the gate the actual raycasting is unlikely to render correctly.

The screenshot shows only a start-screen gate ('ECHOES OF THE VOID' + ENTER WORLD), so the judged render is a title card rather than the actual RPG gameplay; the source implies a competent 3D top-down world with HP/XP HUD, inventory, combat, and mobile controls, but none of that is visible on-screen. Clean but unproven against the brief.

Matrixrain 🥇

The screenshot shows only the UI overlay text on a near-black background with no visible rain — the core matrix effect fails to render, likely because thin instanced LineBasicMaterial segments with sway/scale-on-z produce near-invisible geometry against the dark fog. A classic matrix rain should be dense falling glyph columns, not sparse 3D lines, and here even those aren't showing.

Mlx Speedtest 🥇

Renders a polished 3D core with orbiting rings, particle starfield, and neon UI that fits the speedtest theme well, but the results dashboard isn't visible mid-run and it's a simulated (not real) test, leaving it a solid, attractive but generic take.

Renders cleanly with nice neon aesthetic — cyan header, magenta score, perspective 3D grid floor, and a polished Game Over panel — but the screenshot shows the game already in the Game Over state with no visible snake or food, suggesting a startup collision or immediate death bug that undermines the playable experience.

Renders cleanly with a polished 3D icosahedron, particle field, glass cards and solid typography — a strong, shippable hero that clearly delivers on the modern marketing brief. Falls just short of the top tier since it's essentially hero-only with no additional sections and a fairly familiar dark-glassmorphism aesthetic.

Renders a polished 3D starfield desktop with top bar, dock and app icons, and the source shows working Notes/Paint/Terminal windows with dragging and commands—but the 3D primitives feel more like a demo than a desktop, and no window is shown open in the shot, leaving it functional-but-generic rather than a task winner.

The fragment shader calls a non-existent `rotate()` function (never defined) which would cause a GLSL compile error, killing the lensing plane — the screenshot confirms this with no starfield, no accretion disk, and only a barely-visible dark sphere on a black background. Despite reasonable code intent, the actual render is essentially broken and far from the visualisation brief.

The UI panel renders cleanly but the simulation is effectively broken on screen — only a single faint boid is visible, indicating the 300 boids collapsed to a point (cohesion/separation logic clumps them together off-view). Not a functioning flocking display despite decent code structure.

The cloth mesh is broken — the wireframe grid stays a flat detached plane behind the sphere while only a thin crumpled red strip drapes, so it reads as a physics glitch rather than fabric draping over the object. UI and sphere render cleanly but the core cloth sim clearly fails visually.

The UI chrome (title, controls, hint) renders cleanly but the core fluid simulation is completely absent — the canvas is empty black, indicating the particle system failed to render (likely the ShaderMaterial using a custom 'attribute vec3 position' which conflicts with Three.js's built-in position attribute, breaking the shader).

The fractal itself does not render — only the UI overlay and stats show over an empty dark canvas, so the core feature is entirely absent despite FPS reporting 60. The shader likely fails silently (or the Mandelbrot escapes off-screen due to the uv/zoom mapping), making this effectively non-functional.

The galaxy particles fail to render — only the UI overlay is visible against a black background, so the core deliverable is entirely missing. Likely the ShaderMaterial requires vertexColors enabled (color attribute unused) causing no visible points.

Only the UI overlay renders — the 3D scene is entirely black with no bodies, stars, or trails visible, and FPS reads 0, indicating the render loop never ran (source is truncated mid-mouse-handler, likely never calling animate()). Despite reasonable simulation code, the actual output is non-rendering for the core feature.

Renders a clean, colorful 25k-particle field with additive glow, gradient title and polished UI, clearly on-brief and shippable. But the static screenshot shows a scattered cloud with no visible swirling/gravity clustering, so it reads generic rather than a standout forge experience.

Only the UI overlay renders (title, hints, sample counter incrementing to 16) but the actual path-traced canvas shows nothing but a dark quad — no spheres, floor, or lighting visible, and the shader has broken code (undefined sinTheta/PI, first-hit-not-nearest logic). Essentially non-rendering as a renderer despite the chrome working.

The simulation canvas is entirely black — no Turing pattern renders at all, leaving only the title and control chrome; the ping-pong render loop clearly never draws output despite a plausible Gray-Scott shader. A non-rendering build on the core deliverable.

Renders cleanly with a convincing glowing sun, all 8 planets with distinct colors, Saturn's rings, tilted elliptical orbit lines, and a starfield — a polished, on-brief 3D system. Solid and shippable but fairly conventional (approximated scale, basic click-info, no orbit controls library visible), landing just short of the top tier.

The title/UI overlay renders cleanly with a nice glow, but the actual wormhole tunnel is essentially invisible — only scattered star points show. The RingGeometry rings are rotated flat (rotation.x = PI/2) so they face edge-on to the camera and don't form a visible tunnel, badly missing the core 'tunnel flythrough' brief.

The stars and UI title render, but the actual aurora — the entire point of the brief — is invisible; the shader outputs vec4(vUv,1.0,1.0) which produces near-black colors and the aurora system isn't appearing at all. Effectively broken on-brief.

Screenshot shows an empty black canvas with only the UI overlay — no fireworks rendered, and stats read Particles:0/FPS:0, suggesting the animation isn't producing visible particles at capture time. The gradient title is styled nicely but the core interactive display fails to demonstrate any working effect.

Only the UI overlay text renders on a dark background — no blobs are visible, almost certainly because the shader contains an invalid GLSL call (half(p * o)) that fails to compile, killing the render. Effectively non-rendering for the core task.

The title/tagline render nicely, but the actual matrix rain is broken — only a handful of giant blurry, stretched glyph-planes float around instead of dense falling columns of crisp characters. The Three.js approach with too-few drops and no column density completely fails the core brief.

The shader plane renders black — the u_palette uniform is uploaded as raw 0-255 RGB arrays (not normalized vec3, nor proper Vector3/Color instances), so the plasma effect fails and only the UI chrome shows. Controls and palette swatches look clean, but the core hypnotic plasma is entirely missing.

Renders with a nice neon title, glowing sun, and stars, but the composition is broken: the sun sits below the horizon in dead black space, and the grid is a thin compressed strip near the top rather than a proper perspective floor stretching to the horizon. On-brief in vibe but flawed execution well below the field's best.

Renders a lit terrain mesh with clean UI and working stats overlay, but the result is a small, flat, undersized greenish blob with no convincing mountain/valley detail or exploration feel — noise amplitude and camera framing are far too weak to sell a 3D terrain explorer.

Renders a convincing Minecraft-style voxel terrain with proper lighting, shadows, fog and clean UI, but the palette is monochromatic green (little dirt/stone/water visibility), the noise is jittery from Math.random, and it lacks biome variety or trees that would set it apart from stronger entries.

The ocean mesh is not visible at all—only the UI overlay (title, stats showing 16641 vertices, controls hint) renders while the wave surface is invisible, likely due to camera framing/lighting issues despite the code running. A functional-looking build on paper but a broken render in practice.
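Two of the diagnoses in the verdicts above come down to simple arithmetic, sketched here in Python rather than in the builds' own JavaScript. The sketch assumes the usual Three.js conventions (a ring lying in the XY plane with its normal along +Z, and a camera flying down -Z) and assumes the shader expects color channels in the 0 to 1 range. It does not reproduce either build's source.

```python
import math


def rotate_x(v, angle):
    """Rotate a 3D vector about the X axis (right-handed convention)."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)


def facing(normal, view_dir):
    """|cos| of the angle between a surface normal and the view direction.

    1.0 means the surface is face-on to the camera, 0.0 means edge-on.
    Both inputs are expected to be unit vectors.
    """
    return abs(sum(n * d for n, d in zip(normal, view_dir)))


def normalize_rgb(rgb):
    """Convert 0-255 channel values to the 0-1 floats a vec3 color expects."""
    return tuple(channel / 255 for channel in rgb)


RING_NORMAL = (0.0, 0.0, 1.0)  # assumed: ring in the XY plane
VIEW_DIR = (0.0, 0.0, -1.0)    # assumed: camera travelling down -Z

if __name__ == "__main__":
    # Unrotated ring: face-on, so a stack of them along Z reads as a tunnel.
    print(facing(RING_NORMAL, VIEW_DIR))
    # After rotation.x = PI/2 the ring is edge-on and nearly invisible.
    print(facing(rotate_x(RING_NORMAL, math.pi / 2), VIEW_DIR))
    # A raw 0-255 palette entry rescaled for upload as a uniform.
    print(normalize_rgb((255, 0, 128)))
```

Under these assumptions the wormhole rings would form a visible tunnel with no X rotation at all, and the plasma palette would need each channel divided by 255 before upload.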

Compare Agents-A1 against every other model

Every head-to-head featuring Agents-A1. Verdicts shown for scored pairs.

Agents-A1 vs Fusion

Fusion leads 42–0

Agents-A1 vs Hermes MoA

Hermes MoA leads 42–0

Agents-A1 vs Claude Fable 5

Claude Fable 5 leads 41–1

Agents-A1 vs Grok

Grok leads 38–0

Agents-A1 vs MiniMax M3

MiniMax M3 leads 40–2

Agents-A1 vs Fugu Ultra

Fugu Ultra leads 39–3

Agents-A1 vs GLM-5.2

GLM-5.2 leads 41–1

Agents-A1 vs Fugu Mini

Fugu Mini leads 33–3

Agents-A1 vs Opus 4.8

Opus 4.8 leads 34–8

Agents-A1 vs Kimi K2.7

Kimi K2.7 leads 18–2

Agents-A1 vs Claude Sonnet 5

Claude Sonnet 5 leads 35–6

Agents-A1 vs Qwable 5 27B Coder

Qwable 5 27B Coder leads 32–8

Agents-A1 vs Qwen 3.7

Qwen 3.7 leads 32–9

Agents-A1 vs Gemma 4 12B · MLX

Agents-A1 leads 20–18

Agents-A1 vs Laguna XS 2.1

Agents-A1 leads 28–12

Agents-A1 vs Qwythos 9B

Agents-A1 leads 34–5

Agents-A1 vs LongCat-2.0

LongCat-2.0 leads 3–1

Agents-A1 vs Gemma-4 12B Coder

Agents-A1 leads 4–2

Agents-A1 vs Kimi K2.7 · Fast

42 shared tasks · unscored

Agents-A1 vs Kimi K2.7 · No-Think

42 shared tasks · unscored

Agents-A1 vs Kimi K2.7 · Quality

42 shared tasks · unscored

Agents-A1 vs Ornith 1.0

42 shared tasks · unscored

Agents-A1 vs Claude Mythos 5

Agents-A1 vs Kilo Code

The same stack Julian uses

Run this stack yourself.

Every demo on this bench was built inside the Agent Operating System — one prompt, one shot, single HTML file out. The Agent OS, the prompts, the templates, the weekly walkthroughs and 3,600+ founders shipping with it every day all live inside the AI Profit Boardroom.

3,600+ founders

258 documented wins

38 countries

$59/mo · monthly
