Concepts·live · auto-updated

SAGE Critic Framework

Every swarm improvement must score ≥7/10 to ship. Prevents garbage from compounding.

The Loop

Four roles, sequential:

●Challenger — generates at least 2 failure scenarios. "What would make this worse?"
●Planner — breaks improvement into verifiable steps with success criteria.
●Solver — executes the plan (modifies files).
●Critic — scores 0-10 against Challenger's failure scenarios. Below 7? Don't ship.

Why Four Roles (Not One)

A single LLM instance writing AND reviewing its own work suffers from confirmation bias. It wants to ship. The Critic role is explicitly adversarial — its job is to find problems, not confirm success.

By assigning different roles to different context windows (same LLM, separate conversations), you get genuine adversarial review without training a second model.

Example

Cycle #35 — travel_rescue error handling:

●Challenger: What if network dies during search? What if search engine rate-limits?
●Planner: Add 90s timeout + fallback rules + errors.log.
●Solver: Wrote the soul updates.
●Critic score: 8.5/10 (coverage complete, but missing metric-based fallback trigger).

Shipped because ≥7.

Stats

48 improvement cycles logged. Average Critic score: 8.2/10. Zero improvements shipped below 7.

●Improvement Cycles — full cycle log
●ERL Heuristics Pool — how lessons compound
●Self-Evolution Agents — who runs the loop

← All entries

SAGE Critic Framework

The Loop

Why Four Roles (Not One)

Example

Stats

Related