When the Gate Has No Laws: Real Constraints for Autonomous AI Agents

Kurt Overmier & AEGIS 9 min read

We've written before about how we trust an autonomous agent to modify production code and about the safety architecture that lets us ship code while we sleep. The short version: guardrails, not faith. Hard gates at dispatch, hard gates post-execution, and an audit trail for everything in between.

We thought we had those gates. We were mostly wrong.

This is the story of what we found when we actually ran the commands.


The Setup

AEGIS runs on Cloudflare Workers. It manages a task queue, dispatches autonomous Claude Code sessions overnight, and runs a dreaming cycle that synthesizes patterns across everything it's learned. Tasks run without supervision. PRs get created. Code gets merged.

That autonomy requires enforcement — not suggestions, not logging, enforcement. We use Charter, a repo governance toolkit we built and open-sourced, to provide it. Charter does several things: blast radius analysis (how many files does this change transitively touch?), change classification (LOCAL, SURFACE, or CROSS_CUTTING?), drift detection against approved patterns, and metric constraint checking via .ai/ context modules.

Before last week, we were using two of those. The blast radius check was wired into the taskrunner — tasks classified as critical severity get blocked before execution, terminating immediately rather than burning a full agent run on something that was never going to be safe to auto-execute. The context injection was working — Charter generates a pre-digested repo brief that gets prepended to every agent prompt, cutting the cold-start overhead an agent would otherwise spend reading the codebase from scratch.

The constraint system? Installed. Dark.


The Empty Array

Someone sent us an AI-generated architecture assessment. Sharp framing: we were driving a Formula 1 car to the grocery store. The proposal was to wire every Charter command into a deterministic enforcement loop — classify before dispatch, blast radius gate, evidence check post-execution, surface breaches back to the agent as corrective prompts.

The implementation code looked solid. TypeScript interfaces. Sequence diagrams. Specific CLI flags.

We ran charter adf evidence --auto-measure --format json:

{
  "constraints": [],
  "allPassing": true,
  "failCount": 0
}

Empty. Every check passing trivially. A gate with no laws to enforce is just a door.

We went through the rest of the implementation code with the same discipline. Wrong field name — the code read const { classification } but the actual JSON field is suggestedClass. The blast radius type defined affected: string[] when the actual value is a number. The proposed decision matrix checked severity === 'CROSS_CUTTING' — but severity levels are low, medium, high, critical. That's a classification output, not a severity level. Dead branch.

The post-execution revert used git checkout -- . on an active branch mid-task. Everything gone.

An MCP client was built to call a tool named charter_classify. The registered tools are charter_blast, charter_brief, charter_context, charter_surface, getProjectContext, getProjectState, getRecentChanges, updateEvidence. No charter_classify. The function would always hit its catch block and silently return a fallback.

This is a pattern we've written about before: the gap between what sounds correct and what actually runs. The architecture was directionally right. The implementation had never been executed.


Finding the Spec

The constraint system requires constraints. Before writing any, we needed to understand the actual format — not from Charter's behavior, but from the source.

This is where we found something we hadn't fully processed: ADF — Attention-Directed Format, the module system Charter uses for .ai/ context files — has its own standalone specification. Separate org, vendor-neutral, Apache-2.0. Charter is the reference implementation. The spec is open for additional implementations.

Section 1.5 of the spec defines the constraint model:

"A metric entry key: value / ceiling [unit] declares a measured value against a ceiling. In a [load-bearing] section, value > ceiling is a violation. Tools that re-measure values (e.g. lines of code per file) MUST treat the document's stored value as a cache and the measured value as authoritative."

That's it. A key, a measured value, a ceiling, a unit. [load-bearing] means violations are hard failures. The --auto-measure flag in charter adf evidence counts actual lines in mapped files and substitutes the live count as the measured value.

Once we understood the spec instead of guessing at Charter's behavior, the constraint definition was mechanical. Two edits:

manifest.adf — add a METRICS block mapping logical keys to file paths (this is what --auto-measure reads):

📏 METRICS:
  dispatch_loc: web/src/kernel/dispatch.ts
  task_intelligence_loc: web/src/task-intelligence.ts
  self_improvement_loc: web/src/kernel/scheduled/self-improvement.ts
  claude_executor_loc: web/src/claude.ts
  infra_compliance_loc: web/src/kernel/scheduled/infra-compliance.ts

core.adf — add a METRICS [load-bearing] section with ceilings set at current LOC plus headroom. Ceiling below current LOC means the gate fails the unmodified codebase — that's not enforcement, that's chaos:

📐 METRICS [load-bearing]:
  dispatch_loc: 1039 / 1100 [lines]
  task_intelligence_loc: 517 / 580 [lines]
  self_improvement_loc: 346 / 400 [lines]
  claude_executor_loc: 413 / 470 [lines]
  infra_compliance_loc: 157 / 220 [lines]

Run the gate:

Constraints:
  [ok] dispatch_loc: 1039 / 1100 [lines] -- PASS
  [ok] task_intelligence_loc: 517 / 580 [lines] -- PASS
  [ok] self_improvement_loc: 346 / 400 [lines] -- PASS
  [ok] claude_executor_loc: 413 / 470 [lines] -- PASS
  [ok] infra_compliance_loc: 157 / 220 [lines] -- PASS

Verdict: PASS

Five real constraints. And immediately, a real signal: dispatch.ts at 1039 out of 1100 lines is sitting at 94% of its ceiling. Warn territory. That file is the dispatch loop for the entire cognitive kernel — the most critical and most frequently modified file in the codebase. The constraint system surfaced a live architectural debt signal on its first run.


The Parity Problem

We run tiered execution — Claude for deep reasoning, Workers AI for lightweight classification, Groq for fast triage. Our autonomous task architecture is heading the same direction: Claude Code subprocess today, Workers AI Dynamic Sandbox on the roadmap.

The original architecture proposal put Charter gates in the Edge Worker — the Cloudflare Workers process that receives HTTP requests and creates tasks. That doesn't work. Cloudflare Workers have no filesystem access. child_process doesn't exist. spawn('npx', ...) throws.

This forced a cleaner architectural answer than the proposal had reached: the enforcement needs to split by concern, not by environment.

Pre-dispatch gates (classify + blast radius) live in the taskrunner — a bash script on the host machine where charter is installed. They run before any executor gets the task, regardless of which executor will eventually run it. This is already the right layer: the autonomous pipeline post describes how tasks flow from queue to executor, and the taskrunner is the chokepoint.

Post-execution evidence is different. It needs to work everywhere. In the taskrunner, charter adf evidence --auto-measure --ci is the right call. But in a Workers AI Sandbox running in the cloud, there's no CLI, no filesystem, no git. The spec is simple enough — we just implemented it natively.

web/src/lib/adf-evidence.ts — 132 lines, zero dependencies, runs anywhere V8 runs:

export function evaluateAdfEvidence(
  adfContent: string,
  measured: Record<string, number>,
): EvidenceResult {
  // Parse METRICS [load-bearing] sections per ADF spec §1.5
  // value > ceiling = fail
  // value >= ceiling * 0.9 = warn
}

Same semantics as Charter's implementation. Conforms to the same spec. Works in Cloudflare Workers, Workers AI Sandbox, or anywhere else we route execution. Parity gap closed.


What We Wired

Session warm-start. A UserPromptSubmit hook now runs charter context-refresh --once before every Claude Code session. Charter reads live git state and open issues and writes them to .ai/context.adf. Every session starts warm. No more stale context from last week's committed state.

Post-edit budget check. A PostToolUse hook fires when any of the five tracked files is edited. If a ceiling is breached, it surfaces the violation inline. Non-blocking — the edit already happened — but it shows up immediately.

Taskrunner post-execution gate. After every successful autonomous task, run_adf_evidence_gate() runs in the task's repo. Budget breaches append [ADF_BUDGET_BREACH] to the task result, visible in the task feed and the daily brief. The task still completes — a LOC breach is a code quality signal, not a correctness failure. The agent may have written exactly right code. The ceiling is telling us the file needs decomposition.

Here's what a breach looks like in the task result:

[ADF_BUDGET_BREACH] LOC ceilings exceeded after task execution:
  FAIL dispatch_loc: 1150 / 1100 [lines]
Run: charter adf evidence --auto-measure

The gate is non-blocking by design. We've been burned before by gates that fail silently or fail opaquely. Audit trails matter — but only if they surface the signal clearly enough to act on.


The Broader Point About ADF

The spec being vendor-neutral changes the framing beyond just "how do we wire Charter."

If ADF gains adoption across AI coding tools — Claude Code, Cursor, Codex, any client that respects an AGENTS.md-style entry point — then the trigger-based module loading and load-bearing metric constraints in our .ai/ directory travel with the code. Any conforming agent that touches this repo gets the same context, the same constraints, the same enforcement model. Governance that travels with the code rather than living in the deployment config.

We're already seeing the shape of this: Charter compiles ADF modules to a flat AGENTS.md with a module index. Tools that understand the index get full progressive disclosure. Tools that don't still see an instruction list they can follow. Graceful degradation is a normative design goal in the spec, not an accident.

The Stackbilder scaffolding engine already produces .ai/ directories for new projects. Once those are ADF-conformant with real constraints, every project gets the enforcement model from day one — not bolted on after the first autonomous task bloats a file past the point of readability.


The Lesson

AI-generated architecture proposals require the same validation discipline as any other proposed architecture. Run the commands. Read the source. Check the types against actual output.

charter adf evidence with "constraints": [] is indistinguishable from charter adf evidence with five real constraints — until you look at the JSON. The gate exists either way. The laws do not.

The spec existing separately from the implementation is what made the native CF Workers parser clean. We didn't need to wrap the CLI, reverse-engineer behavior, or maintain a fork. We read the spec, implemented what we needed, and shipped 132 lines that conform to the same contract as the reference implementation.

That's what standards are for.


Charter is open source at github.com/Stackbilt-dev/charter and installable via npm i -g @stackbilt/cli.
The ADF specification lives at github.com/adf-spec/adf (Apache-2.0).
AEGIS runs at aegis.stackbilt.dev.
Stackbilder — AI-powered dev tooling on Cloudflare — is at stackbilder.com.

Written by Kurt Overmier & AEGIS. Published on The Roundtable.
Learn more at stackbilder.com →