
How QEL works: AI that proves its own code

Bodega One · 6 min read

Quick answer

QEL (Quality Enforcement Layer) is Bodega One's 3-level verification system that runs inside the agentic loop. Level 1 checks every file write, Level 2 runs a real compile every two writes, and Level 3 runs the full test suite before handoff. The agent proves its code compiles and passes tests before you ever see it.

Most AI coding tools generate and hope. They write code, hand it back, and trust that you'll catch what went wrong. That works fine for autocomplete. It's not acceptable for an autonomous agent running unsupervised across your codebase.

Bodega One's Quality Enforcement Layer (QEL) is the system that makes the agent responsible for its own work. Not responsible for trying, responsible for delivering something that compiles and passes your tests. Here's exactly how it runs.

What QEL actually is

QEL isn't a test suite you run manually after the agent finishes. It's a 3-level verification system built into the agentic loop itself. It runs on every write, not just at the end. By the time a result reaches you, the agent has already checked its own work at three distinct checkpoints.

The problem it solves

An autonomous agent doesn't answer one question. It reads files, writes code, runs commands, and makes decisions across dozens of steps. Each step introduces new surface area for mistakes: a missing import, a function that compiles but does the wrong thing, a partial edit that breaks something two files away.

Without verification at each step, the agent finishes and hands you something that looks done. QEL is what closes that gap.

How it works: two stages, three verification levels

Stage 1: Contract Extraction

Before the agent writes anything, your prompt is parsed into machine-checkable deliverables: expected files, structural patterns, and framework constraints. No LLM call. Runs in under 5ms. The output is a typed contract the rest of the pipeline uses to evaluate every write.

Stage 2: Iterative Tool Use

The agent works through read, write, and execute cycles. The loop has full visibility into what's done and what's still missing from the contract. When the agent drifts, real-time nudges redirect it toward the actual deliverables rather than letting it spiral.

Inside this loop, three verification levels run continuously.

Level 1: Incremental Verification (every write)

After every file write, a lightweight pattern and compile check runs against the contract. Broken imports, missing exports, and structural mismatches are caught mid-loop, while the agent can still fix them. Each write gets a confidence score, and any write below threshold (score < 70) is flagged immediately, before the next step starts.
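To make the scoring idea concrete, here is a toy version of a Level 1 check. The penalty values and the export-pattern heuristic are invented for illustration; only the threshold of 70 comes from the article.

```python
def score_write(source: str, expected_exports: list[str]) -> int:
    score = 100
    # Deduct points for each contract export the file fails to declare.
    for name in expected_exports:
        if (f"export function {name}" not in source
                and f"export const {name}" not in source):
            score -= 30  # assumed penalty weight
    return max(score, 0)

code = "export function add(a, b) { return a + b }"
score = score_write(code, ["add", "subtract", "multiply"])
print(score, "FLAGGED" if score < 70 else "ok")  # 40 FLAGGED
```

Because the check is pure string and pattern work, it can run on every single write without slowing the loop down.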

Level 2: Micro-Proof Gates (every second write)

Every two writes, a real compile command runs: tsc --noEmit for TypeScript, py_compile for Python. 10-second timeout. This catches errors that only surface when multiple files interact, the kind of bug a per-file check misses. If the gate fails, the loop pauses before writing more code.
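A micro-proof gate could be wired up roughly like this. The compile commands and the 10-second timeout come from the article; the gate counter and error handling are assumptions about how such a loop might be structured.

```python
import subprocess
import sys

def micro_proof(language: str, files: list[str]) -> bool:
    # Real compile commands per the article: tsc --noEmit / py_compile.
    if language == "python":
        cmd = [sys.executable, "-m", "py_compile", *files]
    else:  # typescript
        cmd = ["npx", "tsc", "--noEmit"]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=10)
    except subprocess.TimeoutExpired:
        return False  # a hung compile counts as a failed gate
    return result.returncode == 0

write_count = 0

def on_write(language: str, path: str) -> None:
    global write_count
    write_count += 1
    # Gate fires on every second write.
    if write_count % 2 == 0 and not micro_proof(language, [path]):
        raise RuntimeError(f"gate failed after {path}; pausing before more writes")
```

Running the real compiler, rather than another pattern check, is what surfaces cross-file errors that each file individually would pass.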

Level 3: Full Verification (post-loop)

When the agent believes the task is complete, the full verification suite runs: tsc --noEmit, pytest, py_compile, and the structural verifier against the original contract. Pass thresholds are 80 for new file creation and 50 for modifications.
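The threshold logic itself is simple enough to state directly. The two numbers (80 for new files, 50 for modifications) are from the article; everything feeding into the score is abstracted away here.

```python
def passes_full_verification(score: int, is_new_file: bool) -> bool:
    # Pass thresholds per the article: 80 for new file creation,
    # 50 for modifications to existing files.
    threshold = 80 if is_new_file else 50
    return score >= threshold

print(passes_full_verification(75, is_new_file=True))   # False: new files need 80
print(passes_full_verification(75, is_new_file=False))  # True: modifications need 50
```

The asymmetry makes sense: a brand-new file has no prior working state to fall back on, so the bar for shipping it is higher.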

If a gate fails, Targeted Repair kicks in: specific instructions per file, per line, describing exactly what's missing or broken. The agent patches the exact problem. The gate reruns. This is not "try again." It's a diagnostic with a fix.
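The shape of a Targeted Repair message might look like the sketch below. The field names (`file`, `line`, `problem`, `fix`) and the example failure are hypothetical; the article only specifies that instructions are per file and per line.

```python
def repair_instructions(failures: list[dict]) -> list[str]:
    # One actionable instruction per failure, anchored to file and line.
    return [f"{f['file']}:{f['line']}: {f['problem']} -> {f['fix']}"
            for f in failures]

failures = [{"file": "cart.ts", "line": 12,
             "problem": "missing export 'applyDiscount'",
             "fix": "add 'export' before the function declaration"}]
print(repair_instructions(failures)[0])
# cart.ts:12: missing export 'applyDiscount' -> add 'export' before the function declaration
```

The contrast with a generic retry is the anchor: the agent is told where and what, so the rerun targets the diagnosed defect instead of regenerating the file from scratch.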

What this means in practice

The mistake that would have shown up as a build error in your terminal gets caught at Level 1 or Level 2, before the agent writes another file on top of it. The deeper integration bug that only appears when components interact gets caught at Level 2. By the time Level 3 runs, you're verifying against a compiler and a test suite, not hoping.

Most AI coding tools don't have this. They generate well. They don't verify. QEL is the difference between an agent that ships code and an agent that proves its code.


Questions about QEL or how it behaves on your specific stack? Come find us on Discord. If you're picking a local model to run with the agent, our Ollama setup guide is a good starting point.

Common questions

What is the Quality Enforcement Layer in Bodega One?
QEL is a 3-level verification system built into the agentic loop that checks every code write. Level 1 runs lightweight pattern checks on each file, Level 2 runs a real compile every two writes, and Level 3 runs the full test suite before handoff. The agent proves its code compiles and passes tests before delivery.
How does Bodega One verify AI-generated code before showing it to me?
QEL uses Contract Extraction to parse your prompt into machine-checkable deliverables, then runs Iterative Tool Use with three verification levels. Each write gets flagged if it scores below threshold, real compiles catch multi-file errors, and full verification with pass thresholds of 80 for new files and 50 for modifications gates the final handoff.
What happens if QEL finds an error during verification?
If a gate fails at any level, Targeted Repair kicks in. The system generates specific instructions per file and per line describing exactly what is missing or broken. The agent patches the exact problem and the gate reruns. This is not a generic retry; it is a diagnostic with a fix attached.
When does Level 2 verification run?
Level 2 runs every two writes. It executes a real compile command with a 10-second timeout: tsc --noEmit for TypeScript, py_compile for Python. This catches errors that only surface when multiple files interact, the kind of integration bug a per-file check would miss. If the gate fails, the loop pauses before writing more code.

Ready to own your tools?

Beta is live now. Join the waitlist for full launch.