Skip to main content

Agent Loop Spec

Every AI system has an agent loop — ChatGPT runs a loop that calls tools and streams responses, Copilot runs a loop that reads code and suggests completions, every agent framework implements the same message → model → tool → response cycle. So what's different here?

Nothing. That's the point. The Agent Loop is intentionally generic — a commodity component with no product-specific logic. Its job is to connect a model to tools and stream results back. Think of it as a hand. A hand can grip any tool — hammer, pen, scalpel, phone. It doesn't need to be rebuilt for each one. It doesn't know what it's holding or why. The brain decides what to pick up and what to do with it. The hand executes.

A skilled surgeon and an untrained person have the same hands. The difference is the brain directing them — the knowledge, the training, the judgment. Same here. What's different about any product built on this architecture isn't the Agent Loop — it's what's in Memory. Everything that makes a product unique — the methodology, the personality, the skills, the approval flow, the scope constraints — lives in Your Memory, not in the Agent Loop. The Agent Loop reads Your Memory through tools. The model finds instructions, context, and knowledge in the files. Product behavior emerges from what's in memory, not from custom agent loop code.

Why this matters: a generic agent loop can pick up any tool the ecosystem produces — MCP servers, CLI tools, new integrations — without modification. A product-specific one could only pick up product-shaped things. The moment you bake domain logic into the agent loop, you tie what the system does to how it works, and that's how you get locked in. The analogy: a Toyota engine and a Honda engine work the same way. What makes the car different is everything else — the body, the interior, the features, the driving experience. The Agent Loop is the boring, interchangeable part. That's a feature, not a limitation.

AI tool use collapsed the engineering cost of capabilities from "build across four layers" to "add a tool." The model is now good enough at deciding which tools to use and how to compose them without being told. This means the agent loop itself doesn't need to be smart — it just needs to reliably connect a model to tools and get out of the way.

The Agent Loop sits in the middle of the foundation's flow: clients connect through the Gateway, the Gateway routes to the Agent Loop, the Agent Loop calls models through the Model API and executes tools from the environment. Your Memory is the platform — the Agent Loop reads and writes to it through tools. The Gateway ↔ Agent Loop interface is defined in gateway-engine-contract.md (D137). The Agent Loop receives a messages array (system prompt + conversation history + current message) and metadata via POST /engine/chat, then returns an SSE stream of events (text-delta, tool-call, tool-result, done, error). Auth middleware authenticates requests before they reach the Agent Loop.

Related documents: foundation-spec.md (architecture overview, links to all component specs)


What the Agent Loop Does NOT Do

These are explicit boundaries. They exist to prevent product-specific logic from creeping into the Agent Loop over time.

ResponsibilityWhere It LivesNOT in the Agent Loop
Prompt assemblyThe model reads context files from Your Memory through toolsAgent Loop does not construct product-specific prompts
Skill executionYour Memory (skill files read and followed by the model)Agent Loop does not have a skill framework
Approval flowTools (write tools require confirmation) + model instructionsAgent Loop does not manage approval state
Scope constraintsTool configuration (which tools are available)Agent Loop does not enforce boundaries
Context loadingModel reads files through tools based on its instructionsAgent Loop does not decide what context to load
Conversation persistenceExternal storage (SQLite, files) accessed through tools or configurationAgent Loop does not own conversation storage
Context window managementModel/provider handles compaction, or instructions tell the model how to summarizeAgent Loop does not decide what to cut
Error recoveryAgent Loop reports failures; the model or caller decides what to doAgent Loop does not retry or recover
AuthenticationAuth layer sits in front of the Agent LoopAgent Loop does not authenticate requests
Product personalitySystem prompt and memory filesAgent Loop has no personality

What the Agent Loop Does

The Agent Loop runs a loop:

  1. Accept a message — from the Gateway (which routes requests from clients, API calls, scheduled triggers, or other agents)
  2. Send it to a model — along with a system prompt, tool definitions, and conversation history, through the Model API
  3. Execute tool calls — whatever the model decides to do, through the tool protocol
  4. Stream the response back — to the Gateway
  5. Repeat — until the model signals it's done

That's the complete behavior. Five steps.


What the Agent Loop Guarantees

  • Messages sent to the Agent Loop reach the model
  • Tool calls the model makes get executed
  • Responses stream back to the caller
  • The loop continues until the model signals completion. The Agent Loop MUST NOT impose a default iteration cap. Implementations MAY configure a safety bound as a deployment choice, but the Agent Loop itself never terminates the loop before the model signals done.
  • When the model emits text and tool calls in the same turn, the Agent Loop preserves the text in the assistant message for the next loop iteration. Streamed text is never silently discarded on tool continuation.
  • The model can dispatch multiple tool calls in parallel within a single loop — the Agent Loop executes them concurrently and returns all results

Concurrency

The Agent Loop is one brain with two hands — one model that can dispatch multiple tool executions in parallel. The model coordinates because it initiated all of them. It knows the left hand is reorganizing the filing cabinet, so it tells the right hand to wait before pulling from the same drawer.

This is the only concurrency the Agent Loop needs to handle: parallel tool execution within a single loop. The model calls several tools at once (read a file while a background task runs), the Agent Loop executes them concurrently, results come back, the model decides what's next.

Multi-actor concurrency is not an Agent Loop concern. When a collaborator or external agent accesses the same Memory, they bring their own system — their own agent loop, their own model calls. Two people don't share a brain. They each have their own brain and access the same filing cabinet through Auth. Memory is inert — it doesn't care who's reading it. Auth gates every request. Tools handle write conflicts. No shared agent loop state needed.

ScenarioWhat's happeningAgent Loop's role
Owner dispatches parallel tool callsOne model, multiple tool executionsExecute them concurrently, return results
Background task runs while owner chatsOne model orchestrating both via toolsSame — parallel tool execution within one loop
Owner runs multiple conversations simultaneouslySeparate requests, separate loops, shared Memory via toolsEach request gets its own loop — Agent Loop is stateless between loops
Collaborator accesses Memory simultaneouslySeparate system, separate agent loop, same MemoryNot this agent loop's concern — Auth + tools handle it
External agent connectsSeparate system, separate agent loop, same MemoryNot this agent loop's concern — Auth + tools handle it

The Agent Loop Contract

Per-Request Input (via Gateway)

FieldDescriptionRequired
MessageThe owner's message (or system trigger)Yes
Bootstrap messageMinimal startup instructions for the model (e.g., "Read AGENT.md for your instructions"). The model constructs its working context by reading Memory through tools.Yes
Conversation historyPrior messages in this conversationNo (first message has none)

Boot-Time Configuration (via runtime config)

FieldDescriptionSource
Tool definitionsWhat tools are availablePre-configured from tool sources (D143 — does not change per-request)
Provider configurationWhich model, which provider, API keyRuntime config + adapter (D143 — does not change per-request)

See configuration-spec.md for the full boot sequence (D143).

Output

FieldDescription
Streamed responseText generated by the model, delivered as it's produced
Tool call resultsResults of any tools the model called during the loop
Completion signalIndication that the model is done

Error output

ConditionAgent Loop behavior
Model unreachableReport failure to caller
Tool execution failsReport failure to model (model decides next step)
Provider timeoutReport failure to caller
Invalid inputReject with error

Decisions Made

#DecisionRationale
D39The Engine is a generic agent loop with no product-specific logicComposability. A generic engine can pick up any tool the ecosystem produces. A domain-specific engine can only pick up domain-shaped things. Lock-in comes from tying what you do to how the engine works.
D40Prompt assembly and skill execution live in Your Memory, not the EngineThe model reads skills and context through tools. Product behavior emerges from what's in the files, not from custom engine code. Same principle as D39 — don't tie intelligence to the loop.
D41The "harness" concept is replaced by Engine + the other components working togetherThe harness was an abstraction for something that's actually just Your Memory + Engine + Tools + Models. The agent isn't a component — it's what emerges when you connect the pieces.
D42Engine renamed from "Harness" — "Engine" reflects that it's a commodity componentEvery engine does the same thing. What makes the car different is everything around it. The Engine is the boring, interchangeable part.
D50Bootstrap prompt is Engine configuration — minimal seed that lets the model self-bootstrap from Your MemoryOne line: "Read the entry point in the current folder." Everything else the model discovers from Your Memory by following that instruction.
D108Provider API must remain a connector, not a toolThe model decides what tools to call — you can't use a tool to call the thing that decides which tools to use. Circular dependency. The Provider API is structurally different from tools.
D137Gateway ↔ Engine is a plain HTTP API contractPOST /engine/chat with messages array + metadata. Engine returns SSE stream. Auth middleware on path. See gateway-engine-contract.md.

Open Questions

None. The Agent Loop spec is intentionally complete as-is. If a question arises about behavior, the answer is almost certainly "that's not the Agent Loop's job — it belongs in Memory, Tools, or configuration."


Success Criteria

  • Agent Loop accepts messages from any source and streams responses
  • Agent Loop executes tool calls the model makes without knowing what the tools do
  • Agent Loop works with any model through the Model API
  • Agent Loop has zero product-specific code
  • Agent Loop can be replaced with a different agent loop without changing any other component
  • Swapping agent loops requires only changing the Agent Loop — Memory, Tools, Client, Auth, and Models are unaffected

Security Requirements

Per-component requirements from security-spec.md. Security-spec owns the "why" (D131); this section owns the "what" for the Agent Loop.

  • The Agent Loop must never store credentials, API keys, or tokens in its own state
  • The Agent Loop must not persist data between loops — each loop starts clean (the model reconstitutes from Memory)
  • Tool call results must be passed to the model without modification — the Agent Loop must not inject, filter, or alter tool results
  • The Agent Loop must report tool execution failures to the model, not silently retry or recover
  • The Agent Loop must enforce configured timeouts on tool calls — a slow or hung tool cannot block the loop indefinitely

Changelog

DateChangeSource
2026-03-01"No users, only owners" language pass: user → ownerOwnership model alignment (Dave W + Claude)
2026-03-01Codex cross-reference audit fix: Boot-time configuration table cited D137 (Gateway↔Engine contract) for tool/provider config — corrected to D143 (configuration spec).Codex audit (Dave W + Claude)
2026-03-01Removed V1 Implementation section (Level 2 product detail — Vercel AI SDK, MCP SDK, build estimates, VoltAgent fallback). "Background gardener" → "Background task" in concurrency table. Added concurrency row for multiple simultaneous conversations (same owner, separate requests, Engine stateless between loops).L1 cleanup (Dave W + Claude)
2026-02-27Reordered sections (why → what → how → reference). Merged "What the Engine Is", "How the Engine Fits", and "Why the Engine Is Thin" into single "How we define the Engine" opener. Collapsed related docs table to single line. Moved "Does NOT Do" up as key conceptual boundary.Spec reorder + trim (Dave W + Claude)
2026-02-27Added Security Requirements section — cross-referenced from security-spec.md per T-219T-219 (Dave W + Claude)
2026-02-23Initial Engine spec created from interviewEngine interview session (Dave W + Claude)
2026-02-23Consistency pass — added V1 Implementation section (Build Our Own with Vercel AI SDK + MCP TypeScript SDK), aligned selection criteria with foundation-spec.md (added Must-Have Architecture #8/#11, Nice-to-Have #12/#13, Does NOT Need #6), updated stale D30 reference to D39-D42, replaced stale "updates deferred" note with reconciliation confirmationCross-doc consistency audit (Dave W + Claude)
2026-02-25Restructured section order — what it is → how it fits → what it does → why it's thin. Removed Selection Criteria (decision made — Build Our Own, documented in V1 Implementation). Removed Harness rename history (rename complete, all docs updated).Spec cleanup (Dave W + Claude)
2026-03-17Guarantees updated: Agent Loop MUST NOT impose default iteration cap (implementations MAY configure a safety bound). Text preservation on tool continuation added as guarantee.Drift remediation — MVP Build Review (Dave W + Dave J + Claude)

The Agent Loop is intentionally the thinnest spec in the project. The value is in Your Memory, not in the loop that reads it. The Agent Loop's job is to bring the system alive and get out of the way.