The problem
Multi-agent systems fail at the seams. One agent hands structured data to the next, and the second agent keeps running even when the shape is subtly wrong: a renamed field, a missing key, or a string where a list was expected. In the vendor-onboarding showcase, the intake agent renames data_access.contains_customer_pii to handles_personal_data and changes compliance.subprocessors from list[str] to a comma-separated string. The downstream security review can still run, but it is now making a decision from incomplete data.
The solution
reagent-flow treats every handoff as a contract and every tool call as a typed boundary. You declare the schema you expect; reagent-flow records what actually flowed and fails the test when they diverge, with an Agent Stack Trace pinpointing the exact field that drifted.
If the intake agent renames data_access.contains_customer_pii to handles_personal_data, the contract assertion fails at PR time, not after a risky vendor gets the wrong review.
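To make the idea of a handoff contract concrete, here is a minimal, library-independent sketch in plain Python. It is not reagent-flow's API: `contract_violations`, `VENDOR_CONTRACT`, and the subprocessor names are hypothetical and exist only for illustration. The field names and the two kinds of drift come from the vendor-onboarding example above.

```python
from typing import Any, get_args, get_origin


def contract_violations(payload: Any, schema: Any, path: str = "") -> list[str]:
    """Return one message per field where the payload diverges from the schema.

    Hypothetical helper for illustration only; not part of reagent-flow.
    """
    where = path or "<root>"

    # Nested dict schema: every declared key must exist and match its type.
    if isinstance(schema, dict):
        if not isinstance(payload, dict):
            return [f"{where}: expected dict, got {type(payload).__name__}"]
        problems: list[str] = []
        for key, expected in schema.items():
            child = f"{path}.{key}" if path else key
            if key not in payload:
                problems.append(f"{child}: missing")
            else:
                problems.extend(contract_violations(payload[key], expected, child))
        return problems

    # Typed list such as list[str]: check the container, then each item.
    if get_origin(schema) is list:
        if not isinstance(payload, list):
            return [f"{where}: expected list, got {type(payload).__name__}"]
        (item_type,) = get_args(schema)
        problems = []
        for index, item in enumerate(payload):
            problems.extend(contract_violations(item, item_type, f"{where}[{index}]"))
        return problems

    # Plain type such as bool or str.
    if not isinstance(payload, schema):
        return [f"{where}: expected {schema.__name__}, got {type(payload).__name__}"]
    return []


# The shape the security-review agent expects from the intake agent.
VENDOR_CONTRACT = {
    "data_access": {"contains_customer_pii": bool},
    "compliance": {"subprocessors": list[str]},
}

# What the intake agent sends after the drift (values are made up).
drifted = {
    "data_access": {"handles_personal_data": True},
    "compliance": {"subprocessors": "acme-cloud, mailer-inc"},
}

violations = contract_violations(drifted, VENDOR_CONTRACT)
for line in violations:
    print(line)
# data_access.contains_customer_pii: missing
# compliance.subprocessors: expected list, got str
```

In a pytest test, asserting that the returned list is empty would fail the run and name both drifted fields, which is the behavior the contract approach aims for.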
What this is
CI validation
Fail pull requests when agent handoff payloads or tool outputs drift from the contract you declared.
Local trace assertions
Run in pytest with local trace files. No hosted service or production runtime layer is required.
What this is not
Not runtime guardrails
reagent-flow does not block live traffic, re-ask models, or enforce production policies.
Not semantic verification
It checks declared contracts. It does not prove that every agent decision is correct.
What you get
Handoff contracts
Type-check the data passed between agents, with nested dicts, typed lists, and optional Pydantic support.
Tool output contracts
Validate the shape of every tool’s return value, catching upstream API drift before the downstream agent sees it.
Context preservation
Verify specific values (IDs, versions, user refs) survive multi-hop handoffs unchanged.
Flow assertions
Guarantee the tool-calling sequence you expect — order, repetition, forbidden calls.
Golden baseline diffs
Snapshot-test known-good traces and detect behavioral regressions from prompt tweaks.
Agent Stack Traces
Every failed assertion attaches a readable dump of the full tool-calling history.
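As a rough illustration of what a flow assertion checks (order, repetition, forbidden calls), the sketch below works on a plain list of recorded tool names. It is an assumption-laden stand-in, not reagent-flow's API: `flow_violations` and every tool name here are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def flow_violations(
    calls: list[str],
    expected_order: Iterable[str] = (),
    forbidden: Iterable[str] = (),
    max_calls: Optional[dict[str, int]] = None,
) -> list[str]:
    """Check a recorded tool-call sequence; hypothetical helper, not reagent-flow."""
    problems: list[str] = []

    # Order: expected tools must appear in this relative order (other calls
    # may be interleaved). `in` on an iterator consumes it up to the match.
    remaining = iter(calls)
    for tool in expected_order:
        if tool not in remaining:
            problems.append(f"order: {tool!r} missing or out of order")
            break

    # Forbidden calls: any occurrence is a violation.
    banned = set(forbidden)
    for tool in calls:
        if tool in banned:
            problems.append(f"forbidden: {tool!r} was called")

    # Repetition: cap how many times each listed tool may run.
    counts = Counter(calls)
    for tool, limit in (max_calls or {}).items():
        if counts[tool] > limit:
            problems.append(
                f"repetition: {tool!r} called {counts[tool]} times, limit {limit}"
            )
    return problems


# Tool names recorded during one (made-up) security-review run.
recorded = [
    "fetch_vendor_profile",
    "lookup_subprocessors",
    "lookup_subprocessors",
    "approve_vendor",
]

flow_problems = flow_violations(
    recorded,
    expected_order=["fetch_vendor_profile", "lookup_subprocessors", "approve_vendor"],
    forbidden=["delete_vendor"],
    max_calls={"lookup_subprocessors": 1},
)
print(flow_problems)
# ["repetition: 'lookup_subprocessors' called 2 times, limit 1"]
```

Returning a list of messages rather than raising on the first problem keeps every violation visible in a single test failure.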
Key concepts
| Concept | Description |
|---|---|
| Session | A context manager that records tool calls for one agent run; may declare a parent_trace_id and handoff_context to form a link in a multi-agent chain |
| Handoff context | The structured payload passed from one agent session to the next — the target of contract assertions |
| Trace | The full sequence of turns captured during a session |
| Contract | A declared schema ({field: type}) validated against a handoff or tool result |
| Golden baseline | A saved trace used as the expected behavior for future runs |
| Agent Stack Trace | A readable dump of every turn, attached to assertion failures |
Framework support
reagent-flow has a zero-dependency core with thin adapters for the major agent frameworks:
- OpenAI
- Anthropic
- LangChain
- LangGraph
- CrewAI
Get started
Install reagent-flow and write your first contract test in under 5 minutes.
Why not guardrails?
See how reagent-flow fits alongside structured outputs, guardrails, evals, and observability.