Introduction

The problem

Multi-agent systems fail at the seams. One agent hands structured data to the next, and the second agent keeps running even when the shape is subtly wrong: a renamed field, a missing key, or a string where a list was expected. In the vendor-onboarding showcase, the intake agent renames `data_access.contains_customer_pii` to `handles_personal_data` and changes `compliance.subprocessors` from `list[str]` to a comma-separated string. The downstream security review can still run, but it is now making a decision from incomplete data.
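To make the failure mode concrete, here is a minimal plain-Python sketch of the drift described above. It does not use reagent-flow. The field names come from the showcase; the payload values and the two downstream helpers (`needs_pii_review`, `subprocessor_count`) are hypothetical and exist only for illustration.

```python
# Payload shape the security review was written against.
expected_payload = {
    "vendor_name": "Acme Analytics",  # hypothetical value
    "data_access": {"contains_customer_pii": True},
    "compliance": {"subprocessors": ["aws", "snowflake"]},
}

# Payload after the intake agent drifts: one field renamed, one type changed.
drifted_payload = {
    "vendor_name": "Acme Analytics",
    "data_access": {"handles_personal_data": True},
    "compliance": {"subprocessors": "aws, snowflake"},
}


def needs_pii_review(payload: dict) -> bool:
    """Hypothetical downstream check that tolerates a missing key."""
    return bool(payload["data_access"].get("contains_customer_pii", False))


def subprocessor_count(payload: dict) -> int:
    """Hypothetical downstream check that assumes a list."""
    return len(payload["compliance"]["subprocessors"])
```

Neither helper raises on the drifted payload. `needs_pii_review` quietly returns `False`, and `subprocessor_count` counts the characters of the string instead of the subprocessors, which is the kind of silent degradation the rest of this page is about.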

The solution

reagent-flow treats every handoff as a contract and every tool call as a typed boundary. You declare the schema you expect; reagent-flow records what actually flowed and fails the test when they diverge — with an Agent Stack Trace pinpointing the exact field that drifted.

```python
# Security review receives a handoff from the intake agent.
security_session.assert_handoff_matches(schema={
    "vendor_name": str,
    "data_access": {
        "contains_customer_pii": bool,
        "data_categories": [str],
    },
    "compliance": {
        "subprocessors": [str],
        "dpa_required": bool,
    },
})
```

If the upstream agent renames `contains_customer_pii` to `handles_personal_data`, this assertion fails at PR time, not after a risky vendor gets the wrong review.
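For intuition about what a contract check of this kind has to do, the following is a rough standalone sketch of a recursive schema comparison. It is not reagent-flow's implementation or API: the `find_drift` function, its message format, and the sample payload values are assumptions made for illustration. It only borrows the schema notation shown above, where a nested dict describes nested fields and a one-element list such as `[str]` means "list of `str`".

```python
def find_drift(schema, payload, path=""):
    """Return human-readable mismatches between a schema and a payload."""
    problems = []
    if isinstance(schema, dict):
        if not isinstance(payload, dict):
            return [f"{path or '<root>'}: expected dict, got {type(payload).__name__}"]
        for key, sub_schema in schema.items():
            child = f"{path}.{key}" if path else key
            if key not in payload:
                problems.append(f"{child}: missing")
            else:
                problems.extend(find_drift(sub_schema, payload[key], child))
    elif isinstance(schema, list):
        # A one-element list such as [str] is read as "list of str".
        if not isinstance(payload, list):
            return [f"{path}: expected list, got {type(payload).__name__}"]
        for i, item in enumerate(payload):
            problems.extend(find_drift(schema[0], item, f"{path}[{i}]"))
    elif not isinstance(payload, schema):
        problems.append(
            f"{path}: expected {schema.__name__}, got {type(payload).__name__}"
        )
    return problems


schema = {
    "data_access": {"contains_customer_pii": bool},
    "compliance": {"subprocessors": [str]},
}
drifted = {
    "data_access": {"handles_personal_data": True},
    "compliance": {"subprocessors": "aws, snowflake"},  # hypothetical value
}
problems = find_drift(schema, drifted)
# problems == [
#     "data_access.contains_customer_pii: missing",
#     "compliance.subprocessors: expected list, got str",
# ]
```

Reporting a dotted path per mismatch is what lets a failure name the exact field that drifted rather than just saying the payload is invalid.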

What this is

CI validation

Fail pull requests when agent handoff payloads or tool outputs drift from the contract you declared.

Local trace assertions

Run in pytest with local trace files. No hosted service or production runtime layer is required.

What this is not

Not runtime guardrails

reagent-flow does not block live traffic, re-ask models, or enforce production policies.

Not semantic verification

It checks declared contracts. It does not prove that every agent decision is correct.

What you get

Handoff contracts

Type-check the data passed between agents, with nested dicts, typed lists, and optional Pydantic support.

Tool output contracts

Validate the shape of every tool’s return value, catching upstream API drift before the downstream agent sees it.

Context preservation

Verify specific values (IDs, versions, user refs) survive multi-hop handoffs unchanged.
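As an illustration of the idea, a context-preservation check can be thought of as comparing selected keys in every later hop against the first hop. The sketch below is plain Python, not reagent-flow's API; `lost_context`, the hop payloads, and the key names `vendor_id` and `policy_version` are all hypothetical.

```python
def lost_context(hops, keys):
    """Return (hop_index, key, expected, actual) for each value that changed or vanished."""
    first = hops[0]
    changes = []
    for index, hop in enumerate(hops[1:], start=1):
        for key in keys:
            if hop.get(key) != first.get(key):
                changes.append((index, key, first.get(key), hop.get(key)))
    return changes


# Three hypothetical handoff payloads, in the order they were passed along.
hops = [
    {"vendor_id": "v-1042", "policy_version": "2024.1"},  # intake
    {"vendor_id": "v-1042", "policy_version": "2024.1"},  # security review
    {"vendor_id": "v-1042", "policy_version": "2024.2"},  # final hop rewrote the version
]
changes = lost_context(hops, ["vendor_id", "policy_version"])
# changes == [(2, "policy_version", "2024.1", "2024.2")]
```

Comparing against the first hop rather than the previous one means a value that drifts gradually is still reported against its original.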

Flow assertions

Assert the tool-calling sequence you expect: order, repetition, and forbidden calls.
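The three kinds of flow check named here can be sketched over a plain list of recorded tool names. This is an illustrative standalone function, not reagent-flow's assertion API; `check_flow`, its parameters, and the tool names are hypothetical.

```python
from collections import Counter


def check_flow(calls, required_order=(), max_repeats=None, forbidden=()):
    """Return a list of violations for a recorded sequence of tool names."""
    violations = []

    # Order: required_order must appear as a subsequence of calls.
    remaining = iter(calls)
    for name in required_order:
        if name not in remaining:  # `in` consumes the iterator up to the match
            violations.append(f"order: {name!r} not called after the preceding required tools")
            break

    # Repetition: no tool may exceed its configured call limit.
    counts = Counter(calls)
    for name, limit in (max_repeats or {}).items():
        if counts[name] > limit:
            violations.append(f"repetition: {name!r} called {counts[name]} times, limit {limit}")

    # Forbidden: these tools must never appear.
    for name in forbidden:
        if name in counts:
            violations.append(f"forbidden: {name!r} was called")

    return violations


calls = ["lookup_vendor", "fetch_dpa", "fetch_dpa", "fetch_dpa", "approve_vendor"]
violations = check_flow(
    calls,
    required_order=["lookup_vendor", "fetch_dpa", "approve_vendor"],
    max_repeats={"fetch_dpa": 2},
    forbidden=["delete_vendor"],
)
# violations == ["repetition: 'fetch_dpa' called 3 times, limit 2"]
```

The order check treats the required tools as a subsequence, so unrelated calls in between do not cause a failure.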

Golden baseline diffs

Snapshot-test known-good traces and detect behavioral regressions from prompt tweaks.
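One way to picture a golden-baseline comparison is a line-by-line diff of two serialized traces. The sketch below uses only the standard library (`json` and `difflib`) and is not how reagent-flow necessarily stores or compares traces; the `diff_against_baseline` helper and the turn structure (`tool`, `args`) are assumptions. In practice the baseline would be loaded from a saved file; it is kept in memory here so the example is self-contained.

```python
import difflib
import json


def diff_against_baseline(baseline, current):
    """Return unified-diff lines between two traces, or [] when they match."""

    def render(trace):
        # One canonical JSON line per turn so key order cannot cause false diffs.
        return [json.dumps(turn, sort_keys=True) for turn in trace]

    return list(
        difflib.unified_diff(
            render(baseline),
            render(current),
            fromfile="golden",
            tofile="current",
            lineterm="",
        )
    )


golden = [
    {"tool": "lookup_vendor", "args": {"name": "Acme"}},
    {"tool": "fetch_dpa", "args": {}},
]
after_prompt_tweak = [
    {"tool": "lookup_vendor", "args": {"name": "Acme"}},
    {"tool": "approve_vendor", "args": {}},  # hypothetical regression
]
diff = diff_against_baseline(golden, after_prompt_tweak)
```

Here `diff` contains a `-` line for the `fetch_dpa` turn and a `+` line for the `approve_vendor` turn, while comparing `golden` with itself yields an empty list.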

Agent Stack Traces

Every failed assertion attaches a readable dump of the full tool-calling history.
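The general pattern of attaching call history to a failure can be sketched in a few lines. The output format, the `format_trace` and `assert_with_trace` helpers, and the turn fields (`tool`, `args`, `result`) below are hypothetical and are not reagent-flow's actual trace format.

```python
def format_trace(turns):
    """Render recorded turns as a numbered, human-readable dump."""
    lines = ["Agent Stack Trace (hypothetical format):"]
    for i, turn in enumerate(turns, start=1):
        lines.append(f"  #{i} {turn['tool']}({turn['args']}) -> {turn['result']}")
    return "\n".join(lines)


def assert_with_trace(condition, message, turns):
    """Raise AssertionError with the full call history appended to the message."""
    if not condition:
        raise AssertionError(f"{message}\n{format_trace(turns)}")


turns = [
    {"tool": "lookup_vendor", "args": {"name": "Acme"}, "result": "found"},
    {"tool": "fetch_dpa", "args": {"vendor_id": "v-1042"}, "result": "missing"},
]

try:
    assert_with_trace(False, "handoff drifted", turns)
except AssertionError as exc:
    failure_text = str(exc)
```

Because the history travels inside the exception message, it shows up in ordinary pytest output without any extra reporting setup.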

Key concepts

| Concept | Description |
| --- | --- |
| Session | A context manager that records tool calls for one agent run; may declare a `parent_trace_id` and `handoff_context` to form a link in a multi-agent chain |
| Handoff context | The structured payload passed from one agent session to the next; the target of contract assertions |
| Trace | The full sequence of turns captured during a session |
| Contract | A declared schema (`{field: type}`) validated against a handoff or tool result |
| Golden baseline | A saved trace used as the expected behavior for future runs |
| Agent Stack Trace | A readable dump of every turn, attached to assertion failures |

Framework support

reagent-flow has a zero-dependency core with thin adapters for the major agent frameworks:

OpenAI
Anthropic
LangChain
LangGraph
CrewAI

Get started

Install reagent-flow and write your first contract test in under 5 minutes.

Why not guardrails?

See how reagent-flow fits alongside structured outputs, guardrails, evals, and observability.
