MACP End-to-End Flow

Comprehensive walkthrough of how all MACP components work together — from protocol specification through Rust runtime, NestJS control plane, and TypeScript/Python SDKs

Status: Non-normative (explanatory). In case of conflict, the referenced RFCs are authoritative.

References: RFC-MACP-0001 | RFC-MACP-0002 | RFC-MACP-0003 | RFC-MACP-0004 | RFC-MACP-0005 | RFC-MACP-0006 | RFC-MACP-0012

Imagine three AI agents sitting around a virtual table. One is an architect, another a security reviewer, and the third a cost optimizer. They need to agree on a deployment strategy for a critical Q3 release. Each has its own expertise, its own biases, and its own definition of "good enough." Left to their own devices, they might negotiate forever, contradict each other, or worse — two of them might reach one conclusion while the third reaches another.

This is the problem MACP solves. And this document is the story of how it solves it — tracing a coordination request from the moment an agent first introduces itself, through session creation, deliberation, voting, resolution, and all the way to replay and audit. Along the way, we will meet every layer of the system and understand not just what it does, but why it was designed that way.


The Four Layers: Why Architecture Matters

Think of MACP as a layered system built around a single conviction: when autonomous agents need to produce one binding outcome, the rules of engagement cannot be left to convention. They must be enforced.

At the bottom sits a protocol specification — twelve RFCs that define exactly what "coordination" means. Above that, a Rust runtime acts as the impartial referee, enforcing every rule the spec defines. The control plane adds orchestration and observability, turning raw protocol events into something a human operator can watch in real-time. And at the top, TypeScript and Python SDKs give agent developers a typed, ergonomic interface so they never have to think about envelope serialization or gRPC plumbing.

flowchart TB
    subgraph Clients["Agent Layer"]
        TS["TypeScript SDK"]
        PY["Python SDK"]
        UI["UI / API Consumer"]
    end

    subgraph CP["Control Plane — NestJS"]
        API["REST API + SSE"]
        Executor["Run Executor"]
        Normalizer["Event Normalizer"]
        Projection["Projection Engine"]
        DB[(PostgreSQL)]
    end

    subgraph RT["MACP Runtime — Rust"]
        Kernel["Coordination Kernel"]
        Modes["Mode Registry"]
        Policy["Policy Evaluator"]
        Storage["Storage Backend\nfile / rocksdb / redis"]
    end

    subgraph Spec["Protocol Specification"]
        RFCs["RFCs 0001–0012"]
        Schemas["Protobuf + JSON Schemas"]
        Registries["Mode · Policy · Error Registries"]
    end

    UI -->|"HTTP / SSE"| API
    API --> Executor
    Executor -->|"gRPC bidirectional stream"| Kernel
    Normalizer --> Projection --> DB
    TS -->|"gRPC"| Kernel
    PY -->|"gRPC"| Kernel
    Kernel --> Modes --> Policy --> Storage
    Spec -.->|"defines contracts for"| RT
    Spec -.->|"defines contracts for"| CP
    Spec -.->|"defines contracts for"| TS
    Spec -.->|"defines contracts for"| PY

This layered design also means there are two distinct ways to use the system, depending on how much infrastructure you want in the loop:

  • SDK-direct — Agents connect straight to the runtime via gRPC and manage their own session lifecycle. This is lightweight, fast, and requires no control plane at all. It is ideal for agent-to-agent coordination where no human needs to watch what is happening.
  • Control-plane-mediated — A UI or API consumer submits an ExecutionRequest to the control plane, which opens a runtime session on the agents' behalf, sends kickoff messages, streams every event through a normalization pipeline, and builds real-time projections for the UI. This is the path you take when observability, audit, and replay matter.

Both patterns use the same runtime, the same protocol, and the same Protobuf wire format defined in RFC-MACP-0006. The control plane is an addition, not a replacement.


Introducing Themselves: Agent Creation and Registration

Before any coordination can happen, agents need to introduce themselves. In the physical world, you would exchange business cards. In MACP, agents publish a manifest — a structured declaration of who they are, what they can do, and how to reach them.

What goes into a manifest

An agent manifest (RFC-MACP-0005) answers a handful of essential questions: What is your name? What can you do? What coordination modes do you support? What data formats do you speak?

| Field | Required | Description |
| --- | --- | --- |
| agent_id | Yes | Unique identifier |
| title | Yes | Human-readable name |
| description | Yes | What this agent does |
| supported_modes | Yes | Array of mode identifiers the agent can participate in |
| input_content_types | Yes | MIME types the agent accepts |
| output_content_types | Yes | MIME types the agent produces |
| transport_endpoints | No | Array of { transport, uri, content_types } |
| metadata | No | Arbitrary key-value pairs |

Think of our architect agent: its manifest might declare support for macp.mode.decision.v1 and macp.mode.proposal.v1, accept application/json input, and produce application/json output. The security reviewer might support the same modes but also list macp.mode.quorum.v1 — because in its world, some decisions require a formal approval threshold.
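To make the field table concrete, here is what the architect's manifest might look like as a TypeScript value. The `AgentManifest` interface below is a local sketch derived from the table above, not a type exported by the SDK:

```typescript
// Illustrative manifest for the architect agent, mirroring the field
// table above. These type definitions are a local sketch, not the SDK's.
interface TransportEndpoint {
  transport: string;
  uri: string;
  content_types: string[];
}

interface AgentManifest {
  agent_id: string;
  title: string;
  description: string;
  supported_modes: string[];
  input_content_types: string[];
  output_content_types: string[];
  transport_endpoints?: TransportEndpoint[];
  metadata?: Record<string, string>;
}

const architectManifest: AgentManifest = {
  agent_id: "architect-agent",
  title: "Deployment Architect",
  description: "Proposes and evaluates deployment strategies",
  supported_modes: ["macp.mode.decision.v1", "macp.mode.proposal.v1"],
  input_content_types: ["application/json"],
  output_content_types: ["application/json"],
  transport_endpoints: [
    // Endpoint URI is hypothetical.
    { transport: "grpc", uri: "architect.internal:50051", content_types: ["application/json"] },
  ],
};
```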

How agents discover each other

Discovery is the moment agents learn who else is out there. MACP supports four mechanisms, ordered from simplest to most sophisticated:

  1. Well-known URL — https://<host>/.well-known/macp.json returns the manifest as JSON. Simple, cacheable, works everywhere.
  2. GetManifest RPC — Programmatic discovery via gRPC. Pass an empty agent_id to get the serving runtime's own manifest.
  3. ListModes RPC — Returns only standards-track modes (macp.mode.decision.v1, etc.), useful for capability probing.
  4. Registry services — Organizations can index manifests for fleet-wide discovery, letting agents find each other across deployment boundaries.

// TypeScript — discover runtime capabilities
const manifest = await client.getManifest();
console.log(manifest.supportedModes); // ['macp.mode.decision.v1', ...]

const modes = await client.listModes();
// Returns ModeDescriptor[] with name, version, message types, determinism class

# Python — discover runtime capabilities
manifest = client.get_manifest()
print(manifest.supported_modes)

modes = client.list_modes()

With manifests published and discovery complete, our three agents now know about each other. The architect knows the security reviewer can participate in Decision mode. The cost optimizer knows the runtime supports the governance policies it needs. It is time to connect.


Connecting to the Runtime: SDK Initialization

Both SDKs follow a two-layer design that keeps things clean. At the bottom, a low-level MacpClient handles gRPC transport, authentication, and connection management. On top of that, high-level mode sessions (DecisionSession, ProposalSession, and so on) provide a typed, ergonomic API for each coordination mode. You never have to manually construct a Protobuf envelope if you do not want to.

Creating a client

The first step for any agent is creating a client connection to the runtime. Here is what that looks like:

// TypeScript
import { MacpClient, Auth } from '@macp/sdk';

const client = new MacpClient({
  address: '127.0.0.1:50051',
  secure: false,                    // TLS required in production
  auth: Auth.bearer('my-token'),    // or Auth.devAgent('agent-id')
  defaultDeadlineMs: 10_000,
});

# Python
from macp_sdk import MacpClient, AuthConfig

client = MacpClient(
    target="127.0.0.1:50051",
    secure=False,
    auth=AuthConfig.for_bearer("my-token"),  # or AuthConfig.for_dev_agent("agent-id")
    default_timeout=10.0,
)

The handshake: version and capability negotiation

Before anything else happens, the client and runtime perform a handshake — a version and capability negotiation defined in RFC-MACP-0001. This is not just a formality. The handshake establishes which protocol version both sides will speak, which optional features are available, and which coordination modes the runtime has loaded. Without it, neither side can make any assumptions about what the other supports.

sequenceDiagram
    participant Agent as SDK Client
    participant RT as Runtime

    Agent->>RT: Initialize(client_name, client_version, capabilities)
    RT->>Agent: InitializeResult(selected_version, runtime_info, supported_modes, capabilities)

    Note over Agent,RT: Capabilities negotiated:<br/>sessions.stream, cancellation,<br/>progress, manifest, mode_registry,<br/>roots, policy_registry

The runtime responds with its supported protocol version, the list of available modes, and which optional capabilities it supports. The SDK stores these for the lifetime of the client — no need to re-negotiate on every call.

const init = await client.initialize();
// init.runtimeInfo — { name, version }
// init.supportedModes — ModeDescriptor[]
// init.capabilities — { sessions, cancellation, progress, ... }

At this point, our architect agent has a live connection to the runtime. It knows the runtime supports Decision mode v1, that streaming is available, and that the policy registry is loaded. Now the question becomes: who kicks off the coordination? In many real-world scenarios, the answer is the control plane.


Orchestrating the Run: Control Plane Lifecycle

When coordination is mediated through the control plane — as it often is in production deployments where humans need visibility — the process follows a managed run lifecycle with well-defined state transitions. A "run" is the control plane's concept of a single coordination episode, from the moment someone requests it to the moment it completes, fails, or is cancelled.

The run state machine

The state machine is deliberately simple. Runs can only move forward — there is no going back from failed to running, and no way to resurrect a cancelled run. This simplicity is a feature: it makes the system easy to reason about and impossible to put into an inconsistent state.

stateDiagram-v2
    [*] --> queued: POST /runs
    queued --> starting: executor picks up
    starting --> binding_session: runtime session opened
    binding_session --> running: kickoff messages sent
    running --> completed: session resolved
    running --> failed: runtime error / stream lost
    running --> cancelled: POST /runs/:id/cancel
    starting --> failed: runtime unavailable
    binding_session --> failed: session start rejected
    completed --> [*]
    failed --> [*]
    cancelled --> [*]
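The forward-only transitions in the diagram can be captured in a small lookup table. This is a sketch to illustrate the invariant (terminal states have no outgoing edges), not the control plane's actual implementation:

```typescript
// Forward-only run state machine from the diagram above.
// Terminal states (completed, failed, cancelled) allow no transitions out.
type RunState =
  | "queued" | "starting" | "binding_session" | "running"
  | "completed" | "failed" | "cancelled";

const TRANSITIONS: Record<RunState, RunState[]> = {
  queued: ["starting"],
  starting: ["binding_session", "failed"],
  binding_session: ["running", "failed"],
  running: ["completed", "failed", "cancelled"],
  completed: [],   // terminal
  failed: [],      // terminal: no resurrection
  cancelled: [],   // terminal
};

function canTransition(from: RunState, to: RunState): boolean {
  return TRANSITIONS[from].includes(to);
}
```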

From request to coordination: the execution flow

Let us follow what happens when a human operator (or an automated pipeline) submits a coordination request through the control plane API. The sequence is precise, and every step has a reason:

sequenceDiagram
    participant UI as API Consumer
    participant API as Control Plane API
    participant Exec as Run Executor
    participant Mgr as Run Manager
    participant RTP as Runtime Provider
    participant RT as MACP Runtime

    UI->>API: POST /runs (ExecutionRequest)
    API->>Exec: launch()
    Exec->>Mgr: createRun() → queued
    Exec->>Mgr: markStarted() → starting
    Exec->>RTP: initialize()
    RTP->>RT: Initialize RPC
    RT-->>RTP: InitializeResult
    Exec->>RTP: openSession()
    RTP->>RT: StreamSession (bidirectional gRPC)
    RT-->>RTP: SessionStart Ack
    Exec->>Mgr: bindSession() → binding_session
    Exec->>RTP: send kickoff messages
    RTP->>RT: Send envelopes
    Exec->>Mgr: markRunning() → running

    loop Event stream
        RT-->>RTP: Accepted envelopes
        RTP-->>Exec: Raw events
        Exec->>Exec: Normalize → Canonical events
        Exec->>Exec: Update projection
    end

    RT-->>RTP: Session resolved
    Exec->>Mgr: markCompleted() → completed
    API-->>UI: SSE stream / GET /runs/:id/state

What goes into an ExecutionRequest

The control plane needs to know everything up front. The ExecutionRequest is a fully resolved specification of what coordination should happen, containing:

  • mode — Which coordination mode to use (e.g., macp.mode.decision.v1)
  • runtime — Runtime address and kind (rust)
  • session — Participant list, TTL, policy version, context
  • kickoff — Array of initial messages to send after session creation
  • execution — Mode (live, replay, sandbox), tags, metadata

There is a design philosophy at work here: the control plane never makes assumptions. It does not guess which mode you want or which agents should participate. Everything is declared explicitly, making runs reproducible and auditable.
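Assembled from the fields listed above, an ExecutionRequest for our Q3 deployment scenario might look like this. The exact TypeScript shape and nested field names are illustrative assumptions:

```typescript
// An illustrative ExecutionRequest for the Q3 deployment decision.
// Field names follow the bullet list above; the precise shape is assumed.
const request = {
  mode: "macp.mode.decision.v1",
  runtime: { kind: "rust", address: "127.0.0.1:50051" },
  session: {
    participants: ["architect-agent", "security-agent", "cost-agent"],
    ttl_ms: 120_000,
    policy_version: "policy.majority",
    context: {},
  },
  kickoff: [
    // Initial message sent after the session is bound.
    { message_type: "Proposal", payload: { proposalId: "p1", option: "Blue-green deploy" } },
  ],
  execution: { mode: "live", tags: ["q3-release"], metadata: {} },
};
```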

Persistence: everything gets recorded

The control plane persists everything to PostgreSQL. This is not optional — it is fundamental to the system's ability to provide observability, replay, and audit. Here is what gets stored:

| Table | Purpose |
| --- | --- |
| runs | Run metadata, status, timing, error info |
| runtime_sessions | Bound session metadata, mode, capabilities |
| run_events_raw | Raw runtime events (append-only) |
| run_events_canonical | Normalized events for UI consumption |
| run_projections | Current state cache (built from events) |
| run_metrics | Token usage, cost estimates, event counts |
| run_artifacts | Trace bundles, logs, generated reports |

The separation between run_events_raw and run_events_canonical is worth noting. Raw events are preserved exactly as the runtime emitted them — they are the source of truth. Canonical events are a normalized, UI-friendly representation. By keeping both, the system can always re-derive canonical events from raw ones, which matters for replay and debugging.
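Because canonical events are a pure function of raw ones, re-derivation is just a mapping step. The sketch below illustrates the idea; the field names on both event shapes are hypothetical, and the real normalizer is considerably richer:

```typescript
// Minimal sketch of re-deriving a canonical event from a raw runtime
// event. All field names here are hypothetical.
interface RawEvent {
  seq: number;
  kind: string;
  payload: unknown;
  emitted_at_ms: number;
}

interface CanonicalEvent {
  seq: number;
  type: string;
  at: string;     // ISO-8601 timestamp for UI consumption
  data: unknown;
}

function normalize(raw: RawEvent): CanonicalEvent {
  return {
    seq: raw.seq,
    type: raw.kind.toLowerCase(),
    at: new Date(raw.emitted_at_ms).toISOString(),
    data: raw.payload,
  };
}
```

Since `normalize` has no side effects, replaying the raw log through it always reproduces the same canonical stream.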


Opening the Session: Where Coordination Begins

Now we arrive at the heart of the protocol. A coordination session begins with a SessionStart message — and this is the most validated message in the entire system. The reason is simple: everything that follows depends on the session being correctly configured. A bad session start would cascade into invalid state transitions, policy mismatches, and non-deterministic replays. So the runtime checks everything.

The validation gauntlet

When a SessionStart arrives, it passes through twelve validation steps before the session is created. Each step catches a different class of error, and the order matters — cheap checks (authentication, rate limiting) come before expensive ones (mode resolution, policy lookup).

sequenceDiagram
    participant Agent
    participant RT as Runtime

    Agent->>RT: Send(Envelope with SessionStart)
    RT->>RT: 1. Authenticate sender (bearer / mTLS / JWT)
    RT->>RT: 2. Derive sender identity from auth context
    RT->>RT: 3. Rate limit check (60 SessionStart/min default)
    RT->>RT: 4. Validate envelope structure (macp_version, mode, message_type)
    RT->>RT: 5. Validate session_id format (UUID v4/v7, ≥128 bits entropy)
    RT->>RT: 6. Check session_id not already in use
    RT->>RT: 7. Validate SessionStartPayload
    RT->>RT: 8. Resolve mode (must be registered)
    RT->>RT: 9. Resolve policy (policy_version → registry lookup)
    RT->>RT: 10. Create session: OPEN state
    RT->>RT: 11. Append to storage (commit point)
    RT->>RT: 12. Call mode.on_session_start()
    RT->>Agent: Ack(ok=true, session_state=OPEN)
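Step 5 of the gauntlet can be sketched as a simple format check: only UUID v4 or v7 session IDs are accepted. This regex is an illustration of the rule, not the runtime's actual validator:

```typescript
// Sketch of step 5: accept only UUID v4 or v7 session identifiers.
// Version nibble must be 4 or 7; variant nibble must be 8, 9, a, or b.
const UUID_V4_OR_V7 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[47][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isValidSessionId(id: string): boolean {
  return UUID_V4_OR_V7.test(id);
}
```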

What goes into a SessionStart

The SessionStartPayload declares everything the session needs to function. Some fields are required — you cannot start a session without participants or a TTL. Others are optional but powerful, like binding a governance policy or freezing ambient context.

| Field | Required | Description |
| --- | --- | --- |
| participants | Yes | Non-empty list of declared participant identifiers |
| mode_version | Yes | Semantic version of the mode (immutable for session) |
| configuration_version | Yes | Configuration profile version (immutable for session) |
| ttl_ms | Yes | Session deadline in milliseconds (1 – 86,400,000) |
| policy_version | No | Governance policy identifier; empty resolves to policy.default |
| context | No | Frozen context bound at session creation |
| roots | No | Root descriptors for ambient context |
| intent | No | Human-readable session purpose |

Version binding: the key to determinism

Here is a design decision that permeates the entire system. Three versions are immutably bound at session creation and cannot change for the session's lifetime:

  1. mode_version — Which semantic profile of the mode to use
  2. configuration_version — Voting/evaluation/acceptance profile
  3. policy_version — Governance rules

Why immutable? Because of deterministic replay. If you replay the same accepted history under the same bound versions, the runtime MUST produce identical state transitions. If versions could change mid-session, replay would be meaningless — you could never be sure whether a different outcome was caused by different agent behavior or different runtime configuration.

Starting our deployment decision

Let us return to our running example. The architect agent decides it is time to coordinate on the Q3 deployment strategy. Here is what that looks like through the SDKs:

// TypeScript — start a Decision session
const session = new DecisionSession(client, {
  modeVersion: '1.0.0',
  configurationVersion: '1.0.0',
  policyVersion: 'policy.majority',
  auth: Auth.bearer('coordinator-token'),
});

await session.start({
  intent: 'Choose deployment strategy for Q3 release',
  participants: ['architect-agent', 'security-agent', 'cost-agent'],
  ttlMs: 120_000, // 2 minutes
});

# Python — start a Decision session
session = DecisionSession(client, policy_version="policy.majority")

session.start(
    intent="Choose deployment strategy for Q3 release",
    participants=["architect-agent", "security-agent", "cost-agent"],
    ttl_ms=120_000,
)

Notice the policy.majority policy version. This tells the runtime to use majority voting rules when the time comes to evaluate a commitment. The architect agent has declared that a simple majority is enough — the cost optimizer does not get veto power. This is governance embedded in the protocol, not left to ad-hoc agent logic.

The session is now OPEN. Our three agents have two minutes to reach a decision.


The Admission Pipeline: Every Message Earns Its Place

With the session open, agents can start sending messages — proposals, evaluations, votes. But not every message gets through. Every single message from a participant passes through a strict admission pipeline before it can enter the session's accepted history. This is where the runtime earns its role as an impartial referee.

The pipeline, step by step

The pipeline is a chain of checks, each one acting as a gate. Fail any gate, and the message is rejected with a structured error. Pass them all, and the message is appended to the session's authoritative log.

flowchart LR
    A["Incoming\nEnvelope"] --> B["AuthN\nbearer / mTLS\nJWT / dev-header"]
    B --> C["Sender\nDerivation"]
    C --> D["Rate\nLimiting"]
    D --> E["Envelope\nValidation"]
    E --> F["Session\nLookup"]
    F --> G["Session\nOPEN?"]
    G --> H["Deduplication\nmessage_id"]
    H --> I["Participant\nCheck"]
    I --> J["Mode\nAuthorization"]
    J --> K["Append to\nLog"]
    K --> L["Mode\nDispatch"]

Two steps in this pipeline deserve special attention.

Authentication: you are who the runtime says you are

The runtime supports multiple authentication mechanisms (RFC-MACP-0004), from bearer tokens for typical production use to mTLS for high-security deployments:

| Mechanism | Header | Use Case |
| --- | --- | --- |
| Bearer token | Authorization: Bearer <token> | Production — tokens issued by control plane |
| mTLS | TLS client certificate | High-security deployments |
| JWT / OIDC | Authorization: Bearer <jwt> | Federated identity |
| Dev header | x-macp-agent-id: <id> | Local development only |

Here is a design choice that matters enormously: the sender field in the Envelope is always overwritten by the runtime from the authenticated identity. Agents cannot self-assert their sender. Period. This single rule eliminates an entire class of impersonation attacks. When the security reviewer sees a proposal from "architect-agent," it knows the runtime verified that identity — not just that someone claimed to be the architect.
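The rule is mechanical: whatever the envelope claims, the runtime replaces `sender` with the authenticated identity. A sketch of that step, with illustrative types rather than the runtime's actual ones:

```typescript
// Sender derivation: the self-asserted `sender` on the envelope is
// discarded unconditionally in favor of the authenticated identity.
// Types and names here are illustrative.
interface Envelope {
  sender: string;
  message_type: string;
  payload: unknown;
}

interface AuthContext {
  identity: string; // derived from bearer token, mTLS cert, or JWT
}

function deriveSender(env: Envelope, auth: AuthContext): Envelope {
  return { ...env, sender: auth.identity };
}
```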

Per-token authorization

Each bearer token carries authorization metadata that constrains what the agent can do:

{
  "token": "abc123...",
  "sender": "architect-agent",
  "allowed_modes": ["macp.mode.decision.v1", "macp.mode.task.v1"],
  "can_start_sessions": true,
  "max_open_sessions": 10
}

Rate limiting: preventing runaway agents

Default limits enforced per authenticated sender keep any single agent from overwhelming the system:

| Limit | Default |
| --- | --- |
| SessionStart messages per minute | 60 |
| Session-scoped messages per minute | 600 |
| Maximum payload size | 1 MB |

In a world of autonomous agents, rate limiting is not just about fairness — it is about safety. An agent stuck in a retry loop should not be able to saturate the runtime and starve other sessions.
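A per-sender limit like "60 SessionStart messages per minute" can be enforced with a fixed-window counter. This is one possible algorithm, shown for illustration; the runtime's actual limiter may differ:

```typescript
// Fixed-window rate limiter sketch for per-sender limits.
// Each sender gets `limit` admissions per `windowMs` window.
class RateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(sender: string, nowMs: number): boolean {
    const entry = this.counts.get(sender);
    // No entry yet, or the window has elapsed: start a fresh window.
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      this.counts.set(sender, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false; // over budget: reject
    entry.count += 1;
    return true;
  }
}
```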

Mode-specific authorization: not everyone can do everything

Each coordination mode defines precisely who can send which message types. This is not a suggestion — it is enforced by the runtime at the protocol level:

| Mode | Message Type | Authorized Sender |
| --- | --- | --- |
| Decision | Proposal, Evaluation, Objection, Vote | Any declared participant |
| Decision | Commitment | Session initiator (default) |
| Proposal | Proposal, CounterProposal, Accept, Reject | Any participant |
| Proposal | Withdraw | Author of referenced proposal only |
| Task | TaskRequest | Session initiator |
| Task | TaskUpdate, TaskComplete, TaskFail | Active assignee only |
| Handoff | HandoffOffer, HandoffContext | Current responsibility owner |
| Handoff | HandoffAccept, HandoffDecline | Target participant of offer |
| Quorum | Approve, Reject, Abstain | Any eligible declared participant |
| Quorum | ApprovalRequest, Commitment | Session initiator |

Back in our scenario: the cost optimizer cannot unilaterally commit the decision — only the architect (as session initiator) can do that. The security reviewer can propose, evaluate, and vote, but it cannot commit. These rules are not configurable per-session — they are baked into the mode definition. This is by design: mode semantics should be predictable and auditable, not customized into unpredictability.
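The Decision-mode rows of the table reduce to a two-branch check. A sketch of that rule as a standalone function (names are illustrative; the real rules live inside the mode implementation):

```typescript
// Decision-mode authorization sketch: Commitment is initiator-only;
// Proposal, Evaluation, Objection, and Vote are open to any declared
// participant. Illustrative names, not the runtime's actual code.
function authorizeDecisionSender(
  messageType: string,
  sender: string,
  participants: string[],
  initiator: string,
): boolean {
  if (messageType === "Commitment") return sender === initiator;
  const participantTypes = ["Proposal", "Evaluation", "Objection", "Vote"];
  return participantTypes.includes(messageType) && participants.includes(sender);
}
```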


Mode Invocation: How Coordination Actually Unfolds

Now that messages are flowing through the admission pipeline, it is time to understand what happens when they reach the mode itself. Modes are the semantic heart of MACP — they define the structure of coordination. Is it a vote? A negotiation? A task delegation? A responsibility transfer? The mode decides.

How the runtime dispatches to modes

When an accepted envelope reaches the mode layer, the runtime looks up the mode by name in its registry and calls into it through a well-defined trait interface. The mode can authorize the sender, process the message, update its internal state, and optionally resolve the session.

flowchart LR
    A["Accepted\nEnvelope"] --> B["Mode Registry\nlookup by name"]
    B --> C["mode.authorize_sender\nenvelope, session"]
    C --> D["mode.on_message\nenvelope, session state"]
    D --> E{"ModeResponse"}
    E -->|NoOp| F["No state change"]
    E -->|PersistState| G["Update mode state"]
    E -->|Resolve| H["Session → RESOLVED"]
    E -->|PersistAndResolve| I["Update + Resolve"]

The Mode trait in the Rust runtime is deliberately minimal. Every mode must implement exactly three methods — session start handling, message handling, and sender authorization. This constraint ensures modes are predictable, testable, and composable:

trait Mode: Send + Sync {
    fn on_session_start(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn on_message(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn authorize_sender(&self, session: &Session, env: &Envelope)
        -> Result<(), MacpError>;
}

The five standards-track modes

MACP ships with five coordination modes, each designed for a different interaction pattern. They range from structured group decision-making to simple task delegation:

| Mode | Identifier | Participant Model | Determinism |
| --- | --- | --- | --- |
| Decision | macp.mode.decision.v1 | Declared | Semantic-deterministic |
| Proposal | macp.mode.proposal.v1 | Peer | Semantic-deterministic |
| Task | macp.mode.task.v1 | Orchestrated | Structural-only |
| Handoff | macp.mode.handoff.v1 | Delegated | Context-frozen |
| Quorum | macp.mode.quorum.v1 | Quorum | Semantic-deterministic |

Let us walk through each one, because the differences matter.

Decision Mode: our deployment strategy scenario

This is the mode our three agents are using. Decision mode provides structured choice among proposals with explicit evaluation, objection, and voting phases. It is the most ceremony-heavy mode, but that ceremony exists for a reason — when the stakes are high enough to warrant three agents deliberating, you want a clear audit trail of who proposed what, who evaluated it, and how the vote went.

stateDiagram-v2
    [*] --> Proposing: SessionStart
    Proposing --> Evaluating: Proposal(s) submitted
    Evaluating --> Voting: Evaluation(s) submitted
    Voting --> Committed: Commitment accepted
    Committed --> [*]

    note right of Proposing: Any participant submits proposals
    note right of Evaluating: Participants evaluate with APPROVE/REVIEW/BLOCK/REJECT
    note right of Voting: Participants vote APPROVE/REJECT/ABSTAIN
    note right of Committed: Initiator binds outcome

In our scenario, this is where things get interesting. The architect proposes blue-green deployment. The security reviewer evaluates it — APPROVE, with high confidence. The cost optimizer votes in favor. Here is the code:

// Decision Mode — TypeScript
await session.propose({ proposalId: 'p1', option: 'Blue-green deploy', rationale: 'Zero downtime' });
await session.evaluate({ proposalId: 'p1', recommendation: 'APPROVE', confidence: 0.9 });
await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
await session.commit({ action: 'decision.accepted', authorityScope: 'session', reason: 'Majority approved' });

Proposal Mode: when agents need to negotiate

Sometimes you do not want a formal vote — you want agents to negotiate. Proposal mode supports bounded negotiation with proposals and counterproposals. Think of it as a structured back-and-forth that must eventually converge or terminate.

stateDiagram-v2
    [*] --> Negotiating: SessionStart
    Negotiating --> Negotiating: Proposal / CounterProposal
    Negotiating --> Converged: Accept convergence
    Negotiating --> Rejected: Terminal Reject
    Converged --> Committed: Commitment
    Rejected --> Committed: Commitment
    Committed --> [*]

Task Mode: delegation with accountability

Task mode is the simplest interaction pattern: one agent requests work, another performs it. But even here, the protocol adds value — it tracks the task through acceptance, progress updates, and completion or failure, ensuring the requester always knows the current state.

stateDiagram-v2
    [*] --> Requested: SessionStart + TaskRequest
    Requested --> InProgress: TaskAccept
    Requested --> Unassigned: TaskReject
    InProgress --> Completed: TaskComplete
    InProgress --> Failed: TaskFail
    Completed --> Committed: Commitment
    Failed --> Committed: Commitment
    Committed --> [*]

Handoff Mode: passing the baton

When one agent needs to transfer responsibility to another — along with the context needed to continue — Handoff mode provides a structured transfer protocol. The offering agent can attach context, and the receiving agent explicitly accepts or declines.

stateDiagram-v2
    [*] --> Offered: SessionStart + HandoffOffer
    Offered --> Enriched: HandoffContext (optional)
    Enriched --> Accepted: HandoffAccept
    Enriched --> Declined: HandoffDecline
    Offered --> Accepted: HandoffAccept
    Offered --> Declined: HandoffDecline
    Accepted --> Committed: Commitment
    Declined --> Committed: Commitment
    Committed --> [*]

Quorum Mode: threshold-based approval

Quorum mode is for situations where you need a specific number of approvals from a pool of participants — think code review approvals, compliance sign-offs, or multi-party authorization. The mode tracks votes against a threshold and resolves when the threshold is met or becomes unreachable.

stateDiagram-v2
    [*] --> Voting: SessionStart + ApprovalRequest
    Voting --> Voting: Approve / Reject / Abstain
    Voting --> ThresholdMet: required_approvals reached
    Voting --> ThresholdUnreachable: remaining cannot reach threshold
    ThresholdMet --> Committed: Commitment (positive)
    ThresholdUnreachable --> Committed: Commitment (negative)
    Committed --> [*]
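The "threshold unreachable" transition is simple arithmetic: even if every remaining eligible voter approves, the count cannot reach the threshold. A sketch of that bookkeeping, treating abstentions as neutral (one of the policy options; the mode's actual code may differ):

```typescript
// Quorum bookkeeping sketch. Threshold is met when approvals reach
// `required`; unreachable when unanimous approval from the remaining
// voters still could not reach it. Abstentions are neutral here.
type QuorumStatus = "voting" | "threshold_met" | "threshold_unreachable";

function quorumStatus(
  approvals: number,
  rejections: number,
  abstentions: number,
  eligible: number,
  required: number,
): QuorumStatus {
  if (approvals >= required) return "threshold_met";
  const remaining = eligible - approvals - rejections - abstentions;
  if (approvals + remaining < required) return "threshold_unreachable";
  return "voting";
}
```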

The one rule that unifies all five modes

Across all five modes, there is a single invariant that matters more than any other: only a Commitment message resolves the session. Intermediate outcome messages — TaskComplete, HandoffAccept, Approve — make the session eligible for commitment but do not transition the session to RESOLVED.

This is a deliberate design choice, and it is worth pausing to understand why. In a world of autonomous agents, the protocol needs a single, unambiguous moment when the outcome becomes binding. By separating "the work is done" from "the outcome is committed," MACP gives the initiator (or the policy-designated authority) final say. It also creates a clean audit point: when you see a Commitment in the log, you know the session is over and the outcome is authoritative.


Governance Built In: Policy Application

Our architect agent chose policy.majority when starting the session. But what does that actually mean? How does the runtime enforce governance rules? This is where MACP's policy system comes in — declarative governance rules that constrain how modes operate, defined in RFC-MACP-0012.

A two-phase lifecycle: resolve, then evaluate

Policies have an elegant two-phase lifecycle. In the first phase, at session start, the runtime resolves the policy — looking it up in the registry, validating it, and binding it immutably to the session. In the second phase, when a Commitment message arrives, the runtime evaluates the policy against the session's accumulated history to decide whether the commitment is allowed.

sequenceDiagram
    participant Agent
    participant RT as Runtime
    participant PR as Policy Registry
    participant PE as Policy Evaluator

    Note over Agent,PE: Phase 1: Resolution at SessionStart
    Agent->>RT: SessionStart(policy_version="policy.majority")
    RT->>PR: Lookup "policy.majority"
    PR-->>RT: PolicyDescriptor (rules, schema_version)
    RT->>RT: Bind policy immutably to session

    Note over Agent,PE: Phase 2: Evaluation at Commitment
    Agent->>RT: Commitment(action, reason)
    RT->>PE: Evaluate(policy_rules, accepted_history, participants)
    PE->>PE: Pure function — no I/O, no wall-clock, no randomness
    PE-->>RT: PolicyDecision::Allow
    RT->>Agent: Ack(ok=true, session_state=RESOLVED)

How policy resolution works

The resolution process is straightforward but strict:

  1. Extract policy_version from SessionStartPayload
  2. If empty, resolve to policy.default (no additional constraints)
  3. If non-empty, look up in policy registry
  4. If not found, reject with UNKNOWN_POLICY_VERSION
  5. If mode mismatch (policy targets different mode), reject with INVALID_POLICY_DEFINITION
  6. Store resolved PolicyDescriptor on the session — immutable for its lifetime

Notice step 5: a policy designed for Quorum mode cannot be used in a Decision mode session. This prevents subtle configuration errors where governance rules do not match the coordination semantics.

What policies can control

Each mode exposes different policy knobs. The policy system is not one-size-fits-all — it adapts to the semantics of the mode it governs:

| Mode | Policy Controls |
| --- | --- |
| Decision | Voting algorithm (majority/supermajority/weighted), quorum requirements, objection veto thresholds, commitment authority |
| Proposal | Acceptance criteria (all_parties/counterparty/initiator), counter-proposal round limits, terminal rejection |
| Task | Allow reassignment on reject, require output on completion |
| Handoff | Implicit accept timeout, commitment authority |
| Quorum | Threshold override, abstention interpretation (neutral/implicit_reject/ignored) |

In our scenario, policy.majority tells the Decision mode that a simple majority of votes is enough for the architect to commit the outcome. If the security reviewer had wanted veto power, a different policy — perhaps one with objection thresholds or supermajority requirements — would have been needed. The point is that these governance decisions are made explicitly at session creation, not implicitly during coordination.

The determinism guarantee

This is perhaps the most important property of the policy system. Policy evaluation is a pure function of:

  • The resolved policy rules (immutable for the session)
  • The accumulated accepted message history
  • The session's declared participants

It MUST NOT depend on wall-clock time, external services, randomness, or any state outside the session boundary. This ensures that policy decisions are identical during replay. If you replay a session and the same history leads to a different policy decision, something is deeply wrong.
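To make the purity requirement concrete, here is a sketch of what a majority-policy evaluator looks like when it depends only on the three inputs listed above. The message and rule shapes are illustrative assumptions, not the runtime's actual types.

```python
# Non-normative sketch: policy evaluation as a pure function.
# Inputs: immutable rules, accepted history, declared participants — nothing else.
def evaluate_commitment(rules: dict, history: list, participants: list) -> str:
    """Return 'ALLOW' or 'DENY'. No clock, no I/O, no randomness."""
    votes = {}
    for msg in history:                             # last vote per participant wins
        if msg["message_type"] == "Vote" and msg["sender"] in participants:
            votes[msg["sender"]] = msg["payload"]["vote"]
    approvals = sum(1 for v in votes.values() if v == "APPROVE")
    fraction = rules.get("threshold", 0.5)          # majority default; supermajority via rules
    return "ALLOW" if approvals > len(participants) * fraction else "DENY"
```

Because every input is part of the session's recorded state, replaying the same history necessarily reproduces the same decision.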


The Wire: Message Flow and Results

We have talked about what messages mean. Now let us talk about how they travel. Every MACP message — proposals, votes, commitments, signals — is wrapped in a canonical Envelope (RFC-MACP-0001). The Envelope is the universal container that carries any message type through the system.

The Envelope: one format to carry them all

message Envelope {
  string macp_version = 1;     // Protocol version (e.g., "2026-03-02")
  string mode = 2;             // Empty for ambient signals
  string message_type = 3;     // Discriminator (e.g., "Proposal", "Vote")
  string message_id = 4;       // Unique ID for idempotency
  string session_id = 5;       // Empty for ambient signals
  string sender = 6;           // Authenticated identity (runtime-derived)
  int64 timestamp_unix_ms = 7; // Informational timestamp
  bytes payload = 8;           // Mode-defined content (protobuf-encoded)
}

The design here is worth appreciating. The Envelope separates routing information (session, mode, sender) from content (payload). The message_id enables idempotent delivery — send the same message twice, and the runtime will deduplicate. The sender field, as we discussed, is always runtime-derived from authentication, never self-asserted.

The Send/Ack cycle: truth is authoritative

The primary message pattern is deceptively simple — unary Send followed by Ack — but the semantics are precise:

sequenceDiagram
    participant A as Agent A
    participant RT as Runtime
    participant B as Agent B (streaming)

    A->>RT: Send(Envelope) — unary gRPC
    RT->>RT: Admission pipeline<br/>(auth → validate → dedup → append)
    RT->>A: Ack(ok=true, session_state, accepted_at)

    RT->>B: StreamSession: Accepted Envelope (in order)

    Note over A,B: Ack is authoritative per-message.<br/>StreamSession delivers accepted<br/>envelopes in order to subscribers.

The Ack is the runtime's authoritative verdict on a message. It tells the sender not just whether the message was accepted, but the current session state after processing:

| Field | Description |
| --- | --- |
| ok | Whether the message was accepted |
| duplicate | Whether the message_id was already seen |
| message_id | Reference to the sent message |
| session_id | Session context |
| accepted_at_unix_ms | Server-side acceptance timestamp |
| session_state | Current session state after processing |
| error | Error details if ok=false |
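The idempotency contract implied by the duplicate field can be sketched as follows. This is a simplified stand-in for the runtime's admission step, with illustrative dict shapes whose keys mirror the Ack fields above.

```python
# Non-normative sketch of idempotent admission: the same message_id is
# acknowledged again but never appended to the log twice.
def admit(session: dict, envelope: dict) -> dict:
    if envelope["message_id"] in session["seen_ids"]:
        return {"ok": True, "duplicate": True,
                "message_id": envelope["message_id"],
                "session_state": session["state"]}
    session["seen_ids"].add(envelope["message_id"])
    session["log"].append(envelope)                 # append-only, authoritative order
    return {"ok": True, "duplicate": False,
            "message_id": envelope["message_id"],
            "session_state": session["state"]}
```

A retrying sender can therefore resend safely: the second Ack reports duplicate=true and the log is unchanged.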

Streaming: watching coordination unfold

The StreamSession RPC provides a bidirectional gRPC stream bound to a single session. Subscribers receive accepted envelopes in authoritative order — the order the runtime accepted them, which is the canonical ordering for replay and audit.

// TypeScript — streaming
const stream = client.openStream({ auth: Auth.bearer('observer-token') });

// Send via stream
await stream.send(envelope);

// Receive accepted envelopes
for await (const received of stream.responses()) {
  console.log(received.messageType, received.sender);
  // Process in acceptance order
}

# Python — streaming
stream = client.open_stream()

stream.send(envelope)

for envelope in stream.responses(timeout=30.0):
    print(f"{envelope.message_type} from {envelope.sender}")

Client-side projections: making sense of the stream

Raw envelopes are useful, but agents usually want to know higher-level things: "How many votes does my proposal have? Is there a majority winner? Has anyone raised a blocking objection?" Both SDKs maintain client-side projections — pure state machines that track accepted envelopes and derive higher-level state locally:

// After voting
const totals = session.projection.voteTotals();
// { 'proposal-1': 3, 'proposal-2': 1 }

const winner = session.projection.majorityWinner();
// 'proposal-1'

const blocking = session.projection.hasBlockingObjection('proposal-1');
// false

These projections are "pure" in the functional programming sense — given the same sequence of accepted envelopes, they always produce the same state. This makes them safe for use in agent decision logic, because the agent's view of the session is always consistent with the runtime's authoritative ordering.
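A projection in this sense is just a fold over the accepted envelope sequence. The sketch below shows the idea with illustrative payload shapes; it is not the SDKs' actual projection code.

```python
# Non-normative sketch: a projection is a pure fold over accepted envelopes.
# Same input sequence -> same derived state, which is what makes it replay-safe.
def vote_totals(accepted: list) -> dict:
    totals = {}
    for env in accepted:
        if env["message_type"] == "Vote" and env["payload"]["vote"] == "APPROVE":
            pid = env["payload"]["proposal_id"]
            totals[pid] = totals.get(pid, 0) + 1
    return totals

def majority_winner(totals: dict, participant_count: int):
    for pid, count in totals.items():
        if count > participant_count / 2:
            return pid
    return None
```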


The Other Channel: Ambient Signals

Not everything in a multi-agent system is coordination. Sometimes agents need to broadcast status updates, heartbeats, or progress reports without binding them to a session outcome. MACP separates these two concerns into distinct planes (RFC-MACP-0001):

flowchart TB
    subgraph Ambient["Ambient Plane"]
        direction LR
        S1["Agent A"] -->|"Signal\n(session_id='', mode='')"| Bus["Signal Bus"]
        Bus --> Sub1["Subscriber 1"]
        Bus --> Sub2["Subscriber 2"]
    end

    subgraph Coordination["Coordination Plane"]
        direction LR
        M1["Agent A"] -->|"Envelope\n(session_id='abc', mode='decision.v1')"| Session["Session abc"]
        Session --> Log["Append-only Log"]
    end

    Ambient ~~~ Coordination

    style Ambient fill:#1a1a2e,stroke:#4a9eff
    style Coordination fill:#1a1a2e,stroke:#9f7aea

The separation is deliberate and important. Coordination messages enter a durable, ordered log and can affect session state. Signals do neither — they are ephemeral, non-binding, and broadcast to whoever is listening.

Signal semantics

The rules for signals are defined by what they cannot do:

  • Signals carry empty session_id and empty mode
  • Signals are non-binding — they MUST NOT create sessions, mutate session state, or produce binding outcomes
  • Signals are ephemeral — they are not required to enter durable replay history
  • Signals may include a correlation_session_id in their payload for informational cross-referencing
  • Signals are broadcast via the WatchSignals RPC to all subscribers
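The plane an envelope belongs to is therefore fully determined by its routing fields. A minimal sketch of that dispatch decision (illustrative dict shapes):

```python
# Non-normative sketch: routing by plane, per the rules above.
def route(envelope: dict) -> str:
    """Return which plane handles this envelope."""
    if envelope.get("session_id", "") == "" and envelope.get("mode", "") == "":
        return "ambient"        # broadcast via the signal bus, never session history
    return "coordination"       # enters the session's append-only log
```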

Signal types

The SignalPayload is intentionally flexible:

  • signal_type — Discriminator (e.g., "heartbeat", "status_update")
  • data — Arbitrary payload bytes
  • confidence — Optional confidence score
  • correlation_session_id — Optional session cross-reference (does NOT make the signal session-scoped)

Progress signals

One particularly useful signal type is ProgressPayload, designed for reporting work progress back to observers:

  • progress_token — Identifies the progress stream
  • progress / total — Numeric progress indicators
  • message — Human-readable status
  • target_message_id — Which message this progress relates to

In our deployment scenario, while the cost optimizer is evaluating proposals, it might broadcast progress signals: "Analyzing infrastructure costs... 40% complete." These signals let the UI show progress without polluting the coordination log with non-binding chatter.


Bridging Two Worlds: Control Plane and Runtime Interaction

We have seen the runtime's perspective (sessions, envelopes, modes) and the SDK's perspective (typed clients, projections). Now let us look at how the control plane bridges these two worlds — taking raw gRPC events from the runtime and transforming them into something a human operator can watch, query, and replay.

The event pipeline

Every event from the runtime passes through a normalization and projection pipeline before reaching the UI. This pipeline is where raw protocol events become meaningful operational data:

sequenceDiagram
    participant RT as MACP Runtime
    participant SC as Stream Consumer
    participant EN as Event Normalizer
    participant ES as Event Service
    participant PS as Projection Service
    participant MS as Metrics Service
    participant SH as Stream Hub
    participant UI as UI Client

    RT->>SC: Accepted envelope (gRPC stream)
    SC->>EN: Raw runtime event
    EN->>EN: Normalize to canonical event
    EN->>ES: Canonical event
    ES->>ES: Allocate sequence number (transactional)
    ES->>ES: Persist raw + canonical (atomic write)
    ES->>PS: Apply event to projection
    PS->>PS: Build RunStateProjection
    PS->>PS: Persist to run_projections
    ES->>MS: Record metrics (tokens, costs, counts)
    ES->>SH: Publish event
    SH->>UI: SSE (canonical_event)

Canonical event types

Events are normalized into a standard taxonomy. This normalization is what makes the control plane's UI possible — instead of dealing with raw Protobuf envelopes, the UI works with a clean, categorized event stream:

| Category | Event Types |
| --- | --- |
| Run lifecycle | run.created, run.started, run.completed, run.failed, run.cancelled |
| Session | session.bound, session.stream.opened, session.state.changed |
| Participants | participant.seen |
| Messages | message.sent, message.received, message.send_failed |
| Signals | signal.emitted |
| Coordination | proposal.created, decision.proposed, decision.finalized |
| Tools | tool.called, tool.completed |
| Policy | policy.resolved, policy.commitment.evaluated, policy.denied |
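The normalization step can be sketched as a lookup from raw runtime message types into the taxonomy above. The mapping and field names here are illustrative assumptions, not the control plane's actual normalizer.

```python
# Non-normative sketch of event normalization: raw message types are mapped
# into the canonical taxonomy; unknown types fall back to a generic event.
CANONICAL_BY_MESSAGE_TYPE = {
    "Proposal": "proposal.created",
    "Commitment": "decision.finalized",
    "Signal": "signal.emitted",
}

def normalize(raw: dict, sequence: int) -> dict:
    event_type = CANONICAL_BY_MESSAGE_TYPE.get(raw["message_type"], "message.received")
    return {
        "sequence": sequence,       # allocated transactionally by the event service
        "type": event_type,
        "run_id": raw.get("session_id"),
        "sender": raw.get("sender"),
        "raw": raw,                 # raw and canonical are persisted together
    }
```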

The RunStateProjection: a real-time read model

The projection engine builds a comprehensive read model from the event stream. This projection is what powers the control plane's UI — a single query returns the complete current state of a run:

interface RunStateProjection {
  run: RunSummaryProjection;           // Status, timing, mode
  participants: ParticipantProjection[]; // Activity per participant
  graph: GraphProjection;               // Message dependency graph
  decision: DecisionProjection;         // Decision-specific state
  signals: SignalProjection;            // Signal summary
  progress: ProgressProjection;         // Progress tracking
  timeline: TimelineProjection;         // Chronological events
  trace: TraceSummary;                  // Distributed trace info
  outboundMessages: OutboundMessageSummary;
  policy: PolicyProjection;            // Policy resolution status
}

Circuit breaker: failing gracefully

The runtime provider implements a circuit breaker pattern — a nod to the reality that distributed systems fail. If the runtime becomes unreachable, the circuit opens and rejects new requests immediately rather than waiting for timeouts. This prevents cascading failures: a slow runtime should not make the control plane slow; it should make the control plane fast at returning errors. The circuit resets after a configurable cooldown.
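The pattern is straightforward to sketch. This is a generic circuit breaker under assumed threshold and cooldown parameters, not the control plane's actual implementation; the error string matches the CIRCUIT_BREAKER_OPEN code listed later in the error tables.

```python
# Non-normative sketch: fail fast while open, retry after a cooldown.
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0, clock=time.monotonic):
        self.threshold, self.cooldown_s, self.clock = threshold, cooldown_s, clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise RuntimeError("CIRCUIT_BREAKER_OPEN")  # no timeout wait
            self.opened_at = None       # cooldown elapsed: half-open, try again
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0               # success closes the circuit fully
        return result
```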

SSE streaming to clients

UI consumers connect via GET /runs/:id/stream (Server-Sent Events), and the experience is designed for real-time watching:

  1. On connect: receive a snapshot event with the full current RunStateProjection
  2. As events occur: receive canonical_event messages in real-time
  3. On disconnect: automatic reconnection with Last-Event-ID header for resumption

That snapshot-on-connect pattern is worth noting. A UI that connects mid-session does not have to replay the entire event history — it gets the current projection immediately, then stays in sync via the event stream.
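The snapshot-then-stream shape can be expressed as a simple generator. This is an illustrative sketch of the contract, not the control plane's SSE handler.

```python
# Non-normative sketch of snapshot-on-connect: one snapshot event first,
# then canonical events as they arrive, so late subscribers never replay history.
def subscribe(projection: dict, live_events):
    yield {"event": "snapshot", "data": projection}
    for ev in live_events:
        yield {"event": "canonical_event", "data": ev}
```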


Watching Everything: Observability

A coordination system that you cannot observe is a coordination system you cannot trust. MACP provides observability at every layer, from Rust-level structured logging to distributed traces that span from the UI through the control plane into the runtime.

flowchart LR
    subgraph Runtime["Runtime — Rust"]
        RL["tracing crate\nstructured logs"]
        RM["metrics.rs\nper-mode counters"]
        RO["OpenTelemetry\n(optional otel feature)"]
    end

    subgraph ControlPlane["Control Plane — NestJS"]
        CL["pino\nstructured JSON logs"]
        CM["prom-client\nPrometheus metrics"]
        CO["OpenTelemetry\nNode SDK"]
        CT["TraceService\nmanual spans"]
    end

    subgraph Endpoints["API Endpoints"]
        E1["GET /runs/:id/traces"]
        E2["GET /runs/:id/metrics"]
        E3["GET /runs/:id/artifacts"]
    end

    RL --> CL
    RM --> CM
    RO --> CO
    CO --> CT
    CT --> E1
    CM --> E2
    E3

Runtime observability

The Rust runtime uses the tracing crate for structured logging, controlled by the RUST_LOG environment variable. Every significant event is logged with structured fields:

  • Session creation: session_id, mode, sender
  • Message acceptance: session_id, message_type, sender, resulting state
  • Session resolution/expiry: session_id, mode
  • Auth failures and rate limit violations
  • Storage warnings during crash recovery

Per-mode metrics are tracked as atomic counters — lightweight enough to leave on in production:

  • sessions_started / sessions_resolved / sessions_expired / sessions_cancelled
  • messages_accepted / messages_rejected
  • commitments_accepted / commitments_rejected
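In Python terms, per-mode counters like these amount to keyed atomic increments. A minimal sketch (using a lock where the Rust runtime uses atomics; names are illustrative):

```python
# Non-normative sketch of per-mode counters, cheap enough to leave on.
import threading
from collections import defaultdict

class ModeMetrics:
    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)       # (mode, counter name) -> count

    def incr(self, mode: str, name: str) -> None:
        with self._lock:
            self._counters[(mode, name)] += 1

    def get(self, mode: str, name: str) -> int:
        with self._lock:
            return self._counters[(mode, name)]
```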

OpenTelemetry support (enabled via the otel cargo feature) provides distributed tracing:

  • OTLP exporter configured via OTEL_EXPORTER_OTLP_ENDPOINT
  • Batch export integrated with tokio runtime
  • Trace context propagated via gRPC metadata

Control plane observability

The NestJS control plane adds its own observability layer:

Structured logging via pino with JSON output — machine-parseable, grep-friendly.

Prometheus metrics via prom-client, exposed for scraping by your existing monitoring infrastructure.

OpenTelemetry integration ties everything together:

  • Node SDK with auto-instrumentations
  • TraceService for manual span management
  • W3C Trace Context propagation (traceId flows from UI to Control Plane to Runtime)
  • Per-run traces accessible via GET /runs/:id/traces

Per-run metrics persisted in the run_metrics table provide granular accounting:

  • Event, message, and signal counts
  • Token usage extracted from event payloads
  • Estimated cost via model pricing lookup

Audit events

Both layers log security-relevant events per RFC-MACP-0004. In a world of autonomous agents, audit is not a nice-to-have — it is how you answer questions like "who tried to impersonate the security reviewer at 3 AM?":

  • Authentication failures
  • Authorization failures
  • Duplicate message rejections
  • Terminal state transitions
  • Cancellation events
  • Rate limit violations

When Things Go Wrong: Error Handling

Distributed systems fail. Agents send invalid messages. Networks partition. Runtimes crash. MACP does not pretend otherwise — it provides structured, consistent error handling across every layer, rooted in a shared taxonomy from the error code registry.

Runtime error codes

The runtime returns precise, actionable error codes. Notice how each code maps to an HTTP status, making it straightforward to surface errors in REST APIs:

| Code | HTTP | When |
| --- | --- | --- |
| UNAUTHENTICATED | 401 | Authentication failed or missing |
| FORBIDDEN | 403 | Authenticated but not authorized |
| SESSION_NOT_FOUND | 404 | Session ID doesn't exist |
| SESSION_NOT_OPEN | 409 | Session is RESOLVED or EXPIRED |
| DUPLICATE_MESSAGE | 409 | message_id already accepted in session |
| SESSION_ALREADY_EXISTS | 409 | SessionStart for existing session_id |
| INVALID_ENVELOPE | 400 | Envelope validation failed |
| UNSUPPORTED_PROTOCOL_VERSION | 400 | No mutual protocol version |
| MODE_NOT_SUPPORTED | 400 | Mode not available or not registered |
| INVALID_SESSION_ID | 400 | Session ID format invalid |
| PAYLOAD_TOO_LARGE | 413 | Exceeds maximum payload size |
| RATE_LIMITED | 429 | Too many requests from this sender |
| UNKNOWN_POLICY_VERSION | 404 | Policy not found in registry |
| POLICY_DENIED | 403 | Commitment rejected by governance rules |
| INVALID_POLICY_DEFINITION | 400 | Policy fails schema validation |
| INTERNAL_ERROR | 500 | Unrecoverable runtime error |

Control plane error codes

The control plane adds its own error codes for orchestration-level failures — things the runtime does not know about because they happen in the layer above it:

| Code | When |
| --- | --- |
| RUN_NOT_FOUND | Run ID doesn't exist |
| INVALID_STATE_TRANSITION | Cannot transition run to requested state |
| RUNTIME_UNAVAILABLE | Cannot connect to runtime |
| RUNTIME_TIMEOUT | gRPC deadline exceeded |
| STREAM_EXHAUSTED | Max stream reconnection retries exceeded |
| SESSION_EXPIRED | Runtime session expired during run |
| KICKOFF_FAILED | Initial kickoff message rejected |
| MODE_NOT_SUPPORTED | Requested mode not available on runtime |
| CIRCUIT_BREAKER_OPEN | Runtime circuit breaker is open |
| MESSAGE_SEND_FAILED | Mid-session message send failed |

SDK exception hierarchy

Both SDKs wrap these error codes in typed exception hierarchies that make error handling in agent code clean and idiomatic.

TypeScript:

MacpSdkError (base)
├── MacpTransportError    — gRPC connection failure
├── MacpAckError          — Runtime NACK (carries ack.error.code)
├── MacpSessionError      — Session state violation
├── MacpTimeoutError      — Deadline exceeded
└── MacpRetryError        — All retries exhausted

Python:

MacpSdkError (base)
├── MacpAckError          — Runtime NACK (carries AckFailure)
├── MacpSessionError      — Session state violation
└── MacpTransportError    — gRPC failure
    ├── MacpTimeoutError  — Deadline exceeded
    └── MacpRetryError    — Retries exhausted

Error handling in practice

Back in our scenario, what happens if the cost optimizer tries to vote after the session has already been resolved? Here is how the SDKs handle it:

// TypeScript
try {
  await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
} catch (err) {
  if (err instanceof MacpAckError) {
    // Runtime rejected the message
    console.error(err.ack.error?.code); // 'SESSION_NOT_OPEN', 'FORBIDDEN', etc.
  } else if (err instanceof MacpTransportError) {
    // gRPC connection issue — may be retryable
  } else if (err instanceof MacpTimeoutError) {
    // Deadline exceeded
  }
}

# Python
try:
    session.vote(proposal_id="p1", vote="APPROVE")
except MacpAckError as e:
    print(f"Rejected: {e.failure.code}")  # 'SESSION_NOT_OPEN', etc.
except MacpTransportError:
    print("Connection failed")
except MacpTimeoutError:
    print("Deadline exceeded")

The error hierarchy is designed so that the most specific exceptions are caught first. A MacpAckError means the runtime understood the message but rejected it — you need to look at the error code to decide what to do. A MacpTransportError means the message may not have reached the runtime at all — retrying might make sense. This distinction matters for building resilient agent logic.
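The retry decision this enables can be sketched as follows. The exception classes below are minimal local stand-ins for the SDK types of the same name (defined here only so the sketch is self-contained); the retry loop itself is illustrative.

```python
# Non-normative sketch: retry transport failures (the runtime may never have
# seen the message), but surface a NACK immediately — it is a definitive verdict.
class MacpAckError(Exception):          # stand-in for the SDK class
    def __init__(self, code):
        self.code = code

class MacpTransportError(Exception):    # stand-in for the SDK class
    pass

def send_with_retry(send, attempts: int = 3):
    for attempt in range(attempts):
        try:
            return send()
        except MacpAckError:
            raise                       # runtime rejected it: retrying cannot help
        except MacpTransportError:
            if attempt == attempts - 1:
                raise                   # out of retries
```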


Rewinding Time: Replay and Determinism

We saved one of the most powerful features for near the end, because replay only makes sense once you understand everything that came before it.

MACP provides a structural replay guarantee: replaying identical accepted Envelope sequences under identical bound versions MUST reproduce identical state transitions (RFC-MACP-0003). This is not an aspirational goal — it is an invariant enforced by every design decision we have discussed so far: immutable version binding, pure policy evaluation, authoritative ordering, and runtime-derived sender identity.

The determinism boundary

The replay engine takes a small, well-defined set of inputs and guarantees that a specific set of outputs will be identical:

flowchart LR
    subgraph Inputs["Deterministic Inputs"]
        H["Accepted Envelope\nSequence"]
        MV["mode_version"]
        CV["configuration_version"]
        PV["policy_version"]
        PRT["macp_version"]
    end

    subgraph Outputs["Guaranteed Identical"]
        ST["State transitions"]
        AD["Accept/reject decisions"]
        TS["Terminal state\nRESOLVED or EXPIRED"]
        TM["Terminal message"]
    end

    Inputs --> F["Deterministic\nReplay Engine"]
    F --> Outputs

What is guaranteed

Given identical accepted envelope history and identical bound versions:

  • Session lifecycle transitions are identical
  • Within-session acceptance order is identical
  • Idempotent duplicate handling is identical
  • Terminal state (RESOLVED/EXPIRED) and terminal message are identical

What is NOT guaranteed — and why that is fine

Not everything can or should be deterministic. The protocol is honest about its boundaries:

  • Semantic outcomes — Mode-defined results (e.g., Task mode is structural-only; external side effects may differ)
  • Error message text — May vary between runtime versions
  • Cross-session ordering — Only within-session order is deterministic
  • External side effects — Application responsibility

Determinism classes by mode

Each mode has a determinism class that tells you exactly what replay guarantees you get:

| Mode | Class | Meaning |
| --- | --- | --- |
| Decision | Semantic-deterministic | Same history + versions = same outcome |
| Proposal | Semantic-deterministic | Same history + versions = same outcome |
| Task | Structural-only | State transitions guaranteed; execution results may differ |
| Handoff | Context-frozen | Deterministic only if bound context replayed exactly |
| Quorum | Semantic-deterministic | Same ballots + threshold = same quorum state |

Our deployment decision scenario uses Decision mode, which is semantic-deterministic. If you replay the same proposals, evaluations, and votes under the same mode, configuration, and policy versions, the outcome will always be "blue-green deploy, approved by majority." Always.

TTL determinism

Even time-based expiration is deterministic. Session TTL is computed from the SessionStart envelope's timestamp_unix_ms:

expiration = SessionStart.timestamp_unix_ms + ttl_ms

During replay, the pre-computed deadline from the original session is used — never wall-clock time. If TTL elapsed before a terminal condition was accepted, the session is EXPIRED. This means you can replay a session that originally ran for two minutes in two seconds, and the expiration logic still behaves correctly.
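The expiration formula is small enough to state directly. A sketch, with timestamps in milliseconds as in the Envelope:

```python
# Non-normative sketch of deterministic TTL: the deadline derives from the
# SessionStart envelope's own timestamp, never from wall-clock time.
def is_expired(session_start_ts_ms: int, ttl_ms: int, envelope_ts_ms: int) -> bool:
    """True if the envelope falls at or after the pre-computed deadline."""
    return envelope_ts_ms >= session_start_ts_ms + ttl_ms
```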

Replay via the control plane

The control plane supports three replay modes, each useful for different scenarios:

| Mode | Behavior |
| --- | --- |
| instant | All events emitted immediately |
| timed | Events replayed with proportional inter-event timing (speed multiplier supported) |
| step | Events emitted one at a time on request |

POST /runs/:id/replay        — Start replay session
GET  /runs/:id/replay/stream — SSE of replayed events
GET  /runs/:id/replay/state  — Projection at specific sequence number

The timed mode is particularly useful for post-mortems — you can watch a coordination session unfold at 10x speed, seeing exactly when each agent acted and how long deliberation took. The step mode is a debugger's best friend: advance one event at a time and inspect the projection at each step.
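The timing math behind timed mode is simple: scale each original inter-event gap by the speed multiplier. A sketch under that assumption:

```python
# Non-normative sketch of 'timed' replay scheduling: delays between emitted
# events are the original gaps divided by the speed multiplier.
def replay_delays_ms(timestamps_ms: list, speed: float) -> list:
    """Delay to wait before emitting each event; the first is immediate."""
    delays = [0.0]
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        delays.append((cur - prev) / speed)
    return delays
```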

Storage and crash recovery

Here is an elegant detail: the same determinism guarantee that serves replay also serves crash recovery. The runtime uses an append-only log per session. On startup, sessions are recovered by replaying their logs through the same deterministic state machine. Crash recovery is just replay with a different trigger.

Storage backends:

  • FileBackend — session.json + log.jsonl per session (default)
  • RocksDB — Embedded key-value store
  • Redis — Shared storage for multi-instance deployments
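Recovery from the file backend amounts to folding the per-session log through the state machine. The sketch below assumes one JSON envelope per line (as log.jsonl suggests) and takes the apply function as a parameter; both shapes are illustrative.

```python
# Non-normative sketch: crash recovery is replay with a different trigger.
import json

def recover_session(log_jsonl: str, apply):
    """Rebuild session state by replaying every logged envelope in order."""
    state = {"state": "OPEN", "seen_ids": set()}
    for line in log_jsonl.splitlines():
        if line.strip():
            state = apply(state, json.loads(line))
    return state
```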

Putting It All Together

Let us return one last time to our three agents — the architect, the security reviewer, and the cost optimizer — and trace their deployment decision through the entire system, from the first connection to the final commit.

sequenceDiagram
    participant UI as API Consumer
    participant CP as Control Plane
    participant RT as Runtime
    participant A as Agent A (SDK)
    participant B as Agent B (SDK)

    Note over UI,B: 1. Initialization
    A->>RT: Initialize
    B->>RT: Initialize
    UI->>CP: POST /runs (ExecutionRequest)

    Note over UI,B: 2. Session Creation
    CP->>RT: SessionStart (gRPC stream)
    RT->>RT: Validate, create session OPEN
    RT-->>CP: Ack (session bound)
    CP->>RT: Kickoff messages

    Note over UI,B: 3. Coordination
    A->>RT: Proposal
    RT-->>A: Ack
    RT-->>CP: Accepted envelope (stream)
    CP-->>UI: SSE canonical_event
    B->>RT: Vote
    RT-->>B: Ack
    RT-->>CP: Accepted envelope (stream)

    Note over UI,B: 4. Resolution
    A->>RT: Commitment
    RT->>RT: Policy evaluation (pure function)
    RT->>RT: Session → RESOLVED
    RT-->>A: Ack (session_state=RESOLVED)
    RT-->>CP: Session resolved
    CP->>CP: Normalize + project + persist
    CP-->>UI: SSE run.completed

    Note over UI,B: 5. Observability
    UI->>CP: GET /runs/:id/state
    CP-->>UI: RunStateProjection
    UI->>CP: GET /runs/:id/traces
    CP-->>UI: OpenTelemetry spans

The agents connected and negotiated capabilities. The control plane opened a session with majority voting policy. The architect proposed blue-green deployment. The security reviewer evaluated it favorably. The cost optimizer voted to approve. The architect committed the decision, the runtime evaluated the majority policy and confirmed it, and the session resolved.

Every step was authenticated. Every message was validated through the admission pipeline. Every event was persisted, normalized, and projected for the UI. The entire session can be replayed — instantly, at speed, or step by step — and will always produce the same outcome. Distributed traces connect every span from the UI through the control plane into the runtime. Audit logs capture every authentication attempt, every authorization decision, every state transition.

This is MACP's core promise: when autonomous agents need to produce one binding outcome, the protocol specification defines the rules, the runtime enforces them, the control plane orchestrates and observes, and the SDKs give agents a typed interface to participate. All four layers working together to turn a chaotic multi-agent conversation into a structured, auditable, replayable coordination process with a single authoritative result.
