MACP End-to-End Flow

Comprehensive walkthrough of how all MACP components work together — from protocol specification through Rust runtime, NestJS control plane, and TypeScript/Python SDKs

Status: Non-normative (explanatory). In case of conflict, the referenced RFCs are authoritative.

References: RFC-MACP-0001 | RFC-MACP-0002 | RFC-MACP-0003 | RFC-MACP-0004 | RFC-MACP-0005 | RFC-MACP-0006 | RFC-MACP-0012

Imagine three AI agents sitting around a virtual table. One is an architect, another a security reviewer, and the third a cost optimizer. They need to agree on a deployment strategy for a critical Q3 release. Each has its own expertise, its own biases, and its own definition of "good enough." Left to their own devices, they might negotiate forever, contradict each other, or worse — two of them might reach one conclusion while the third reaches another.

This is the problem MACP solves. And this document is the story of how it solves it — tracing a coordination request from the moment an agent first introduces itself, through session creation, deliberation, voting, resolution, and all the way to replay and audit. Along the way, we will meet every layer of the system and understand not just what it does, but why it was designed that way.


The Four Layers: Why Architecture Matters

Think of MACP as a layered system built around a single conviction: when autonomous agents need to produce one binding outcome, the rules of engagement cannot be left to convention. They must be enforced.

At the bottom sits a protocol specification — twelve RFCs that define exactly what "coordination" means. Above that, a Rust runtime acts as the impartial referee, enforcing every rule the spec defines. The control plane adds orchestration and observability, turning raw protocol events into something a human operator can watch in real-time. And at the top, TypeScript and Python SDKs give agent developers a typed, ergonomic interface so they never have to think about envelope serialization or gRPC plumbing.

flowchart TB
    subgraph Clients["Agent Layer"]
        TS["TypeScript SDK"]
        PY["Python SDK"]
        UI["UI / API Consumer"]
    end

    subgraph CP["Control Plane — NestJS"]
        API["REST API + SSE"]
        Executor["Run Executor"]
        Normalizer["Event Normalizer"]
        Projection["Projection Engine"]
        DB[(PostgreSQL)]
    end

    subgraph RT["MACP Runtime — Rust"]
        Kernel["Coordination Kernel"]
        Modes["Mode Registry"]
        Policy["Policy Evaluator"]
        Storage["Storage Backend\nfile / rocksdb / redis"]
    end

    subgraph Spec["Protocol Specification"]
        RFCs["RFCs 0001–0012"]
        Schemas["Protobuf + JSON Schemas"]
        Registries["Mode · Policy · Error Registries"]
    end

    UI -->|"HTTP / SSE"| API
    API --> Executor
    Executor -->|"gRPC bidirectional stream"| Kernel
    Normalizer --> Projection --> DB
    TS -->|"gRPC"| Kernel
    PY -->|"gRPC"| Kernel
    Kernel --> Modes --> Policy --> Storage
    Spec -.->|"defines contracts for"| RT
    Spec -.->|"defines contracts for"| CP
    Spec -.->|"defines contracts for"| TS
    Spec -.->|"defines contracts for"| PY

This layered design also means there are two distinct ways to use the system, depending on how much infrastructure you want in the loop:

  • SDK-direct — Agents connect straight to the runtime via gRPC and manage their own session lifecycle. This is lightweight, fast, and requires no control plane at all. It is ideal for agent-to-agent coordination where no human needs to watch what is happening.
  • Control-plane-mediated — A UI or API consumer submits an ExecutionRequest to the control plane, which opens a runtime session on the agents' behalf, sends kickoff messages, streams every event through a normalization pipeline, and builds real-time projections for the UI. This is the path you take when observability, audit, and replay matter.

Both patterns use the same runtime, the same protocol, and the same Protobuf wire format defined in RFC-MACP-0006. The control plane is an addition, not a replacement.


Introducing Themselves: Agent Creation and Registration

Before any coordination can happen, agents need to introduce themselves. In the physical world, you would exchange business cards. In MACP, agents publish a manifest — a structured declaration of who they are, what they can do, and how to reach them.

What goes into a manifest

An agent manifest (RFC-MACP-0005) answers a handful of essential questions: What is your name? What can you do? What coordination modes do you support? What data formats do you speak?

| Field | Required | Description |
| --- | --- | --- |
| agent_id | Yes | Unique identifier |
| title | Yes | Human-readable name |
| description | Yes | What this agent does |
| supported_modes | Yes | Array of mode identifiers the agent can participate in |
| input_content_types | Yes | MIME types the agent accepts |
| output_content_types | Yes | MIME types the agent produces |
| transport_endpoints | No | Array of { transport, uri, content_types } |
| metadata | No | Arbitrary key-value pairs |

Think of our architect agent: its manifest might declare support for macp.mode.decision.v1 and macp.mode.proposal.v1, accept application/json input, and produce application/json output. The security reviewer might support the same modes but also list macp.mode.quorum.v1 — because in its world, some decisions require a formal approval threshold.
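To make the field table concrete, here is what the architect's manifest might look like as a TypeScript value. The `AgentManifest` interface below is a local sketch derived from the table above, not a type exported by the SDK:

```typescript
// Illustrative manifest for the architect agent, mirroring the field
// table above. These type definitions are a local sketch, not the SDK's.
interface TransportEndpoint {
  transport: string;
  uri: string;
  content_types: string[];
}

interface AgentManifest {
  agent_id: string;
  title: string;
  description: string;
  supported_modes: string[];
  input_content_types: string[];
  output_content_types: string[];
  transport_endpoints?: TransportEndpoint[];
  metadata?: Record<string, string>;
}

const architectManifest: AgentManifest = {
  agent_id: "architect-agent",
  title: "Deployment Architect",
  description: "Proposes and evaluates deployment strategies",
  supported_modes: ["macp.mode.decision.v1", "macp.mode.proposal.v1"],
  input_content_types: ["application/json"],
  output_content_types: ["application/json"],
  transport_endpoints: [
    // Endpoint URI is hypothetical.
    { transport: "grpc", uri: "architect.internal:50051", content_types: ["application/json"] },
  ],
};
```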

How agents discover each other

Discovery is the moment agents learn who else is out there. MACP supports four mechanisms, ordered from simplest to most sophisticated:

  1. Well-known URL — https://<host>/.well-known/macp.json returns the manifest as JSON. Simple, cacheable, works everywhere.
  2. GetManifest RPC — Programmatic discovery via gRPC. Pass an empty agent_id to get the serving runtime's own manifest.
  3. ListModes RPC — Returns only standards-track modes (macp.mode.decision.v1, etc.), useful for capability probing.
  4. Registry services — Organizations can index manifests for fleet-wide discovery, letting agents find each other across deployment boundaries.

// TypeScript — discover runtime capabilities
const manifest = await client.getManifest();
console.log(manifest.supportedModes); // ['macp.mode.decision.v1', ...]

const modes = await client.listModes();
// Returns ModeDescriptor[] with name, version, message types, determinism class

# Python — discover runtime capabilities
manifest = client.get_manifest()
print(manifest.supported_modes)

modes = client.list_modes()

With manifests published and discovery complete, our three agents now know about each other. The architect knows the security reviewer can participate in Decision mode. The cost optimizer knows the runtime supports the governance policies it needs. It is time to connect.


Connecting to the Runtime: SDK Initialization

Both SDKs follow a two-layer design that keeps things clean. At the bottom, a low-level MacpClient handles gRPC transport, authentication, and connection management. On top of that, high-level mode sessions (DecisionSession, ProposalSession, and so on) provide a typed, ergonomic API for each coordination mode. You never have to manually construct a Protobuf envelope if you do not want to.

Creating a client

The first step for any agent is creating a client connection to the runtime. Here is what that looks like:

// TypeScript
import { MacpClient, Auth } from '@macp/sdk';

const client = new MacpClient({
  address: '127.0.0.1:50051',
  secure: false,                    // TLS required in production
  auth: Auth.bearer('my-token'),    // or Auth.devAgent('agent-id')
  defaultDeadlineMs: 10_000,
});

# Python
from macp_sdk import MacpClient, AuthConfig

client = MacpClient(
    target="127.0.0.1:50051",
    secure=False,
    auth=AuthConfig.for_bearer("my-token"),  # or AuthConfig.for_dev_agent("agent-id")
    default_timeout=10.0,
)

The handshake: version and capability negotiation

Before anything else happens, the client and runtime perform a handshake — a version and capability negotiation defined in RFC-MACP-0001. This is not just a formality. The handshake establishes which protocol version both sides will speak, which optional features are available, and which coordination modes the runtime has loaded. Without it, neither side can make any assumptions about what the other supports.

sequenceDiagram
    participant Agent as SDK Client
    participant RT as Runtime

    Agent->>RT: Initialize(client_name, client_version, capabilities)
    RT->>Agent: InitializeResult(selected_version, runtime_info, supported_modes, capabilities)

    Note over Agent,RT: Capabilities negotiated:<br/>sessions.stream, cancellation,<br/>progress, manifest, mode_registry,<br/>roots, policy_registry

The runtime responds with its supported protocol version, the list of available modes, and which optional capabilities it supports. The SDK stores these for the lifetime of the client — no need to re-negotiate on every call.

const init = await client.initialize();
// init.runtimeInfo — { name, version }
// init.supportedModes — ModeDescriptor[]
// init.capabilities — { sessions, cancellation, progress, ... }

At this point, our architect agent has a live connection to the runtime. It knows the runtime supports Decision mode v1, that streaming is available, and that the policy registry is loaded. Now the question becomes: who kicks off the coordination? In many real-world scenarios, the answer is the control plane.


Orchestrating the Run: Control Plane Lifecycle

When coordination is mediated through the control plane — as it often is in production deployments where humans need visibility — the process follows a managed run lifecycle with well-defined state transitions. A "run" is the control plane's concept of a single coordination episode, from the moment someone requests it to the moment it completes, fails, or is cancelled.

The run state machine

The state machine is deliberately simple. Runs can only move forward — there is no going back from failed to running, and no way to resurrect a cancelled run. This simplicity is a feature: it makes the system easy to reason about and impossible to put into an inconsistent state.

stateDiagram-v2
    [*] --> queued: POST /runs
    queued --> starting: executor picks up
    starting --> binding_session: runtime session opened
    binding_session --> running: kickoff messages sent
    running --> completed: session resolved
    running --> failed: runtime error / stream lost
    running --> cancelled: POST /runs/:id/cancel
    starting --> failed: runtime unavailable
    binding_session --> failed: session start rejected
    completed --> [*]
    failed --> [*]
    cancelled --> [*]
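The forward-only transitions in the diagram can be captured in a small lookup table. This is a sketch to illustrate the invariant (terminal states have no outgoing edges), not the control plane's actual implementation:

```typescript
// Forward-only run state machine from the diagram above.
// Terminal states (completed, failed, cancelled) allow no transitions out.
type RunState =
  | "queued" | "starting" | "binding_session" | "running"
  | "completed" | "failed" | "cancelled";

const TRANSITIONS: Record<RunState, RunState[]> = {
  queued: ["starting"],
  starting: ["binding_session", "failed"],
  binding_session: ["running", "failed"],
  running: ["completed", "failed", "cancelled"],
  completed: [],   // terminal
  failed: [],      // terminal: no resurrection
  cancelled: [],   // terminal
};

function canTransition(from: RunState, to: RunState): boolean {
  return TRANSITIONS[from].includes(to);
}
```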

From request to coordination: the execution flow

Let us follow what happens when a human operator (or an automated pipeline) submits a coordination request through the control plane API. The sequence is precise, and every step has a reason:

sequenceDiagram
    participant UI as API Consumer
    participant API as Control Plane API
    participant Exec as Run Executor
    participant Mgr as Run Manager
    participant RTP as Runtime Provider
    participant RT as MACP Runtime

    UI->>API: POST /runs (ExecutionRequest)
    API->>Exec: launch()
    Exec->>Mgr: createRun() → queued
    Exec->>Mgr: markStarted() → starting
    Exec->>RTP: initialize()
    RTP->>RT: Initialize RPC
    RT-->>RTP: InitializeResult
    Exec->>RTP: openSession()
    RTP->>RT: StreamSession (bidirectional gRPC)
    RT-->>RTP: SessionStart Ack
    Exec->>Mgr: bindSession() → binding_session
    Exec->>RTP: send kickoff messages
    RTP->>RT: Send envelopes
    Exec->>Mgr: markRunning() → running

    loop Event stream
        RT-->>RTP: Accepted envelopes
        RTP-->>Exec: Raw events
        Exec->>Exec: Normalize → Canonical events
        Exec->>Exec: Update projection
    end

    RT-->>RTP: Session resolved
    Exec->>Mgr: markCompleted() → completed
    API-->>UI: SSE stream / GET /runs/:id/state

What goes into an ExecutionRequest

The control plane needs to know everything up front. The ExecutionRequest is a fully resolved specification of what coordination should happen, containing:

  • mode — Which coordination mode to use (e.g., macp.mode.decision.v1)
  • runtime — Runtime address and kind (rust)
  • session — Participant list, TTL, policy version, context
  • kickoff — Array of initial messages to send after session creation
  • execution — Mode (live, replay, sandbox), tags, metadata

There is a design philosophy at work here: the control plane never makes assumptions. It does not guess which mode you want or which agents should participate. Everything is declared explicitly, making runs reproducible and auditable.
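Assembled from the fields listed above, an ExecutionRequest for our Q3 deployment scenario might look like this. The exact TypeScript shape and nested field names are illustrative assumptions:

```typescript
// An illustrative ExecutionRequest for the Q3 deployment decision.
// Field names follow the bullet list above; the precise shape is assumed.
const request = {
  mode: "macp.mode.decision.v1",
  runtime: { kind: "rust", address: "127.0.0.1:50051" },
  session: {
    participants: ["architect-agent", "security-agent", "cost-agent"],
    ttl_ms: 120_000,
    policy_version: "policy.majority",
    context: {},
  },
  kickoff: [
    // Initial message sent after the session is bound.
    { message_type: "Proposal", payload: { proposalId: "p1", option: "Blue-green deploy" } },
  ],
  execution: { mode: "live", tags: ["q3-release"], metadata: {} },
};
```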

Persistence: everything gets recorded

The control plane persists everything to PostgreSQL. This is not optional — it is fundamental to the system's ability to provide observability, replay, and audit. Here is what gets stored:

| Table | Purpose |
| --- | --- |
| runs | Run metadata, status, timing, error info |
| runtime_sessions | Bound session metadata, mode, capabilities |
| run_events_raw | Raw runtime events (append-only) |
| run_events_canonical | Normalized events for UI consumption |
| run_projections | Current state cache (built from events) |
| run_metrics | Token usage, cost estimates, event counts |
| run_artifacts | Trace bundles, logs, generated reports |

The separation between run_events_raw and run_events_canonical is worth noting. Raw events are preserved exactly as the runtime emitted them — they are the source of truth. Canonical events are a normalized, UI-friendly representation. By keeping both, the system can always re-derive canonical events from raw ones, which matters for replay and debugging.
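Because canonical events are a pure function of raw ones, re-derivation is just a mapping step. The sketch below illustrates the idea; the field names on both event shapes are hypothetical, and the real normalizer is considerably richer:

```typescript
// Minimal sketch of re-deriving a canonical event from a raw runtime
// event. All field names here are hypothetical.
interface RawEvent {
  seq: number;
  kind: string;
  payload: unknown;
  emitted_at_ms: number;
}

interface CanonicalEvent {
  seq: number;
  type: string;
  at: string;     // ISO-8601 timestamp for UI consumption
  data: unknown;
}

function normalize(raw: RawEvent): CanonicalEvent {
  return {
    seq: raw.seq,
    type: raw.kind.toLowerCase(),
    at: new Date(raw.emitted_at_ms).toISOString(),
    data: raw.payload,
  };
}
```

Since `normalize` has no side effects, replaying the raw log through it always reproduces the same canonical stream.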


Opening the Session: Where Coordination Begins

Now we arrive at the heart of the protocol. A coordination session begins with a SessionStart message — and this is the most validated message in the entire system. The reason is simple: everything that follows depends on the session being correctly configured. A bad session start would cascade into invalid state transitions, policy mismatches, and non-deterministic replays. So the runtime checks everything.

The validation gauntlet

When a SessionStart arrives, it passes through twelve validation steps before the session is created. Each step catches a different class of error, and the order matters — cheap checks (authentication, rate limiting) come before expensive ones (mode resolution, policy lookup).

sequenceDiagram
    participant Agent
    participant RT as Runtime

    Agent->>RT: Send(Envelope with SessionStart)
    RT->>RT: 1. Authenticate sender (bearer / mTLS / JWT)
    RT->>RT: 2. Derive sender identity from auth context
    RT->>RT: 3. Rate limit check (60 SessionStart/min default)
    RT->>RT: 4. Validate envelope structure (macp_version, mode, message_type)
    RT->>RT: 5. Validate session_id format (UUID v4/v7, ≥128 bits entropy)
    RT->>RT: 6. Check session_id not already in use
    RT->>RT: 7. Validate SessionStartPayload
    RT->>RT: 8. Resolve mode (must be registered)
    RT->>RT: 9. Resolve policy (policy_version → registry lookup)
    RT->>RT: 10. Create session: OPEN state
    RT->>RT: 11. Append to storage (commit point)
    RT->>RT: 12. Call mode.on_session_start()
    RT->>Agent: Ack(ok=true, session_state=OPEN)
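Step 5 of the gauntlet can be sketched as a simple format check: only UUID v4 or v7 session IDs are accepted. This regex is an illustration of the rule, not the runtime's actual validator:

```typescript
// Sketch of step 5: accept only UUID v4 or v7 session identifiers.
// Version nibble must be 4 or 7; variant nibble must be 8, 9, a, or b.
const UUID_V4_OR_V7 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[47][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isValidSessionId(id: string): boolean {
  return UUID_V4_OR_V7.test(id);
}
```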

What goes into a SessionStart

The SessionStartPayload declares everything the session needs to function. Some fields are required — you cannot start a session without participants or a TTL. Others are optional but powerful, like binding a governance policy or freezing ambient context.

| Field | Required | Description |
| --- | --- | --- |
| participants | Yes | Non-empty list of declared participant identifiers |
| mode_version | Yes | Semantic version of the mode (immutable for session) |
| configuration_version | Yes | Configuration profile version (immutable for session) |
| ttl_ms | Yes | Session deadline in milliseconds (1 – 86,400,000) |
| policy_version | No | Governance policy identifier; empty resolves to policy.default |
| context | No | Frozen context bound at session creation |
| roots | No | Root descriptors for ambient context |
| intent | No | Human-readable session purpose |

Version binding: the key to determinism

Here is a design decision that permeates the entire system. Three versions are immutably bound at session creation and cannot change for the session's lifetime:

  1. mode_version — Which semantic profile of the mode to use
  2. configuration_version — Voting/evaluation/acceptance profile
  3. policy_version — Governance rules

Why immutable? Because of deterministic replay. If you replay the same accepted history under the same bound versions, the runtime MUST produce identical state transitions. If versions could change mid-session, replay would be meaningless — you could never be sure whether a different outcome was caused by different agent behavior or different runtime configuration.

Starting our deployment decision

Let us return to our running example. The architect agent decides it is time to coordinate on the Q3 deployment strategy. Here is what that looks like through the SDKs:

// TypeScript — start a Decision session
const session = new DecisionSession(client, {
  modeVersion: '1.0.0',
  configurationVersion: '1.0.0',
  policyVersion: 'policy.majority',
  auth: Auth.bearer('coordinator-token'),
});

await session.start({
  intent: 'Choose deployment strategy for Q3 release',
  participants: ['architect-agent', 'security-agent', 'cost-agent'],
  ttlMs: 120_000, // 2 minutes
});

# Python — start a Decision session
session = DecisionSession(client, policy_version="policy.majority")

session.start(
    intent="Choose deployment strategy for Q3 release",
    participants=["architect-agent", "security-agent", "cost-agent"],
    ttl_ms=120_000,
)

Notice the policy.majority policy version. This tells the runtime to use majority voting rules when the time comes to evaluate a commitment. The architect agent has declared that a simple majority is enough — the cost optimizer does not get veto power. This is governance embedded in the protocol, not left to ad-hoc agent logic.

The session is now OPEN. Our three agents have two minutes to reach a decision.


The Admission Pipeline: Every Message Earns Its Place

With the session open, agents can start sending messages — proposals, evaluations, votes. But not every message gets through. Every single message from a participant passes through a strict admission pipeline before it can enter the session's accepted history. This is where the runtime earns its role as an impartial referee.

The pipeline, step by step

The pipeline is a chain of checks, each one acting as a gate. Fail any gate, and the message is rejected with a structured error. Pass them all, and the message is appended to the session's authoritative log.

flowchart LR
    A["Incoming\nEnvelope"] --> B["AuthN\nbearer / mTLS\nJWT / dev-header"]
    B --> C["Sender\nDerivation"]
    C --> D["Rate\nLimiting"]
    D --> E["Envelope\nValidation"]
    E --> F["Session\nLookup"]
    F --> G["Session\nOPEN?"]
    G --> H["Deduplication\nmessage_id"]
    H --> I["Participant\nCheck"]
    I --> J["Mode\nAuthorization"]
    J --> K["Append to\nLog"]
    K --> L["Mode\nDispatch"]

Two steps in this pipeline deserve special attention.

Authentication: you are who the runtime says you are

The runtime supports multiple authentication mechanisms (RFC-MACP-0004), from bearer tokens for typical production use to mTLS for high-security deployments:

| Mechanism | Header | Use Case |
| --- | --- | --- |
| Bearer token | Authorization: Bearer <token> | Production — tokens issued by control plane |
| mTLS | TLS client certificate | High-security deployments |
| JWT / OIDC | Authorization: Bearer <jwt> | Federated identity |
| Dev header | x-macp-agent-id: <id> | Local development only |

Here is a design choice that matters enormously: the sender field in the Envelope is always overwritten by the runtime from the authenticated identity. Agents cannot self-assert their sender. Period. This single rule eliminates an entire class of impersonation attacks. When the security reviewer sees a proposal from "architect-agent," it knows the runtime verified that identity — not just that someone claimed to be the architect.
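The rule is mechanical: whatever the envelope claims, the runtime replaces `sender` with the authenticated identity. A sketch of that step, with illustrative types rather than the runtime's actual ones:

```typescript
// Sender derivation: the self-asserted `sender` on the envelope is
// discarded unconditionally in favor of the authenticated identity.
// Types and names here are illustrative.
interface Envelope {
  sender: string;
  message_type: string;
  payload: unknown;
}

interface AuthContext {
  identity: string; // derived from bearer token, mTLS cert, or JWT
}

function deriveSender(env: Envelope, auth: AuthContext): Envelope {
  return { ...env, sender: auth.identity };
}
```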

Per-token authorization

Each bearer token carries authorization metadata that constrains what the agent can do:

{
  "token": "abc123...",
  "sender": "architect-agent",
  "allowed_modes": ["macp.mode.decision.v1", "macp.mode.task.v1"],
  "can_start_sessions": true,
  "max_open_sessions": 10
}

Rate limiting: preventing runaway agents

Default limits enforced per authenticated sender keep any single agent from overwhelming the system:

| Limit | Default |
| --- | --- |
| SessionStart messages per minute | 60 |
| Session-scoped messages per minute | 600 |
| Maximum payload size | 1 MB |

In a world of autonomous agents, rate limiting is not just about fairness — it is about safety. An agent stuck in a retry loop should not be able to saturate the runtime and starve other sessions.
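A per-sender limit like "60 SessionStart messages per minute" can be enforced with a fixed-window counter. This is one possible algorithm, shown for illustration; the runtime's actual limiter may differ:

```typescript
// Fixed-window rate limiter sketch for per-sender limits.
// Each sender gets `limit` admissions per `windowMs` window.
class RateLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(private limit: number, private windowMs: number) {}

  allow(sender: string, nowMs: number): boolean {
    const entry = this.counts.get(sender);
    // No entry yet, or the window has elapsed: start a fresh window.
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      this.counts.set(sender, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false; // over budget: reject
    entry.count += 1;
    return true;
  }
}
```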

Mode-specific authorization: not everyone can do everything

Each coordination mode defines precisely who can send which message types. This is not a suggestion — it is enforced by the runtime at the protocol level:

| Mode | Message Type | Authorized Sender |
| --- | --- | --- |
| Decision | Proposal, Evaluation, Objection, Vote | Any declared participant |
| Decision | Commitment | Session initiator (default) |
| Proposal | Proposal, CounterProposal, Accept, Reject | Any participant |
| Proposal | Withdraw | Author of referenced proposal only |
| Task | TaskRequest | Session initiator |
| Task | TaskUpdate, TaskComplete, TaskFail | Active assignee only |
| Handoff | HandoffOffer, HandoffContext | Current responsibility owner |
| Handoff | HandoffAccept, HandoffDecline | Target participant of offer |
| Quorum | Approve, Reject, Abstain | Any eligible declared participant |
| Quorum | ApprovalRequest, Commitment | Session initiator |

Back in our scenario: the cost optimizer cannot unilaterally commit the decision — only the architect (as session initiator) can do that. The security reviewer can propose, evaluate, and vote, but it cannot commit. These rules are not configurable per-session — they are baked into the mode definition. This is by design: mode semantics should be predictable and auditable, not customized into unpredictability.
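The Decision-mode rows of the table reduce to a two-branch check. A sketch of that rule as a standalone function (names are illustrative; the real rules live inside the mode implementation):

```typescript
// Decision-mode authorization sketch: Commitment is initiator-only;
// Proposal, Evaluation, Objection, and Vote are open to any declared
// participant. Illustrative names, not the runtime's actual code.
function authorizeDecisionSender(
  messageType: string,
  sender: string,
  participants: string[],
  initiator: string,
): boolean {
  if (messageType === "Commitment") return sender === initiator;
  const participantTypes = ["Proposal", "Evaluation", "Objection", "Vote"];
  return participantTypes.includes(messageType) && participants.includes(sender);
}
```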


Mode Invocation: How Coordination Actually Unfolds

Now that messages are flowing through the admission pipeline, it is time to understand what happens when they reach the mode itself. Modes are the semantic heart of MACP — they define the structure of coordination. Is it a vote? A negotiation? A task delegation? A responsibility transfer? The mode decides.

How the runtime dispatches to modes

When an accepted envelope reaches the mode layer, the runtime looks up the mode by name in its registry and calls into it through a well-defined trait interface. The mode can authorize the sender, process the message, update its internal state, and optionally resolve the session.

flowchart LR
    A["Accepted\nEnvelope"] --> B["Mode Registry\nlookup by name"]
    B --> C["mode.authorize_sender\nenvelope, session"]
    C --> D["mode.on_message\nenvelope, session state"]
    D --> E{"ModeResponse"}
    E -->|NoOp| F["No state change"]
    E -->|PersistState| G["Update mode state"]
    E -->|Resolve| H["Session → RESOLVED"]
    E -->|PersistAndResolve| I["Update + Resolve"]

The Mode trait in the Rust runtime is deliberately minimal. Every mode must implement exactly three methods — session start handling, message handling, and sender authorization. This constraint ensures modes are predictable, testable, and composable:

trait Mode: Send + Sync {
    fn on_session_start(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn on_message(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn authorize_sender(&self, session: &Session, env: &Envelope)
        -> Result<(), MacpError>;
}

The five standards-track modes

MACP ships with five coordination modes, each designed for a different interaction pattern. They range from structured group decision-making to simple task delegation:

| Mode | Identifier | Participant Model | Determinism |
| --- | --- | --- | --- |
| Decision | macp.mode.decision.v1 | Declared | Semantic-deterministic |
| Proposal | macp.mode.proposal.v1 | Peer | Semantic-deterministic |
| Task | macp.mode.task.v1 | Orchestrated | Structural-only |
| Handoff | macp.mode.handoff.v1 | Delegated | Context-frozen |
| Quorum | macp.mode.quorum.v1 | Quorum | Semantic-deterministic |

Let us walk through each one, because the differences matter.

Decision Mode: our deployment strategy scenario

This is the mode our three agents are using. Decision mode provides structured choice among proposals with explicit evaluation, objection, and voting phases. It is the most ceremony-heavy mode, but that ceremony exists for a reason — when the stakes are high enough to warrant three agents deliberating, you want a clear audit trail of who proposed what, who evaluated it, and how the vote went.

stateDiagram-v2
    [*] --> Proposing: SessionStart
    Proposing --> Evaluating: Proposal(s) submitted
    Evaluating --> Voting: Evaluation(s) submitted
    Voting --> Committed: Commitment accepted
    Committed --> [*]

    note right of Proposing: Any participant submits proposals
    note right of Evaluating: Participants evaluate with APPROVE/REVIEW/BLOCK/REJECT
    note right of Voting: Participants vote APPROVE/REJECT/ABSTAIN
    note right of Committed: Initiator binds outcome

In our scenario, this is where things get interesting. The architect proposes blue-green deployment. The security reviewer evaluates it — APPROVE, with high confidence. The cost optimizer votes in favor. Here is the code:

// Decision Mode — TypeScript
await session.propose({ proposalId: 'p1', option: 'Blue-green deploy', rationale: 'Zero downtime' });
await session.evaluate({ proposalId: 'p1', recommendation: 'APPROVE', confidence: 0.9 });
await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
await session.commit({ action: 'decision.accepted', authorityScope: 'session', reason: 'Majority approved' });

Proposal Mode: when agents need to negotiate

Sometimes you do not want a formal vote — you want agents to negotiate. Proposal mode supports bounded negotiation with proposals and counterproposals. Think of it as a structured back-and-forth that must eventually converge or terminate.

stateDiagram-v2
    [*] --> Negotiating: SessionStart
    Negotiating --> Negotiating: Proposal / CounterProposal
    Negotiating --> Converged: Accept convergence
    Negotiating --> Rejected: Terminal Reject
    Converged --> Committed: Commitment
    Rejected --> Committed: Commitment
    Committed --> [*]

Task Mode: delegation with accountability

Task mode is the simplest interaction pattern: one agent requests work, another performs it. But even here, the protocol adds value — it tracks the task through acceptance, progress updates, and completion or failure, ensuring the requester always knows the current state.

stateDiagram-v2
    [*] --> Requested: SessionStart + TaskRequest
    Requested --> InProgress: TaskAccept
    Requested --> Unassigned: TaskReject
    InProgress --> Completed: TaskComplete
    InProgress --> Failed: TaskFail
    Completed --> Committed: Commitment
    Failed --> Committed: Commitment
    Committed --> [*]

Handoff Mode: passing the baton

When one agent needs to transfer responsibility to another — along with the context needed to continue — Handoff mode provides a structured transfer protocol. The offering agent can attach context, and the receiving agent explicitly accepts or declines.

stateDiagram-v2
    [*] --> Offered: SessionStart + HandoffOffer
    Offered --> Enriched: HandoffContext (optional)
    Enriched --> Accepted: HandoffAccept
    Enriched --> Declined: HandoffDecline
    Offered --> Accepted: HandoffAccept
    Offered --> Declined: HandoffDecline
    Accepted --> Committed: Commitment
    Declined --> Committed: Commitment
    Committed --> [*]

Quorum Mode: threshold-based approval

Quorum mode is for situations where you need a specific number of approvals from a pool of participants — think code review approvals, compliance sign-offs, or multi-party authorization. The mode tracks votes against a threshold and resolves when the threshold is met or becomes unreachable.

stateDiagram-v2
    [*] --> Voting: SessionStart + ApprovalRequest
    Voting --> Voting: Approve / Reject / Abstain
    Voting --> ThresholdMet: required_approvals reached
    Voting --> ThresholdUnreachable: remaining cannot reach threshold
    ThresholdMet --> Committed: Commitment (positive)
    ThresholdUnreachable --> Committed: Commitment (negative)
    Committed --> [*]
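The "threshold unreachable" transition is simple arithmetic: even if every remaining eligible voter approves, the count cannot reach the threshold. A sketch of that bookkeeping, treating abstentions as neutral (one of the policy options; the mode's actual code may differ):

```typescript
// Quorum bookkeeping sketch. Threshold is met when approvals reach
// `required`; unreachable when unanimous approval from the remaining
// voters still could not reach it. Abstentions are neutral here.
type QuorumStatus = "voting" | "threshold_met" | "threshold_unreachable";

function quorumStatus(
  approvals: number,
  rejections: number,
  abstentions: number,
  eligible: number,
  required: number,
): QuorumStatus {
  if (approvals >= required) return "threshold_met";
  const remaining = eligible - approvals - rejections - abstentions;
  if (approvals + remaining < required) return "threshold_unreachable";
  return "voting";
}
```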

The one rule that unifies all five modes

Across all five modes, there is a single invariant that matters more than any other: only a Commitment message resolves the session. Intermediate outcome messages — TaskComplete, HandoffAccept, Approve — make the session eligible for commitment but do not transition the session to RESOLVED.

This is a deliberate design choice, and it is worth pausing to understand why. In a world of autonomous agents, the protocol needs a single, unambiguous moment when the outcome becomes binding. By separating "the work is done" from "the outcome is committed," MACP gives the initiator (or the policy-designated authority) final say. It also creates a clean audit point: when you see a Commitment in the log, you know the session is over and the outcome is authoritative.


Governance Built In: Policy Application

Our architect agent chose policy.majority when starting the session. But what does that actually mean? How does the runtime enforce governance rules? This is where MACP's policy system comes in — declarative governance rules that constrain how modes operate, defined in RFC-MACP-0012.

A two-phase lifecycle: resolve, then evaluate

Policies have an elegant two-phase lifecycle. In the first phase, at session start, the runtime resolves the policy — looking it up in the registry, validating it, and binding it immutably to the session. In the second phase, when a Commitment message arrives, the runtime evaluates the policy against the session's accumulated history to decide whether the commitment is allowed.

sequenceDiagram
    participant Agent
    participant RT as Runtime
    participant PR as Policy Registry
    participant PE as Policy Evaluator

    Note over Agent,PE: Phase 1: Resolution at SessionStart
    Agent->>RT: SessionStart(policy_version="policy.majority")
    RT->>PR: Lookup "policy.majority"
    PR-->>RT: PolicyDescriptor (rules, schema_version)
    RT->>RT: Bind policy immutably to session

    Note over Agent,PE: Phase 2: Evaluation at Commitment
    Agent->>RT: Commitment(action, reason)
    RT->>PE: Evaluate(policy_rules, accepted_history, participants)
    PE->>PE: Pure function — no I/O, no wall-clock, no randomness
    PE-->>RT: PolicyDecision::Allow
    RT->>Agent: Ack(ok=true, session_state=RESOLVED)

How policy resolution works

The resolution process is straightforward but strict:

  1. Extract policy_version from SessionStartPayload
  2. If empty, resolve to policy.default (no additional constraints)
  3. If non-empty, look up in policy registry
  4. If not found, reject with UNKNOWN_POLICY_VERSION
  5. If mode mismatch (policy targets different mode), reject with INVALID_POLICY_DEFINITION
  6. Store resolved PolicyDescriptor on the session — immutable for its lifetime

Notice step 5: a policy designed for Quorum mode cannot be used in a Decision mode session. This prevents subtle configuration errors where governance rules do not match the coordination semantics.

What policies can control

Each mode exposes different policy knobs. The policy system is not one-size-fits-all — it adapts to the semantics of the mode it governs:

| Mode | Policy Controls |
| --- | --- |
| Decision | Voting algorithm (majority/supermajority/weighted), quorum requirements, objection veto thresholds, commitment authority |
| Proposal | Acceptance criteria (all_parties/counterparty/initiator), counter-proposal round limits, terminal rejection |
| Task | Allow reassignment on reject, require output on completion |
| Handoff | Implicit accept timeout, commitment authority |
| Quorum | Threshold override, abstention interpretation (neutral/implicit_reject/ignored) |

In our scenario, policy.majority tells the Decision mode that a simple majority of votes is enough for the architect to commit the outcome. If the security reviewer had wanted veto power, a different policy — perhaps one with objection thresholds or supermajority requirements — would have been needed. The point is that these governance decisions are made explicitly at session creation, not implicitly during coordination.

The determinism guarantee

This is perhaps the most important property of the policy system. Policy evaluation is a pure function of:

  • The resolved policy rules (immutable for the session)
  • The accumulated accepted message history
  • The session's declared participants

It MUST NOT depend on wall-clock time, external services, randomness, or any state outside the session boundary. This ensures that policy decisions are identical during replay. If you replay a session and the same history leads to a different policy decision, something is deeply wrong.
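To make the purity requirement concrete, here is a sketch of what a majority-policy evaluator looks like when it depends only on the three inputs listed above. The message and rule shapes are illustrative assumptions, not the runtime's actual types.

```python
# Non-normative sketch: policy evaluation as a pure function.
# Inputs: immutable rules, accepted history, declared participants — nothing else.
def evaluate_commitment(rules: dict, history: list, participants: list) -> str:
    """Return 'ALLOW' or 'DENY'. No clock, no I/O, no randomness."""
    votes = {}
    for msg in history:                             # last vote per participant wins
        if msg["message_type"] == "Vote" and msg["sender"] in participants:
            votes[msg["sender"]] = msg["payload"]["vote"]
    approvals = sum(1 for v in votes.values() if v == "APPROVE")
    fraction = rules.get("threshold", 0.5)          # majority default; supermajority via rules
    return "ALLOW" if approvals > len(participants) * fraction else "DENY"
```

Because every input is part of the session's recorded state, replaying the same history necessarily reproduces the same decision.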


The Wire: Message Flow and Results

We have talked about what messages mean. Now let us talk about how they travel. Every MACP message — proposals, votes, commitments, signals — is wrapped in a canonical Envelope (RFC-MACP-0001). The Envelope is the universal container that carries any message type through the system.

The Envelope: one format to carry them all

message Envelope {
  string macp_version = 1;     // Protocol version (e.g., "2026-03-02")
  string mode = 2;             // Empty for ambient signals
  string message_type = 3;     // Discriminator (e.g., "Proposal", "Vote")
  string message_id = 4;       // Unique ID for idempotency
  string session_id = 5;       // Empty for ambient signals
  string sender = 6;           // Authenticated identity (runtime-derived)
  int64 timestamp_unix_ms = 7; // Informational timestamp
  bytes payload = 8;           // Mode-defined content (protobuf-encoded)
}

The design here is worth appreciating. The Envelope separates routing information (session, mode, sender) from content (payload). The message_id enables idempotent delivery — send the same message twice, and the runtime will deduplicate. The sender field, as we discussed, is always runtime-derived from authentication, never self-asserted.

The Send/Ack cycle: truth is authoritative

The primary message pattern is deceptively simple — unary Send followed by Ack — but the semantics are precise:

sequenceDiagram
    participant A as Agent A
    participant RT as Runtime
    participant B as Agent B (streaming)

    A->>RT: Send(Envelope) — unary gRPC
    RT->>RT: Admission pipeline<br/>(auth → validate → dedup → append)
    RT->>A: Ack(ok=true, session_state, accepted_at)

    RT->>B: StreamSession: Accepted Envelope (in order)

    Note over A,B: Ack is authoritative per-message.<br/>StreamSession delivers accepted<br/>envelopes in order to subscribers.

The Ack is the runtime's authoritative verdict on a message. It tells the sender not just whether the message was accepted, but the current session state after processing:

| Field | Description |
| --- | --- |
| ok | Whether the message was accepted |
| duplicate | Whether the message_id was already seen |
| message_id | Reference to the sent message |
| session_id | Session context |
| accepted_at_unix_ms | Server-side acceptance timestamp |
| session_state | Current session state after processing |
| error | Error details if ok=false |
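The idempotency contract implied by the duplicate field can be sketched as follows. This is a simplified stand-in for the runtime's admission step, with illustrative dict shapes whose keys mirror the Ack fields above.

```python
# Non-normative sketch of idempotent admission: the same message_id is
# acknowledged again but never appended to the log twice.
def admit(session: dict, envelope: dict) -> dict:
    if envelope["message_id"] in session["seen_ids"]:
        return {"ok": True, "duplicate": True,
                "message_id": envelope["message_id"],
                "session_state": session["state"]}
    session["seen_ids"].add(envelope["message_id"])
    session["log"].append(envelope)                 # append-only, authoritative order
    return {"ok": True, "duplicate": False,
            "message_id": envelope["message_id"],
            "session_state": session["state"]}
```

A retrying sender can therefore resend safely: the second Ack reports duplicate=true and the log is unchanged.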

Streaming: watching coordination unfold

The StreamSession RPC provides a bidirectional gRPC stream bound to a single session. Subscribers receive accepted envelopes in authoritative order — the order the runtime accepted them, which is the canonical ordering for replay and audit.

// TypeScript — streaming
const stream = client.openStream({ auth: Auth.bearer('observer-token') });

// Send via stream
await stream.send(envelope);

// Receive accepted envelopes
for await (const received of stream.responses()) {
  console.log(received.messageType, received.sender);
  // Process in acceptance order
}

# Python — streaming
stream = client.open_stream()

stream.send(envelope)

for envelope in stream.responses(timeout=30.0):
    print(f"{envelope.message_type} from {envelope.sender}")

Client-side projections: making sense of the stream

Raw envelopes are useful, but agents usually want to know higher-level things: "How many votes does my proposal have? Is there a majority winner? Has anyone raised a blocking objection?" Both SDKs maintain client-side projections — pure state machines that track accepted envelopes and derive higher-level state locally:

// After voting
const totals = session.projection.voteTotals();
// { 'proposal-1': 3, 'proposal-2': 1 }

const winner = session.projection.majorityWinner();
// 'proposal-1'

const blocking = session.projection.hasBlockingObjection('proposal-1');
// false

These projections are "pure" in the functional programming sense — given the same sequence of accepted envelopes, they always produce the same state. This makes them safe for use in agent decision logic, because the agent's view of the session is always consistent with the runtime's authoritative ordering.
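A projection in this sense is just a fold over the accepted envelope sequence. The sketch below shows the idea with illustrative payload shapes; it is not the SDKs' actual projection code.

```python
# Non-normative sketch: a projection is a pure fold over accepted envelopes.
# Same input sequence -> same derived state, which is what makes it replay-safe.
def vote_totals(accepted: list) -> dict:
    totals = {}
    for env in accepted:
        if env["message_type"] == "Vote" and env["payload"]["vote"] == "APPROVE":
            pid = env["payload"]["proposal_id"]
            totals[pid] = totals.get(pid, 0) + 1
    return totals

def majority_winner(totals: dict, participant_count: int):
    for pid, count in totals.items():
        if count > participant_count / 2:
            return pid
    return None
```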


The Other Channel: Ambient Signals

Not everything in a multi-agent system is coordination. Sometimes agents need to broadcast status updates, heartbeats, or progress reports without binding them to a session outcome. MACP separates these two concerns into distinct planes (RFC-MACP-0001):

flowchart TB
    subgraph Ambient["Ambient Plane"]
        direction LR
        S1["Agent A"] -->|"Signal\n(session_id='', mode='')"| Bus["Signal Bus"]
        Bus --> Sub1["Subscriber 1"]
        Bus --> Sub2["Subscriber 2"]
    end

    subgraph Coordination["Coordination Plane"]
        direction LR
        M1["Agent A"] -->|"Envelope\n(session_id='abc', mode='decision.v1')"| Session["Session abc"]
        Session --> Log["Append-only Log"]
    end

    Ambient ~~~ Coordination

    style Ambient fill:#1a1a2e,stroke:#4a9eff
    style Coordination fill:#1a1a2e,stroke:#9f7aea

The separation is deliberate and important. Coordination messages enter a durable, ordered log and can affect session state. Signals do neither — they are ephemeral, non-binding, and broadcast to whoever is listening.

Signal semantics

The rules for signals are defined by what they cannot do:

  • Signals carry empty session_id and empty mode
  • Signals are non-binding — they MUST NOT create sessions, mutate session state, or produce binding outcomes
  • Signals are ephemeral — they are not required to enter durable replay history
  • Signals may include a correlation_session_id in their payload for informational cross-referencing
  • Signals are broadcast via the WatchSignals RPC to all subscribers
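The plane an envelope belongs to is therefore fully determined by its routing fields. A minimal sketch of that dispatch decision (illustrative dict shapes):

```python
# Non-normative sketch: routing by plane, per the rules above.
def route(envelope: dict) -> str:
    """Return which plane handles this envelope."""
    if envelope.get("session_id", "") == "" and envelope.get("mode", "") == "":
        return "ambient"        # broadcast via the signal bus, never session history
    return "coordination"       # enters the session's append-only log
```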

Signal types

The SignalPayload is intentionally flexible:

  • signal_type — Discriminator (e.g., "heartbeat", "status_update")
  • data — Arbitrary payload bytes
  • confidence — Optional confidence score
  • correlation_session_id — Optional session cross-reference (does NOT make the signal session-scoped)

Progress signals

One particularly useful signal type is ProgressPayload, designed for reporting work progress back to observers:

  • progress_token — Identifies the progress stream
  • progress / total — Numeric progress indicators
  • message — Human-readable status
  • target_message_id — Which message this progress relates to

In our deployment scenario, while the cost optimizer is evaluating proposals, it might broadcast progress signals: "Analyzing infrastructure costs... 40% complete." These signals let the UI show progress without polluting the coordination log with non-binding chatter.


Bridging Two Worlds: Control Plane and Runtime Interaction

We have seen the runtime's perspective (sessions, envelopes, modes) and the SDK's perspective (typed clients, projections). Now let us look at how the control plane bridges these two worlds — taking raw gRPC events from the runtime and transforming them into something a human operator can watch, query, and replay.

The event pipeline

Every event from the runtime passes through a normalization and projection pipeline before reaching the UI. This pipeline is where raw protocol events become meaningful operational data:

sequenceDiagram
    participant RT as MACP Runtime
    participant SC as Stream Consumer
    participant EN as Event Normalizer
    participant ES as Event Service
    participant PS as Projection Service
    participant MS as Metrics Service
    participant SH as Stream Hub
    participant UI as UI Client

    RT->>SC: Accepted envelope (gRPC stream)
    SC->>EN: Raw runtime event
    EN->>EN: Normalize to canonical event
    EN->>ES: Canonical event
    ES->>ES: Allocate sequence number (transactional)
    ES->>ES: Persist raw + canonical (atomic write)
    ES->>PS: Apply event to projection
    PS->>PS: Build RunStateProjection
    PS->>PS: Persist to run_projections
    ES->>MS: Record metrics (tokens, costs, counts)
    ES->>SH: Publish event
    SH->>UI: SSE (canonical_event)

Canonical event types

Events are normalized into a standard taxonomy. This normalization is what makes the control plane's UI possible — instead of dealing with raw Protobuf envelopes, the UI works with a clean, categorized event stream:

| Category | Event Types |
| --- | --- |
| Run lifecycle | run.created, run.started, run.completed, run.failed, run.cancelled |
| Session | session.bound, session.stream.opened, session.state.changed |
| Participants | participant.seen |
| Messages | message.sent, message.received, message.send_failed |
| Signals | signal.emitted |
| Coordination | proposal.created, decision.proposed, decision.finalized |
| Tools | tool.called, tool.completed |
| Policy | policy.resolved, policy.commitment.evaluated, policy.denied |
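The normalization step can be sketched as a lookup from raw runtime message types into the taxonomy above. The mapping and field names here are illustrative assumptions, not the control plane's actual normalizer.

```python
# Non-normative sketch of event normalization: raw message types are mapped
# into the canonical taxonomy; unknown types fall back to a generic event.
CANONICAL_BY_MESSAGE_TYPE = {
    "Proposal": "proposal.created",
    "Commitment": "decision.finalized",
    "Signal": "signal.emitted",
}

def normalize(raw: dict, sequence: int) -> dict:
    event_type = CANONICAL_BY_MESSAGE_TYPE.get(raw["message_type"], "message.received")
    return {
        "sequence": sequence,       # allocated transactionally by the event service
        "type": event_type,
        "run_id": raw.get("session_id"),
        "sender": raw.get("sender"),
        "raw": raw,                 # raw and canonical are persisted together
    }
```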

The RunStateProjection: a real-time read model

The projection engine builds a comprehensive read model from the event stream. This projection is what powers the control plane's UI — a single query returns the complete current state of a run:

interface RunStateProjection {
  run: RunSummaryProjection;           // Status, timing, mode
  participants: ParticipantProjection[]; // Activity per participant
  graph: GraphProjection;               // Message dependency graph
  decision: DecisionProjection;         // Decision-specific state
  signals: SignalProjection;            // Signal summary
  progress: ProgressProjection;         // Progress tracking
  timeline: TimelineProjection;         // Chronological events
  trace: TraceSummary;                  // Distributed trace info
  outboundMessages: OutboundMessageSummary;
  policy: PolicyProjection;            // Policy resolution status
}

Circuit breaker: failing gracefully

The runtime provider implements a circuit breaker pattern — a nod to the reality that distributed systems fail. If the runtime becomes unreachable, the circuit opens and rejects new requests immediately rather than waiting for timeouts. This prevents cascading failures: a slow runtime should not make the control plane slow; it should make the control plane fast at returning errors. The circuit resets after a configurable cooldown.
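The pattern is straightforward to sketch. This is a generic circuit breaker under assumed threshold and cooldown parameters, not the control plane's actual implementation; the error string matches the CIRCUIT_BREAKER_OPEN code listed later in the error tables.

```python
# Non-normative sketch: fail fast while open, retry after a cooldown.
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0, clock=time.monotonic):
        self.threshold, self.cooldown_s, self.clock = threshold, cooldown_s, clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise RuntimeError("CIRCUIT_BREAKER_OPEN")  # no timeout wait
            self.opened_at = None       # cooldown elapsed: half-open, try again
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0               # success closes the circuit fully
        return result
```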

SSE streaming to clients

UI consumers connect via GET /runs/:id/stream (Server-Sent Events), and the experience is designed for real-time watching:

  1. On connect: receive a snapshot event with the full current RunStateProjection
  2. As events occur: receive canonical_event messages in real-time
  3. On disconnect: automatic reconnection with Last-Event-ID header for resumption

That snapshot-on-connect pattern is worth noting. A UI that connects mid-session does not have to replay the entire event history — it gets the current projection immediately, then stays in sync via the event stream.
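The snapshot-then-stream shape can be expressed as a simple generator. This is an illustrative sketch of the contract, not the control plane's SSE handler.

```python
# Non-normative sketch of snapshot-on-connect: one snapshot event first,
# then canonical events as they arrive, so late subscribers never replay history.
def subscribe(projection: dict, live_events):
    yield {"event": "snapshot", "data": projection}
    for ev in live_events:
        yield {"event": "canonical_event", "data": ev}
```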


Watching Everything: Observability

A coordination system that you cannot observe is a coordination system you cannot trust. MACP provides observability at every layer, from Rust-level structured logging to distributed traces that span from the UI through the control plane into the runtime.

flowchart LR
    subgraph Runtime["Runtime — Rust"]
        RL["tracing crate\nstructured logs"]
        RM["metrics.rs\nper-mode counters"]
        RO["OpenTelemetry\n(optional otel feature)"]
    end

    subgraph ControlPlane["Control Plane — NestJS"]
        CL["pino\nstructured JSON logs"]
        CM["prom-client\nPrometheus metrics"]
        CO["OpenTelemetry\nNode SDK"]
        CT["TraceService\nmanual spans"]
    end

    subgraph Endpoints["API Endpoints"]
        E1["GET /runs/:id/traces"]
        E2["GET /runs/:id/metrics"]
        E3["GET /runs/:id/artifacts"]
    end

    RL --> CL
    RM --> CM
    RO --> CO
    CO --> CT
    CT --> E1
    CM --> E2
    E3

Runtime observability

The Rust runtime uses the tracing crate for structured logging, controlled by the RUST_LOG environment variable. Every significant event is logged with structured fields:

  • Session creation: session_id, mode, sender
  • Message acceptance: session_id, message_type, sender, resulting state
  • Session resolution/expiry: session_id, mode
  • Auth failures and rate limit violations
  • Storage warnings during crash recovery

Per-mode metrics are tracked as atomic counters — lightweight enough to leave on in production:

  • sessions_started / sessions_resolved / sessions_expired / sessions_cancelled
  • messages_accepted / messages_rejected
  • commitments_accepted / commitments_rejected
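In Python terms, per-mode counters like these amount to keyed atomic increments. A minimal sketch (using a lock where the Rust runtime uses atomics; names are illustrative):

```python
# Non-normative sketch of per-mode counters, cheap enough to leave on.
import threading
from collections import defaultdict

class ModeMetrics:
    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)       # (mode, counter name) -> count

    def incr(self, mode: str, name: str) -> None:
        with self._lock:
            self._counters[(mode, name)] += 1

    def get(self, mode: str, name: str) -> int:
        with self._lock:
            return self._counters[(mode, name)]
```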

OpenTelemetry support (enabled via the otel cargo feature) provides distributed tracing:

  • OTLP exporter configured via OTEL_EXPORTER_OTLP_ENDPOINT
  • Batch export integrated with tokio runtime
  • Trace context propagated via gRPC metadata

Control plane observability

The NestJS control plane adds its own observability layer:

Structured logging via pino with JSON output — machine-parseable, grep-friendly.

Prometheus metrics via prom-client, exposed for scraping by your existing monitoring infrastructure.

OpenTelemetry integration ties everything together:

  • Node SDK with auto-instrumentations
  • TraceService for manual span management
  • W3C Trace Context propagation (traceId flows from UI to Control Plane to Runtime)
  • Per-run traces accessible via GET /runs/:id/traces

Per-run metrics persisted in the run_metrics table provide granular accounting:

  • Event, message, and signal counts
  • Token usage extracted from event payloads
  • Estimated cost via model pricing lookup

Audit events

Both layers log security-relevant events per RFC-MACP-0004. In a world of autonomous agents, audit is not a nice-to-have — it is how you answer questions like "who tried to impersonate the security reviewer at 3 AM?":

  • Authentication failures
  • Authorization failures
  • Duplicate message rejections
  • Terminal state transitions
  • Cancellation events
  • Rate limit violations

When Things Go Wrong: Error Handling

Distributed systems fail. Agents send invalid messages. Networks partition. Runtimes crash. MACP does not pretend otherwise — it provides structured, consistent error handling across every layer, rooted in a shared taxonomy from the error code registry.

Runtime error codes

The runtime returns precise, actionable error codes. Notice how each code maps to an HTTP status, making it straightforward to surface errors in REST APIs:

| Code | HTTP | When |
| --- | --- | --- |
| UNAUTHENTICATED | 401 | Authentication failed or missing |
| FORBIDDEN | 403 | Authenticated but not authorized |
| SESSION_NOT_FOUND | 404 | Session ID doesn't exist |
| SESSION_NOT_OPEN | 409 | Session is RESOLVED or EXPIRED |
| DUPLICATE_MESSAGE | 409 | message_id already accepted in session |
| SESSION_ALREADY_EXISTS | 409 | SessionStart for existing session_id |
| INVALID_ENVELOPE | 400 | Envelope validation failed |
| UNSUPPORTED_PROTOCOL_VERSION | 400 | No mutual protocol version |
| MODE_NOT_SUPPORTED | 400 | Mode not available or not registered |
| INVALID_SESSION_ID | 400 | Session ID format invalid |
| PAYLOAD_TOO_LARGE | 413 | Exceeds maximum payload size |
| RATE_LIMITED | 429 | Too many requests from this sender |
| UNKNOWN_POLICY_VERSION | 404 | Policy not found in registry |
| POLICY_DENIED | 403 | Commitment rejected by governance rules |
| INVALID_POLICY_DEFINITION | 400 | Policy fails schema validation |
| INTERNAL_ERROR | 500 | Unrecoverable runtime error |

Control plane error codes

The control plane adds its own error codes for orchestration-level failures — things the runtime does not know about because they happen in the layer above it:

| Code | When |
| --- | --- |
| RUN_NOT_FOUND | Run ID doesn't exist |
| INVALID_STATE_TRANSITION | Cannot transition run to requested state |
| RUNTIME_UNAVAILABLE | Cannot connect to runtime |
| RUNTIME_TIMEOUT | gRPC deadline exceeded |
| STREAM_EXHAUSTED | Max stream reconnection retries exceeded |
| SESSION_EXPIRED | Runtime session expired during run |
| KICKOFF_FAILED | Initial kickoff message rejected |
| MODE_NOT_SUPPORTED | Requested mode not available on runtime |
| CIRCUIT_BREAKER_OPEN | Runtime circuit breaker is open |
| MESSAGE_SEND_FAILED | Mid-session message send failed |

SDK exception hierarchy

Both SDKs wrap these error codes in typed exception hierarchies that make error handling in agent code clean and idiomatic.

TypeScript:

MacpSdkError (base)
├── MacpTransportError    — gRPC connection failure
├── MacpAckError          — Runtime NACK (carries ack.error.code)
├── MacpSessionError      — Session state violation
├── MacpTimeoutError      — Deadline exceeded
└── MacpRetryError        — All retries exhausted

Python:

MacpSdkError (base)
├── MacpAckError          — Runtime NACK (carries AckFailure)
├── MacpSessionError      — Session state violation
└── MacpTransportError    — gRPC failure
    ├── MacpTimeoutError  — Deadline exceeded
    └── MacpRetryError    — Retries exhausted

Error handling in practice

Back in our scenario, what happens if the cost optimizer tries to vote after the session has already been resolved? Here is how the SDKs handle it:

// TypeScript
try {
  await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
} catch (err) {
  if (err instanceof MacpAckError) {
    // Runtime rejected the message
    console.error(err.ack.error?.code); // 'SESSION_NOT_OPEN', 'FORBIDDEN', etc.
  } else if (err instanceof MacpTransportError) {
    // gRPC connection issue — may be retryable
  } else if (err instanceof MacpTimeoutError) {
    // Deadline exceeded
  }
}

# Python
try:
    session.vote(proposal_id="p1", vote="APPROVE")
except MacpAckError as e:
    print(f"Rejected: {e.failure.code}")  # 'SESSION_NOT_OPEN', etc.
except MacpTransportError:
    print("Connection failed")
except MacpTimeoutError:
    print("Deadline exceeded")

The error hierarchy is designed so that the most specific exceptions are caught first. A MacpAckError means the runtime understood the message but rejected it — you need to look at the error code to decide what to do. A MacpTransportError means the message may not have reached the runtime at all — retrying might make sense. This distinction matters for building resilient agent logic.
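The retry decision this enables can be sketched as follows. The exception classes below are minimal local stand-ins for the SDK types of the same name (defined here only so the sketch is self-contained); the retry loop itself is illustrative.

```python
# Non-normative sketch: retry transport failures (the runtime may never have
# seen the message), but surface a NACK immediately — it is a definitive verdict.
class MacpAckError(Exception):          # stand-in for the SDK class
    def __init__(self, code):
        self.code = code

class MacpTransportError(Exception):    # stand-in for the SDK class
    pass

def send_with_retry(send, attempts: int = 3):
    for attempt in range(attempts):
        try:
            return send()
        except MacpAckError:
            raise                       # runtime rejected it: retrying cannot help
        except MacpTransportError:
            if attempt == attempts - 1:
                raise                   # out of retries
```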


Rewinding Time: Replay and Determinism

We saved one of the most powerful features for near the end, because replay only makes sense once you understand everything that came before it.

MACP provides a structural replay guarantee: replaying identical accepted Envelope sequences under identical bound versions MUST reproduce identical state transitions (RFC-MACP-0003). This is not an aspirational goal — it is an invariant enforced by every design decision we have discussed so far: immutable version binding, pure policy evaluation, authoritative ordering, and runtime-derived sender identity.

The determinism boundary

The replay engine takes a small, well-defined set of inputs and guarantees that a specific set of outputs will be identical:

flowchart LR
    subgraph Inputs["Deterministic Inputs"]
        H["Accepted Envelope\nSequence"]
        MV["mode_version"]
        CV["configuration_version"]
        PV["policy_version"]
        PRT["macp_version"]
    end

    subgraph Outputs["Guaranteed Identical"]
        ST["State transitions"]
        AD["Accept/reject decisions"]
        TS["Terminal state\nRESOLVED or EXPIRED"]
        TM["Terminal message"]
    end

    Inputs --> F["Deterministic\nReplay Engine"]
    F --> Outputs

What is guaranteed

Given identical accepted envelope history and identical bound versions:

  • Session lifecycle transitions are identical
  • Within-session acceptance order is identical
  • Idempotent duplicate handling is identical
  • Terminal state (RESOLVED/EXPIRED) and terminal message are identical

What is NOT guaranteed — and why that is fine

Not everything can or should be deterministic. The protocol is honest about its boundaries:

  • Semantic outcomes — Mode-defined results (e.g., Task mode is structural-only; external side effects may differ)
  • Error message text — May vary between runtime versions
  • Cross-session ordering — Only within-session order is deterministic
  • External side effects — Application responsibility

Determinism classes by mode

Each mode has a determinism class that tells you exactly what replay guarantees you get:

| Mode | Class | Meaning |
| --- | --- | --- |
| Decision | Semantic-deterministic | Same history + versions = same outcome |
| Proposal | Semantic-deterministic | Same history + versions = same outcome |
| Task | Structural-only | State transitions guaranteed; execution results may differ |
| Handoff | Context-frozen | Deterministic only if bound context replayed exactly |
| Quorum | Semantic-deterministic | Same ballots + threshold = same quorum state |

Our deployment decision scenario uses Decision mode, which is semantic-deterministic. If you replay the same proposals, evaluations, and votes under the same mode, configuration, and policy versions, the outcome will always be "blue-green deploy, approved by majority." Always.

TTL determinism

Even time-based expiration is deterministic. Session TTL is computed from the SessionStart envelope's timestamp_unix_ms:

expiration = SessionStart.timestamp_unix_ms + ttl_ms

During replay, the pre-computed deadline from the original session is used — never wall-clock time. If TTL elapsed before a terminal condition was accepted, the session is EXPIRED. This means you can replay a session that originally ran for two minutes in two seconds, and the expiration logic still behaves correctly.
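The expiration formula is small enough to state directly. A sketch, with timestamps in milliseconds as in the Envelope:

```python
# Non-normative sketch of deterministic TTL: the deadline derives from the
# SessionStart envelope's own timestamp, never from wall-clock time.
def is_expired(session_start_ts_ms: int, ttl_ms: int, envelope_ts_ms: int) -> bool:
    """True if the envelope falls at or after the pre-computed deadline."""
    return envelope_ts_ms >= session_start_ts_ms + ttl_ms
```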

Replay via the control plane

The control plane supports three replay modes, each useful for different scenarios:

| Mode | Behavior |
| --- | --- |
| instant | All events emitted immediately |
| timed | Events replayed with proportional inter-event timing (speed multiplier supported) |
| step | Events emitted one at a time on request |

POST /runs/:id/replay        — Start replay session
GET  /runs/:id/replay/stream — SSE of replayed events
GET  /runs/:id/replay/state  — Projection at specific sequence number

The timed mode is particularly useful for post-mortems — you can watch a coordination session unfold at 10x speed, seeing exactly when each agent acted and how long deliberation took. The step mode is a debugger's best friend: advance one event at a time and inspect the projection at each step.
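The timing math behind timed mode is simple: scale each original inter-event gap by the speed multiplier. A sketch under that assumption:

```python
# Non-normative sketch of 'timed' replay scheduling: delays between emitted
# events are the original gaps divided by the speed multiplier.
def replay_delays_ms(timestamps_ms: list, speed: float) -> list:
    """Delay to wait before emitting each event; the first is immediate."""
    delays = [0.0]
    for prev, cur in zip(timestamps_ms, timestamps_ms[1:]):
        delays.append((cur - prev) / speed)
    return delays
```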

Storage and crash recovery

Here is an elegant detail: the same determinism guarantee that serves replay also serves crash recovery. The runtime uses an append-only log per session. On startup, sessions are recovered by replaying their logs through the same deterministic state machine. Crash recovery is just replay with a different trigger.

Storage backends:

  • FileBackend — session.json + log.jsonl per session (default)
  • RocksDB — Embedded key-value store
  • Redis — Shared storage for multi-instance deployments
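Recovery from the file backend amounts to folding the per-session log through the state machine. The sketch below assumes one JSON envelope per line (as log.jsonl suggests) and takes the apply function as a parameter; both shapes are illustrative.

```python
# Non-normative sketch: crash recovery is replay with a different trigger.
import json

def recover_session(log_jsonl: str, apply):
    """Rebuild session state by replaying every logged envelope in order."""
    state = {"state": "OPEN", "seen_ids": set()}
    for line in log_jsonl.splitlines():
        if line.strip():
            state = apply(state, json.loads(line))
    return state
```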

Putting It All Together

Let us return one last time to our three agents — the architect, the security reviewer, and the cost optimizer — and trace their deployment decision through the entire system, from the first connection to the final commit.

sequenceDiagram
    participant UI as API Consumer
    participant CP as Control Plane
    participant RT as Runtime
    participant A as Agent A (SDK)
    participant B as Agent B (SDK)

    Note over UI,B: 1. Initialization
    A->>RT: Initialize
    B->>RT: Initialize
    UI->>CP: POST /runs (ExecutionRequest)

    Note over UI,B: 2. Session Creation
    CP->>RT: SessionStart (gRPC stream)
    RT->>RT: Validate, create session OPEN
    RT-->>CP: Ack (session bound)
    CP->>RT: Kickoff messages

    Note over UI,B: 3. Coordination
    A->>RT: Proposal
    RT-->>A: Ack
    RT-->>CP: Accepted envelope (stream)
    CP-->>UI: SSE canonical_event
    B->>RT: Vote
    RT-->>B: Ack
    RT-->>CP: Accepted envelope (stream)

    Note over UI,B: 4. Resolution
    A->>RT: Commitment
    RT->>RT: Policy evaluation (pure function)
    RT->>RT: Session → RESOLVED
    RT-->>A: Ack (session_state=RESOLVED)
    RT-->>CP: Session resolved
    CP->>CP: Normalize + project + persist
    CP-->>UI: SSE run.completed

    Note over UI,B: 5. Observability
    UI->>CP: GET /runs/:id/state
    CP-->>UI: RunStateProjection
    UI->>CP: GET /runs/:id/traces
    CP-->>UI: OpenTelemetry spans

The agents connected and negotiated capabilities. The control plane opened a session with majority voting policy. The architect proposed blue-green deployment. The security reviewer evaluated it favorably. The cost optimizer voted to approve. The architect committed the decision, the runtime evaluated the majority policy and confirmed it, and the session resolved.

Every step was authenticated. Every message was validated through the admission pipeline. Every event was persisted, normalized, and projected for the UI. The entire session can be replayed — instantly, at speed, or step by step — and will always produce the same outcome. Distributed traces connect every span from the UI through the control plane into the runtime. Audit logs capture every authentication attempt, every authorization decision, every state transition.

This is MACP's core promise: when autonomous agents need to produce one binding outcome, the protocol specification defines the rules, the runtime enforces them, the control plane orchestrates and observes, and the SDKs give agents a typed interface to participate. All four layers working together to turn a chaotic multi-agent conversation into a structured, auditable, replayable coordination process with a single authoritative result.
