Every Publish Runs a Proof
We just made formal verification a hard gate on the AgentiCraft marketplace. Five checks catch typos, workflow deadlocks, disconnected topologies, and missing privilege attenuation before any app reaches a user.
A few months ago we released agenticraft-foundation — a standalone Python library that proves multi-agent protocols correct using process algebra, multiparty session types, and spectral graph theory. Apache 2.0, ~1,400 tests, real research backing.
It sat in a package. Anyone could install it. Almost nobody would.
That's the gap between having a verification library and using one. The same gap that separates a paper from a product. We just closed it on our side: every app published to the AgentiCraft marketplace now passes through the foundation's verification engine before the tarball ships. No flag to enable. No CI hook to wire. No reason to skip it. The gate runs.
This post is about what that means in practice.
The Bug a Schema Can't Catch
Strict schemas catch a lot. Our app.yaml rejects unknown top-level keys, validates field types, and ensures every required name is present. A typo like descripton: fails at parse time. A missing kind: fails at parse time. That class of bug never reaches review.
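That behavior is cheap to get from a strict pydantic model. The sketch below is illustrative only (pydantic v2 assumed; it is not the marketplace's actual schema): extra="forbid" is what turns an unknown or misspelled key into a parse-time error.

from pydantic import BaseModel, ConfigDict, ValidationError

class AgentSpec(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown keys are rejected outright
    id: str
    description: str = ""

try:
    AgentSpec(id="planner", descripton="plans the work")  # misspelled field name
except ValidationError as err:
    print(err)  # pydantic reports "Extra inputs are not permitted" before review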
What schemas can't catch is coordination. Here's an app.yaml that loads cleanly, passes every pydantic validator, satisfies the marketplace quality gate, and is structurally broken in a way that only manifests at runtime:
agents:
  - id: planner
  - id: builder
  - id: tester

topology:
  connections:
    - from: planner
      to: builder
    - from: builder
      to: tester
    - from: tester
      to: planer   # ← typo: missing 'n'

A schema accepts this. The string "planer" is a valid string. The runtime won't notice until the tester tries to hand off and planer doesn't resolve — by which point you have a half-completed workflow, dangling state, and a user wondering why the deploy hung.
Now multiply that across agent IDs, workflow step references, broadcast group members, trust-boundary listings, and depends_on clauses. A reasonably-sized app manifest has dozens of internal references. Catching all of them by hand is exactly the kind of work a verifier should be doing.
That's the easy case. The hard case is when nothing is misspelled, every reference resolves, every schema check passes — and the protocol still deadlocks under load. The library exists for that. The gate is what makes it count.
What the Gate Does
craft push now has a new step between the quality gate and the tarball:
Quality gate passed
VERIFY INFO: topology.connected: Graph is disconnected (lambda_2=0.0000, coordination_mode=orchestrated)
VERIFY WARN: topology.fault_tolerant: Topology has 5 articulation point(s)
Formal verification passed (5 checks, 1 warning, 24.6 ms)
Five checks run on every manifest. Each takes single-digit milliseconds. Each is grounded in a primitive from the foundation library — no heuristics, no fuzzy matching, no LLM-judge tricks. Let me walk through them.
references.integrity
Every agent ID referenced anywhere must resolve to a defined agent. Topology connections, broadcast group members, trust boundary listings, workflow step agent: fields, parallel-step agents: lists, workflow depends_on clauses. The check enumerates every reference site and asserts existence. The "planer" typo above fires here with a clear message:
references.integrity: 1 reference integrity violation(s)
topology.connections[2].to includes 'planer' which is not a defined agent
Severity: error. A typo in a reference is never a "feature." The gate blocks.
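A minimal sketch of the idea (not the foundation's implementation): collect the set of defined agent IDs, then walk every reference site and report anything that does not resolve. Only topology connections are shown here; the real check also covers groups, trust boundaries, and workflow steps.

def check_references(manifest: dict) -> list[str]:
    defined = {agent["id"] for agent in manifest.get("agents", [])}
    violations = []
    connections = manifest.get("topology", {}).get("connections", [])
    for i, conn in enumerate(connections):
        for side in ("from", "to"):
            agent = conn.get(side)
            if agent is not None and agent not in defined:
                violations.append(
                    f"topology.connections[{i}].{side} references '{agent}', "
                    "which is not a defined agent"
                )
    return violations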
topology.connected
Build the agent communication graph from topology.connections, compute the Laplacian, look at the second-smallest eigenvalue. If λ₂ = 0, the graph is disconnected — some agents can't reach others through any chain of declared edges.
This is where it gets interesting. A disconnected graph isn't always a bug. Some apps coordinate every agent through an external controller (a Telegram bot, a CLI, a UI adapter) and the agents themselves never talk to each other. Their "topology" is intentionally a set of isolated nodes. For others — say a research pipeline with a planner, retriever, analyst, and writer all chained through A2A messaging — a disconnected graph is a real partition.
The verifier shouldn't guess which case it's looking at. We added a field for the operator to declare it: topology.coordination_mode: orchestrated | a2a | hybrid. The verifier then sets the severity:
a2a → disconnected = error
hybrid → disconnected = warning
orchestrated (default) → disconnected = info
Declared intent over verifier inference. Same primitive (Laplacian connectivity), three different verdicts depending on what the app says it's doing.
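As a rough sketch of the primitive (numpy-based, at least two agents assumed; the foundation's internals may differ): build the Laplacian of the undirected communication graph, read off the second-smallest eigenvalue, and map a zero value to the severity the declared mode calls for.

import numpy as np

def algebraic_connectivity(agents, edges):
    """lambda_2 of the graph Laplacian; > 0 iff the graph is connected."""
    idx = {a: i for i, a in enumerate(agents)}
    n = len(agents)
    adjacency = np.zeros((n, n))
    for a, b in edges:
        adjacency[idx[a], idx[b]] = adjacency[idx[b], idx[a]] = 1.0
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return float(np.linalg.eigvalsh(laplacian)[1])  # eigenvalues come back sorted ascending

SEVERITY = {"a2a": "error", "hybrid": "warning", "orchestrated": "info"}

def connectivity_finding(lambda_2, mode="orchestrated"):
    if lambda_2 > 1e-9:
        return None  # connected: nothing to report
    return SEVERITY.get(mode, "info")  # disconnected: severity follows declared intent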
topology.fault_tolerant
An articulation point is a node whose removal increases the number of connected components in the graph. In a hub-and-spoke topology, the hub is an articulation point: lose it and every spoke becomes isolated. Tarjan's algorithm finds them in linear time.
This check fires as a warning, not an error. Hub-and-spoke is a legitimate design — it's how a lot of real systems work, including AgentiCraft's own Personal Mesh, where a briefing agent collects from the financial and calendar agents. The check surfaces the trade-off so the operator can confirm: "yes, I know losing the hub disconnects everything; that's the design." If you want it to block, run with --strict.
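A sketch of the check using networkx (an assumption for brevity; the foundation may implement Tarjan's algorithm directly rather than depend on networkx):

import networkx as nx

def articulation_findings(agents, edges):
    graph = nx.Graph()
    graph.add_nodes_from(agents)
    graph.add_edges_from(edges)
    cut_vertices = sorted(nx.articulation_points(graph))  # Tarjan-style, linear time
    if not cut_vertices:
        return []
    return [f"Topology has {len(cut_vertices)} articulation point(s): "
            f"{', '.join(cut_vertices)}; removing any one disconnects the graph."]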
topology.privilege_flow
When two agents live in different trust boundaries, a connection between them is a capability transfer. Capability-security literature calls this attenuation: the higher-privileged side must explicitly declare which capabilities are revoked when invoking the lower-privileged side, preventing privilege escalation by association.
Our schema has a privilege_attenuation field on TopologyConnection for exactly this purpose. The check walks every connection, looks up the trust boundary of each end, and if they differ, asserts that privilege_attenuation is non-empty. Severity: error. A cross-boundary connection without declared attenuation is a security review failing in slow motion.
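A minimal sketch of that walk (field names mirror the ones the post mentions; this is not the foundation's API):

def check_privilege_flow(connections, boundary_of):
    """boundary_of maps agent id -> trust boundary name."""
    errors = []
    for i, conn in enumerate(connections):
        src, dst = conn["from"], conn["to"]
        crosses_boundary = boundary_of.get(src) != boundary_of.get(dst)
        if crosses_boundary and not conn.get("privilege_attenuation"):
            errors.append(
                f"topology.connections[{i}] crosses trust boundaries "
                f"({src} -> {dst}) without declared privilege_attenuation"
            )
    return errors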
workflow.deadlock_free
This is the one the foundation was built for. Each workflow is translated into a CSP process — pipeline patterns become sequential composition, fan-out-then-synthesize becomes parallel composition synchronized on completion, depends_on clauses become precedence DAGs, parallel step types expand into sub-process composition. The foundation builds the labeled transition system, exhaustively explores every reachable state, and asserts that none of them is a deadlock.
This is not "we ran some test cases." Every possible execution path through the workflow's coordination shape is checked. If a state exists where no agent can make progress, the check returns it, complete with the trace that led there. Severity: error.
For Personal Mesh's morning briefing — a pipeline of gather-expenses → gather-calendar → synthesize — the verifier explores a 5-state LTS in under a millisecond and certifies deadlock-freedom. For a more complex fan-out workflow, the LTS grows combinatorially. The foundation caps exploration at 10,000 states; beyond that it tells you the bound was hit so you know the certification is partial.
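The exploration loop itself is simple to picture. The toy sketch below shows exhaustive reachability with a state bound; the real work in the foundation is the CSP translation that produces the transition function, which this sketch simply takes as a given.

from collections import deque

def find_deadlock(initial, transitions, is_terminal, max_states=10_000):
    """transitions(state) -> list of successor states; returns a deadlock trace or None."""
    seen, queue = {initial}, deque([(initial, [initial])])
    while queue:
        state, trace = queue.popleft()
        successors = transitions(state)
        if not successors and not is_terminal(state):
            return trace  # stuck and not complete: deadlock, with the trace that led there
        for nxt in successors:
            if nxt not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError("state bound hit; certification is partial")
                seen.add(nxt)
                queue.append((nxt, trace + [nxt]))
    return None  # every reachable state can progress or is terminal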
What Operator-Declared Intent Buys You
The coordination_mode decision deserves a longer note because it's the design pattern this whole gate rests on.
A verifier without declared intent has two failure modes. Either it imposes a coordination model the operator didn't ask for (which makes the gate a nuisance you route around), or it accepts everything by guessing benignly (which makes the gate cosmetic). Neither is a gate worth running.
The fix is to make the intent part of the manifest. The operator says what coordination shape this app has; the verifier checks whether the rest of the manifest matches that declaration. A disconnected graph in orchestrated mode is correct. The same shape in a2a mode is a bug. Same check, same primitive, opposite verdict — because the contract is different.
This generalizes. We'll add the same shape for topology.groups[] (broadcast protocols, when the schema gains a publisher role declaration), for workflow liveness (CTL AF(complete) becomes meaningful when iterative workflows ship max_iterations bounds), and for strict capability typing on mcp_clients when the MCP ecosystem stabilizes its schema. The pattern is the same: the manifest declares; the verifier checks.
Strict Mode for Enterprise Gating
The default report passes if every error-severity check passes. Warnings are informational — articulation points, structural concerns, mode mismatches in hybrid configurations.
For enterprise deployments where structural concerns should block, verify() takes a strict=True parameter. The CLI mirrors it:
python -m agenticraft_foundation app.yaml --strict

Same checks, same severity labels on each check. Only the report verdict flips: any failed warning fails the report. The operator can opt into the stricter gate without changing the schema or the checks.
We're using this internally for the marketplace's enterprise channel — apps published there pass --strict. Apps published to the general channel pass the relaxed gate. Same engine, two policy regimes.
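In code, the same policy split might look like the sketch below. The verify() entry point and its strict=True parameter come from the post; the import path and the report attributes are assumptions, not a documented API.

from agenticraft_foundation import verify  # import path assumed

report = verify("app.yaml", strict=True)   # strict: failed warnings fail the report
if not report.passed:                      # attribute name assumed
    raise SystemExit(1)                    # block the publish in this channel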
A Real Publish, End to End
Here's Personal Mesh — our reference 5-agent app — running through craft push:
Quality gate passed
VERIFY INFO: topology.connected: Graph is disconnected (lambda_2=0.0000,
coordination_mode=orchestrated). 2 agent(s) have no declared
edges: grocery, health.
VERIFY WARN: topology.fault_tolerant: Topology has 5 articulation point(s):
removing any disconnects the graph. Acceptable for hub-and-
spoke designs; flag for fault-tolerant ones.
Formal verification passed (5 checks, 1 warning, 24.6 ms)
The verifier surfaces two real facts. The grocery and health agents have no declared A2A edges with anyone — they're driven independently by the bot, which is correct under orchestrated mode and shows as info. The briefing agent is a hub, which means the design has multiple articulation points — accepted for this app, flagged for review. The publish proceeds.
A future version of this app where briefing is replaced by a peer collaboration pattern would have to opt into a2a mode, at which point the disconnected agents become errors and the gate blocks until the topology is closed.
That's what a gate is supposed to do. Tell you what's in your manifest, what it implies, and whether you've declared the right thing.
Why This Is Structurally Hard to Copy
Agent frameworks proliferate. Within twelve months, every major framework will have observability, cost tracking, multi-provider routing, marketplace primitives. Those are well-understood problems with clear engineering solutions.
Formal verification is different. The library underneath the gate has been research-active for ~40 years across academia and a handful of industrial verification groups. Process algebra, multiparty session types, model checking, spectral graph theory — these are not pieces you pick up in a sprint. Building them from scratch is a multi-year investment for a team with the right background; using them well requires understanding what they prove and what they don't.
We packaged ours as Apache 2.0 (github.com/agenticraft/agenticraft-foundation) because we think the field is better off with formal methods in agent infrastructure becoming table stakes. The library is the easy part to share. The infrastructure pattern — making it a hard gate, with operator-declared intent, with severity that escalates structurally rather than arbitrarily — is what makes it usable, and that takes opinion as much as engineering.
Try It
The library is on PyPI:
uv add agenticraft-foundation   # or: pip install agenticraft-foundation

The CLI verifies a manifest from the command line:
pip install agenticraft-foundation[cli]
python -m agenticraft_foundation app.yaml # human-readable
python -m agenticraft_foundation app.yaml --strict # enterprise gate
python -m agenticraft_foundation app.yaml --json     # for CI log processors

Exit codes are conventional: 0 passed, 1 failed verification, 2 input error. The full check matrix and contract tests live in the foundation repo.
If you publish to the AgentiCraft marketplace, the gate runs automatically. You don't have to think about it. That's the point.