Causal meters: per-node cost attribution for agent workloads

Yethikrishna R · Mynd Labs Runtime Group · 2026-02-11

Abstract

Agent runs spend money in ways invoices cannot explain: a retry storm in one tool adapter shows up as a vague increase in monthly model spend. We describe the runtime's cost-attribution scheme, which meters every token, tool call, and retry at the execution-graph node that caused it — distinguishing intrinsic cost from induced cost — and aggregates the result into per-feature unit economics that survive concurrency and caching.

Why run-level accounting fails

Billing a whole run to the feature that started it hides two distinct failures. Shared work is miscounted: a context fetch reused by three nodes appears in full in three features' costs, or in none. And induced cost is misassigned: when a flaky adapter forces upstream re-planning, the planner's extra tokens get billed to planning, though the adapter caused them.

The meter

Because every run is a deterministic graph (R-001), attribution can be structural. Each edge carries a meter recording direct spend — tokens, tool fees, wall-clock — and a causal tag naming the node whose demand created the edge. Retries carry the tag of the failing node, not the retrying one. Shared sub-graphs are metered once and amortised across their consumers by marginal use, so caching shows up as falling attributed cost rather than accounting noise.
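The amortisation step can be sketched as follows. This is a minimal illustration, not the runtime's implementation: it assumes "marginal use" means each consumer's count of reads of the shared sub-graph, and it works in integer micro-dollars so the shares sum exactly to the metered cost. The function name `amortise` and the node names in the example are hypothetical.

```python
def amortise(cost_micro_usd: int, uses: dict[str, int]) -> dict[str, int]:
    """Split one metered cost across consumers in proportion to their use counts.

    Assumption: "marginal use" is modelled as a simple read count per consumer.
    Uses the largest-remainder method so integer shares sum exactly to the cost.
    """
    total_uses = sum(uses.values())
    if total_uses <= 0:
        raise ValueError("a shared sub-graph needs at least one recorded use")
    # Floor share for each consumer.
    shares = {c: cost_micro_usd * n // total_uses for c, n in uses.items()}
    leftover = cost_micro_usd - sum(shares.values())
    # Hand leftover micro-dollars to the largest fractional remainders first,
    # breaking ties by consumer name so the split is deterministic.
    by_remainder = sorted(
        uses, key=lambda c: (-(cost_micro_usd * uses[c] % total_uses), c)
    )
    for consumer in by_remainder[:leftover]:
        shares[consumer] += 1
    return shares


# A context fetch metered once at 1000 micro-dollars and read by three nodes.
three_way = amortise(1000, {"plan": 1, "brief": 1, "cal_adapter": 1})
# The same fetch once a fourth consumer hits the cache: every share falls.
four_way = amortise(1000, {"plan": 1, "brief": 1, "cal_adapter": 1, "digest": 1})
```

Under these assumptions each consumer's share drops from roughly a third to a quarter of the fetch when a fourth reader appears, which is the sense in which caching shows up as falling attributed cost.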

meter run-84f2
  node          direct    induced   total
  plan          $0.0041   $0.0007   $0.0048
  cal_adapter   $0.0002   $0.0119   $0.0121  ← retry storm
  brief         $0.0089   —         $0.0089
  charge(feature=morning-brief) = Σ = $0.0258
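A roll-up of the kind shown in the meter output can be sketched like this. The record type `EdgeMeter`, the function `attribute`, and the edge values are hypothetical and do not reproduce run-84f2; the sketch only assumes what the text states, namely that each edge records direct spend plus a causal tag, and that spend forced by a failing node carries that node's tag.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeMeter:
    spender: str    # node where the spend physically occurred
    cause: str      # causal tag: node whose demand or failure created this edge
    micro_usd: int  # direct spend on the edge, in millionths of a dollar


def attribute(edges):
    """Roll edge meters up into per-node direct and induced cost, keyed by causal tag."""
    rows = defaultdict(lambda: {"direct": 0, "induced": 0})
    for edge in edges:
        # Spend a node causes inside itself is direct; spend it forces on
        # another node (for example a retry or a re-plan) is induced.
        kind = "direct" if edge.spender == edge.cause else "induced"
        rows[edge.cause][kind] += edge.micro_usd
    return {
        node: {**row, "total": row["direct"] + row["induced"]}
        for node, row in rows.items()
    }


# Hypothetical run: the adapter fails twice and each failure forces a re-plan.
edges = [
    EdgeMeter("plan", "plan", 4000),
    EdgeMeter("cal_adapter", "cal_adapter", 200),
    EdgeMeter("plan", "cal_adapter", 3000),  # re-plan after first adapter failure
    EdgeMeter("plan", "cal_adapter", 3000),  # re-plan after second adapter failure
    EdgeMeter("brief", "brief", 9000),
]
rows = attribute(edges)
feature_charge = sum(row["total"] for row in rows.values())
```

In this sketch the planner physically spends 10000 micro-dollars but is attributed only 4000; the remaining 6000 lands on the adapter as induced cost, while the feature charge is unchanged by the reassignment.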

What it changed

The first month of causal metering reordered our optimisation queue. The most expensive feature by invoice was the cheapest by intrinsic cost — its spend was induced by one adapter's timeout policy, fixed in an afternoon for a 38% reduction. Per-feature unit economics now appear in the same monthly review as retention, which is the point: an ecosystem sequenced on margin gates needs cost numbers as trustworthy as its usage numbers.

Open problem: attribution across speculative execution, where the runtime races two plans and discards one. Charging abandoned work to the feature is honest but discourages a useful optimisation; we currently book it to a runtime overhead account and are not satisfied with that answer.
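The current booking rule for speculative execution can be stated in a few lines. This is a sketch of the policy as described above, not of the runtime's code: the function `book_speculation`, the account name `runtime_overhead`, and the plan costs are all hypothetical.

```python
def book_speculation(
    feature: str,
    plan_costs: dict[str, int],
    winner: str,
    overhead_account: str = "runtime_overhead",
) -> dict[str, int]:
    """Charge the winning plan to the feature and discarded plans to overhead.

    Costs are in micro-dollars. The overhead account name is an assumption.
    """
    if winner not in plan_costs:
        raise KeyError(f"winner {winner!r} is not one of the raced plans")
    ledger = {feature: 0, overhead_account: 0}
    for plan, cost in plan_costs.items():
        account = feature if plan == winner else overhead_account
        ledger[account] += cost
    return ledger


# Two plans race; plan_a wins and plan_b is discarded.
ledger = book_speculation(
    "morning-brief", {"plan_a": 4000, "plan_b": 3500}, winner="plan_a"
)
```

The alternative policy discussed above, charging abandoned work to the feature, would amount to booking every plan to `feature` regardless of which one wins.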

cite as: Mynd Labs Research Note R-003 (2026)

Next note

[R-002] Answering without exporting: a query model for context-graph privacy