Now onboarding a small group of design-partner teams
An independent auditor for AI spend

You can already see what your AI costs. Skalpel shows what it produced — and what it can't.

The independent, cross-vendor auditor of AI spend. Built to reconcile your spend to the provider invoice, tie it to real outcomes — cost per merged PR, per resolved ticket — and show the share it honestly can't.

“What did all that AI spend actually produce — per merged PR, per resolved ticket, net of what it cost, across all our vendors, at a confidence I can defend in a board meeting?”
Reconciliation
Anthropic $48,210.18 $48,955.40
OpenAI $21,740.06 $21,512.88
Bedrock $9,330.51 $9,401.20
Target ± 2% not yet validated against a live invoice
Attribution ρ∅ 0.25
Verified 41% Inferred 34% Unattributed 25%

By construction, the verified, inferred, and unattributed shares always sum to the full provider-billed invoice, enforced in code. The unattributed slice is shown full-size: never hidden, never rounded away.

The problem

You can invoice the spend. You can't defend the return.

Every company now spends real money on LLMs — usually across several providers at once, and AI spend has only now crossed the line into something a board asks about. Three facts leave you unable to answer for the spend:

01 Scattered

The spend is real, multi-vendor, and growing — but scattered.

Each provider has its own dashboard. Nothing ties them together, or to your own results.

02 Commoditized

Seeing the cost is already a solved, commoditized problem.

Every observability and cloud-cost tool can show “this API call cost $0.04.” That's table stakes — Skalpel never leads with it.

03 Dishonest

Seeing the return is unsolved — and what passes for solving it is dishonest.

Competitors round to 100% and dress a guess up as a fact. No one — us included — has clean causal proof, so the honest move is to show the share you can't attribute, not bury it.

How it works

Two layers on one reconciled foundation — and an honest remainder on top.

Variance to invoice
Δ 0.0% within the ± 2% tolerance band
01

Reconcile the spend

Meter AI usage across providers and reconcile it to the actual provider invoice within a tight tolerance (target ± 2%, validated per pilot). “Reconcile” in the accounting sense: tie the measured number out to the authoritative bill, not estimate it from logs. Built to be finance-grade — and by design, a dollar figure isn't meant to render until a reconciliation passes.

Cost per outcome
$0 / merged PR
reconciled spend ÷ shipped outcomes
02

Attribute to outcomes

Designed to connect reconciled spend to real business outcomes — starting with merged PRs and resolved tickets — at team and workflow granularity. Never at the individual-employee level; that line is non-negotiable.

Attribution ρ∅ 0.00
Verified Inferred Unattributed — shown in full
03

Show what you can't explain

Every number is tagged verified or inferred and the two are held apart. The share that can't be honestly attributed is shown in full — a mandatory unattributed bucket, not a rounding error.

Why this compounds

A number that survives the board and the auditor — because it shows you its own limits.

Every other tool selling “AI ROI” rounds to 100% and hands you a guess dressed as a fact. Skalpel won't. Two guarantees are enforced in code, not promised in a footnote:

Verified and inferred can never blend.

Every figure is typed at the source — a closed chain of shared identifiers, or an inferred link — and the two are held apart in code, not by a disclaimer. No one can quietly average a guess and a fact into a rounder, prettier total.

What it can't explain stays on screen.

The exact share Skalpel can't honestly attribute is shown in full — never rounded away, never buried under “other.” A tool that claims to explain all your AI spend is lying; this one shows you the gap — the most honest number on the page.

// the honest version

We looked at eleven tools selling “AI ROI.” Not one is honest about what it can't see:

  • No reconciliation tolerance to the invoice
  • No line between a measured number and a guessed one
  • No honest remainder — they round to 100% by construction

And it compounds: the cross-vendor reconciliation graph gets more accurate — and more switching-costly — every invoice cycle.

Numbers built to look complete. This one's built to be defensible.

± 2% Anthropic OpenAI Bedrock merged PR resolved ticket one reconciliation graph across every vendor
What you'll get

The board-defensible numbers a cost dashboard can't produce.

Verified
$0 / merged PR

Cost per merged PR / resolved ticket

Reconciled spend over real shipped outcomes — the headline ratio. Target tier, wired when the §16 connecting step lands.

Inferred
+$0k net of cost · estimated

Net-of-cost value (estimated)

Reconciled spend subtracted from the value credited to it — reported as a wide-banded estimate, with its assumptions shown.

Inferred
payments
web
infra
data

Low-yield hotspots

Where AI spend runs ahead of the outcomes it's credited with — the places to look first.

Inferred
frontier 52% over-provisioned

Model-mix efficiency

Where you're paying for a frontier model on a task a cheaper one would close.

Inferred
eng
data
ops
web

AI-leverage by team

The share of shipped work that was AI-assisted, at team and workflow granularity.

Inferred
0% of loaded team cost

AI spend as a share of loaded team cost

The comparison the board actually asks for, per team — reported as an estimate, with its assumptions shown.

Who it's for

API-first teams with real, multi-vendor AI spend.

  • You build on LLMs — direct API keys and/or Bedrock, Vertex, Azure OpenAI — not just chat seats.
  • AI spend is material and multi-vendor — roughly $10k–$1M+/month across two or more providers (or adding a second) — i.e. large enough that the number has to be defensible, not estimated.
  • Someone internal has been asked “what's our AI ROI / where's the waste” and couldn't answer cleanly.
The buyer

The buyer is whoever owns AI spend and its return — Head of Engineering / Platform / Infra, a CTO or technical co-founder, an emerging FinOps-for-AI owner, sometimes a Finance partner handed the AI-cost question.

Not a fit (yet)

Not a fit (yet): seat-only orgs with no API keys; single-vendor with trivial spend; pre-product with nothing to attribute.

Pilot

Put your AI spend through an independent reconciliation.

We're onboarding a small set of design-partner teams. Join the waitlist, or book a 20-minute pilot conversation — a candid look at how you reconcile and defend your AI spend today, not a pitch.

Book a pilot call

No spam. We email once, when your pilot slot opens.