An independent auditor for AI spend

You can already see what your AI costs. Skalpel shows what it produced, and the share it can't attribute.

The independent, cross-vendor auditor of AI spend. Built to reconcile your spend to the provider invoice, tie it to real outcomes (cost per merged PR, per resolved ticket), and show the share it honestly can't attribute.

Join the waitlist · Book a pilot call

“What did all that AI spend actually produce — per merged PR, per resolved ticket, net of what it cost, across all our vendors, at a confidence I can defend in a board meeting?”

Reconciliation

Anthropic	$48,210.18	$48,955.40
OpenAI	$21,740.06	$21,512.88
Bedrock	$9,330.51	$9,401.20

Target: ± 2% (not yet validated against a live invoice)

Attribution ρ∅ 0.25

Verified 41% Inferred 34% Unattributed 25%

By construction, the verified, inferred, and unattributed shares always sum to the full provider-billed invoice, enforced in code. The unattributed slice is shown full-size: never hidden, never rounded away.
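One way to read "by construction" is that the unattributed bucket is never measured at all: it is whatever is left of the invoice once verified and inferred dollars are subtracted. The sketch below illustrates that reading only. The function name `split_invoice` and all dollar figures are hypothetical, not Skalpel's actual code or data.

```python
from decimal import Decimal


def split_invoice(invoice_total: Decimal, verified: Decimal, inferred: Decimal) -> dict:
    """Split an invoice into verified, inferred, and unattributed dollars.

    The unattributed bucket is derived as the remainder rather than measured,
    so the three buckets sum to the invoice by construction.
    """
    if verified < 0 or inferred < 0:
        raise ValueError("attributed amounts cannot be negative")
    if verified + inferred > invoice_total:
        # Attributing more than was billed would force a negative remainder.
        raise ValueError("attributed spend exceeds the invoice")
    unattributed = invoice_total - verified - inferred
    return {"verified": verified, "inferred": inferred, "unattributed": unattributed}


# Made-up figures, chosen only to illustrate the invariant.
buckets = split_invoice(Decimal("1000.00"), Decimal("410.00"), Decimal("340.00"))
assert sum(buckets.values()) == Decimal("1000.00")
```

Because the remainder is computed rather than entered, there is no code path that can shrink it without also changing a verified or inferred figure.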

The problem

You can invoice the spend. You can't defend the return.

Every company now spends real money on LLMs, usually across several providers at once, and AI spend has only recently become something a board asks about. Three facts leave you unable to answer for it:

01 Scattered

The spend is real, multi-vendor, and growing — but scattered.

Each provider has its own dashboard. Nothing ties them together, or to your own results.

02 Commoditized

Seeing the cost is already a solved, commoditized problem.

Every observability and cloud-cost tool can show “this API call cost $0.04.” That's table stakes — Skalpel never leads with it.

03 Dishonest

Seeing the return is unsolved — and what passes for solving it is dishonest.

Competitors round to 100% and dress a guess up as a fact. No one — us included — has clean causal proof, so the honest move is to show the share you can't attribute, not bury it.

How it works

Two layers on one reconciled foundation — and an honest remainder on top.

Variance to invoice

Δ 0.0% within the ± 2% tolerance band

Reconcile the spend

Meter AI usage across providers and reconcile it to the actual provider invoice within a tight tolerance (target ± 2%, validated per pilot). “Reconcile” in the accounting sense: tie the measured number out to the authoritative bill, not estimate it from logs. Built to be finance-grade — and by design, a dollar figure isn't meant to render until a reconciliation passes.
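As a rough illustration of the tolerance check described above, the sketch below compares a metered total to the invoiced total and withholds the dollar figure unless the variance falls inside the band. It assumes the ± 2% target is applied symmetrically and that the invoiced amount is the figure displayed. The names `variance_to_invoice`, `reconciles`, and `render_spend`, and the sample amounts, are hypothetical.

```python
from decimal import Decimal
from typing import Optional

# The page's ± 2% target. Assumption: applied symmetrically around the invoice.
TOLERANCE = Decimal("0.02")


def variance_to_invoice(metered: Decimal, invoiced: Decimal) -> Decimal:
    """Signed variance of metered spend relative to the authoritative invoice."""
    if invoiced <= 0:
        raise ValueError("invoiced amount must be positive")
    return (metered - invoiced) / invoiced


def reconciles(metered: Decimal, invoiced: Decimal, tolerance: Decimal = TOLERANCE) -> bool:
    """True when the metered number ties out to the invoice within tolerance."""
    return abs(variance_to_invoice(metered, invoiced)) <= tolerance


def render_spend(metered: Decimal, invoiced: Decimal) -> Optional[str]:
    """Return a display string only after reconciliation passes, else None."""
    if not reconciles(metered, invoiced):
        return None
    return f"${invoiced:,.2f}"
```

With these made-up inputs, `render_spend(Decimal("9800"), Decimal("10000"))` yields a figure because the variance is exactly 2%, while `render_spend(Decimal("9700"), Decimal("10000"))` returns `None`.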

Cost per outcome

$0 / merged PR

reconciled spend ÷ shipped outcomes

Attribute to outcomes

Designed to connect reconciled spend to real business outcomes — starting with merged PRs and resolved tickets — at team and workflow granularity. Never at the individual-employee level; that line is non-negotiable.
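The headline ratio, reconciled spend divided by shipped outcomes, can be sketched at team granularity as follows. This is an illustration under stated assumptions: the function `cost_per_outcome`, the choice to return no figure when a team shipped nothing, and every number in the example are hypothetical.

```python
from decimal import Decimal
from typing import Optional


def cost_per_outcome(reconciled_spend: Decimal, outcomes: int) -> Optional[Decimal]:
    """Reconciled spend divided by shipped outcomes, rounded to cents.

    Returns None when there are no outcomes, so a zero denominator shows up
    as "no figure" instead of a misleading number.
    """
    if outcomes < 0:
        raise ValueError("outcome count cannot be negative")
    if outcomes == 0:
        return None
    return (reconciled_spend / outcomes).quantize(Decimal("0.01"))


# Team-level inputs only, never per individual. All figures are made up.
teams = {
    "payments": (Decimal("12000.00"), 48),  # (reconciled spend, merged PRs)
    "infra": (Decimal("3000.00"), 0),
}
per_team = {team: cost_per_outcome(spend, merged) for team, (spend, merged) in teams.items()}
```

Keying the input by team rather than by person keeps the granularity rule visible in the data shape itself.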

Attribution ρ∅ 0.00

Verified Inferred Unattributed — shown in full

Show what you can't explain

Every number is tagged verified or inferred and the two are held apart. The share that can't be honestly attributed is shown in full — a mandatory unattributed bucket, not a rounding error.

Why this compounds

A number that survives the board and the auditor — because it shows you its own limits.

Every other tool selling “AI ROI” rounds to 100% and hands you a guess dressed as a fact. Skalpel won't. Two guarantees are enforced in code, not promised in a footnote:

Verified and inferred can never blend.

Every figure is typed at the source as either verified (backed by a closed chain of shared identifiers) or inferred (backed by an inferred link), and the two are held apart in code, not by a disclaimer. No one can quietly average a guess and a fact into a rounder, prettier total.
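A minimal sketch of what "held apart in code" could look like: verified and inferred amounts are distinct types, and adding one to the other fails instead of producing a blended total. The class names `Verified` and `Inferred` are assumptions for illustration, not Skalpel's actual types.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Verified:
    """A dollar amount backed by a closed chain of shared identifiers."""
    amount: Decimal

    def __add__(self, other):
        if not isinstance(other, Verified):
            return NotImplemented  # refuse to blend with any other type
        return Verified(self.amount + other.amount)


@dataclass(frozen=True)
class Inferred:
    """A dollar amount backed only by an inferred link."""
    amount: Decimal

    def __add__(self, other):
        if not isinstance(other, Inferred):
            return NotImplemented  # refuse to blend with any other type
        return Inferred(self.amount + other.amount)


verified_total = Verified(Decimal("410.00")) + Verified(Decimal("90.00"))
inferred_total = Inferred(Decimal("340.00")) + Inferred(Decimal("10.00"))
```

In this sketch, an expression such as `verified_total + inferred_total` raises `TypeError`, so a blended figure cannot be produced by accident.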

What it can't explain stays on screen.

The exact share Skalpel can't honestly attribute is shown in full — never rounded away, never buried under “other.” A tool that claims to explain all your AI spend is lying; this one shows you the gap — the most honest number on the page.

// the honest version

We looked at eleven tools selling “AI ROI.” Not one is honest about what it can't see:

No reconciliation tolerance to the invoice
No line between a measured number and a guessed one
No honest remainder — they round to 100% by construction

And it compounds: the cross-vendor reconciliation graph gets more accurate, and costlier to switch away from, every invoice cycle.

Numbers built to look complete. This one's built to be defensible.

one reconciliation graph across every vendor

What you'll get

The board-defensible numbers a cost dashboard can't produce.

Verified

$0 / merged PR

Cost per merged PR / resolved ticket

Reconciled spend over real shipped outcomes — the headline ratio. Target tier, wired when the §16 connecting step lands.

Inferred

+$0k net of cost · estimated

Net-of-cost value (estimated)

Reconciled spend subtracted from the value credited to it — reported as a wide-banded estimate, with its assumptions shown.

Inferred

payments · web · infra · data

Low-yield hotspots

Where AI spend runs ahead of the outcomes it's credited with — the places to look first.

Inferred

frontier 52% over-provisioned

Model-mix efficiency

Where you're paying for a frontier model on a task a cheaper one would close.

Inferred

eng · data · ops · web

AI-leverage by team

The share of shipped work that was AI-assisted, at team and workflow granularity.

Inferred

0% of loaded team cost

AI spend as a share of loaded team cost

The comparison the board actually asks for, per team — reported as an estimate, with its assumptions shown.

Who it's for

API-first teams with real, multi-vendor AI spend.

You build on LLMs — direct API keys and/or Bedrock, Vertex, Azure OpenAI — not just chat seats.
AI spend is material and multi-vendor: roughly $10k–$1M+/month across two or more providers (or you are adding a second), large enough that the number has to be defensible, not estimated.
Someone internal has been asked “what's our AI ROI / where's the waste” and couldn't answer cleanly.

The buyer

The buyer is whoever owns AI spend and its return — Head of Engineering / Platform / Infra, a CTO or technical co-founder, an emerging FinOps-for-AI owner, sometimes a Finance partner handed the AI-cost question.

Not a fit (yet)

Seat-only orgs with no API keys; single-vendor orgs with trivial spend; pre-product teams with nothing to attribute.

Pilot

Put your AI spend through an independent reconciliation.

We're onboarding a small set of design-partner teams. Join the waitlist, or book a 20-minute pilot conversation — a candid look at how you reconcile and defend your AI spend today, not a pitch.

Book a pilot call