Agent & Protocol · Toolkit

Prompt Cost & Token Simulator

SIMULATEDVerified Jul 2, 2026· data as of Jul 2, 2026

Architecture debates stall on taste; unit economics settle them. Size a single call, set the volume, and the annual number is the build-vs-buy conversation — before anyone draws a box.

Same instrument · three industries — pick a use-case to reconfigure the run

Model

Prompt

76 input tokens

Rough estimate (~4 chars/token). Paste a real prompt to size it.

400
5,000
60%
Tokens / call
476

76 in · 400 out

Cost / call
$0.0016

Claude Haiku 4.5

Monthly run-rate
$244

150,000 calls/mo

Annual run-rate
$2,930

At current pricing

Where the money goes

Input$50
Output$2,880
Saved by caching + batching$59

Caching + batching cut 2% — $59 / year

Before leverage this workload runs $2,989/year; after, $2,930. The static context you send on every call is the lever — cache it and the input line collapses.

Steering-committee takeaway: Unit economics decide build-vs-buy long before architecture does. Size the call, then argue the design.

How this is built

Stack: Next.js (static) + shared design system; pure client-side arithmetic.

Pricing lives in a dated config (`@labs/kit`, as of 2026-07-02) — never in copy — with a per-model cache-read price. Tokens are estimated at ~4 chars/token.

Cost/call = input tokens × input price + output tokens × output price. Caching reprices the cacheable share of input at the cache-read rate; batching applies a 50% discount to the eligible share.

Limitations: token estimate is approximate (not a real tokenizer); pricing is published list price, not a negotiated rate; excludes retries, tool-call round-trips, and egress. It sizes the decision, not the invoice.