Summarize 1.2M care calls a month
The batch discount is the go/no-go on the whole program.
Context
A telecom summarizes ~1.2M care-center calls a month for QA and coaching — an offline, non-urgent workload. Because it's async, most of it can run through the batch API at a steep discount.
The decision
Batching is the go/no-go. At this volume, the batch discount is what moves the program from 'too expensive' to fundable, and because the workload is async there is no reason to pay real-time rates.
What most miss
People price offline workloads at real-time rates. Summarization is async, so every call that goes through the real-time path instead of batch is money left on the table.
Stakes
Priced at real-time rates, a 1.2M-call summarization program doesn't clear its business case.
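To make the stakes concrete, here is a minimal sketch of the real-time versus batch comparison. Only the 1.2M calls per month comes from the scenario. The tokens per call, the per-million-token prices, and the 50% batch discount are hypothetical placeholders, and the names Workload, Pricing, and monthly_cost are illustrative. Substitute your provider's published rates before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    calls_per_month: int
    input_tokens_per_call: int   # transcript plus prompt (assumed)
    output_tokens_per_call: int  # summary length (assumed)


@dataclass(frozen=True)
class Pricing:
    input_usd_per_mtok: float    # real-time price per 1M input tokens (placeholder)
    output_usd_per_mtok: float   # real-time price per 1M output tokens (placeholder)
    batch_discount: float        # fraction off the real-time price (placeholder)


def monthly_cost(w: Workload, p: Pricing, batch_share: float = 0.0) -> float:
    """Monthly USD cost when `batch_share` of calls go through the batch path."""
    if not 0.0 <= batch_share <= 1.0:
        raise ValueError("batch_share must be between 0 and 1")
    per_call_usd = (
        w.input_tokens_per_call * p.input_usd_per_mtok
        + w.output_tokens_per_call * p.output_usd_per_mtok
    ) / 1_000_000
    realtime_total = w.calls_per_month * per_call_usd
    # Only the batched share of the traffic receives the discount.
    return realtime_total * (1.0 - batch_share * p.batch_discount)


# 1.2M calls/month is from the scenario; every other number is an assumption.
workload = Workload(calls_per_month=1_200_000,
                    input_tokens_per_call=4_000,
                    output_tokens_per_call=300)
pricing = Pricing(input_usd_per_mtok=3.0,
                  output_usd_per_mtok=15.0,
                  batch_discount=0.5)

realtime = monthly_cost(workload, pricing, batch_share=0.0)
batched = monthly_cost(workload, pricing, batch_share=1.0)
print(f"real-time: ${realtime:,.0f}/month")
print(f"batch:     ${batched:,.0f}/month")
print(f"saved:     ${realtime - batched:,.0f}/month")
```

The batch_share parameter is there because the scenario says most, not all, of the workload can be batched. Sweeping it between 0 and 1 shows how much of the saving survives if some calls still need the real-time path.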
First-hand · Agent & Protocol · verified 2026-07-03
Sources: Telecom care-center call summarization at scale — first-hand (Verizon); Batch-inference discount economics