The numbers vendors don’t put on their pricing pages.

AI deployment costs aren’t just API calls. They’re infrastructure, maintenance, data preparation, and the hidden tax of context window inflation.

Key Concepts

  • TCO — Total Cost of Ownership: the real numbers
  • Cost Overrun — When the $2K/month bill becomes $18K
  • Scope Creep — How “just one more feature” destroys budgets

Tool Economics

  • Ollama — Free software; DevOps time and hardware are the real costs
  • vLLM — Self-hosting breaks even at roughly 11B tokens/month; below that, managed APIs win
  • LiteLLM — Cost routing sends cheap queries to cheaper models; virtual keys cap per-team spend
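
The vLLM break-even point is a simple ratio: fixed monthly self-hosting cost divided by the per-token saving over a managed API. A minimal sketch follows. The function name and the dollar figures are hypothetical placeholders chosen to illustrate the arithmetic, not measured prices; substitute your own quotes.

```python
def break_even_tokens_per_month(
    fixed_monthly_cost_usd: float,
    api_price_per_million_usd: float,
    self_host_price_per_million_usd: float = 0.0,
) -> float:
    """Monthly token volume at which self-hosting costs the same as a managed API.

    fixed_monthly_cost_usd: hardware plus DevOps time for the self-hosted stack.
    api_price_per_million_usd: blended managed-API price per million tokens.
    self_host_price_per_million_usd: marginal self-hosting cost per million
        tokens (power, autoscaling), assumed to be 0 by default.
    """
    saving_per_million = api_price_per_million_usd - self_host_price_per_million_usd
    if saving_per_million <= 0:
        raise ValueError("self-hosting never breaks even at these prices")
    return fixed_monthly_cost_usd / saving_per_million * 1_000_000


# Hypothetical inputs: $22,000/month fixed cost, $2.00 per million API tokens.
tokens = break_even_tokens_per_month(22_000, 2.00)
print(f"Break-even: {tokens / 1e9:.1f}B tokens/month")
```

Below the returned volume the managed API is cheaper; above it, self-hosting is. The result is very sensitive to the fixed-cost estimate, so count DevOps hours as well as hardware.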

The Invisible Costs

  • Vendor Lock-In — Migration costs belong in your TCO calculation; Zapier → n8n migrations run 2-10 weeks
  • Scope Creep — How “just one more feature” turns a $2K/month deployment into an $18K one
  • TCO — The 40-30-20-10 rule: roughly 40% infrastructure, 30% engineering, 20% maintenance, 10% data
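
To see what the 40-30-20-10 rule implies for a given budget, split the monthly total by the four shares. This is a rough sketch: the shares come from the rule above, while the function name and the $18K example total are illustrative assumptions.

```python
# Shares from the 40-30-20-10 rule, in whole percent.
TCO_SHARES_PCT = {
    "infrastructure": 40,
    "engineering": 30,
    "maintenance": 20,
    "data": 10,
}


def split_tco(total_monthly_usd: float) -> dict[str, float]:
    """Split a monthly TCO figure into the four rule-of-thumb categories."""
    assert sum(TCO_SHARES_PCT.values()) == 100
    return {
        category: round(total_monthly_usd * pct / 100, 2)
        for category, pct in TCO_SHARES_PCT.items()
    }


# Illustrative total: an $18K/month deployment.
for category, usd in split_tco(18_000).items():
    print(f"{category:>14}: ${usd:,.0f}")
```

The split is a planning heuristic, not an accounting rule. One-off items such as a platform migration sit on top of it and should be added separately.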

Cost Overrun Patterns

  • Cost Overrun — Retry loops, runaway agents, and context window inflation are the most common billing surprises
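
Context window inflation is easiest to see with numbers. If every turn of a conversation resends the full history, billed input tokens grow roughly with the square of the turn count. The sketch below assumes a simplified model in which each turn adds a fixed number of tokens to the history; the function name and token counts are hypothetical.

```python
def cumulative_input_tokens(
    turns: int,
    tokens_added_per_turn: int,
    system_prompt_tokens: int = 0,
) -> int:
    """Total input tokens billed over a conversation that resends its history.

    Simplifying assumption: each turn appends `tokens_added_per_turn` tokens
    (the new user message plus the previous reply) to the context, and the
    whole context is sent as input on every call.
    """
    total_billed = 0
    context = system_prompt_tokens
    for _ in range(turns):
        context += tokens_added_per_turn  # history grows every turn
        total_billed += context           # the full history is billed again
    return total_billed


for turns in (10, 20):
    billed = cumulative_input_tokens(turns, 500)
    naive = turns * 500  # what you would expect if history were free
    print(f"{turns} turns: {billed:,} tokens billed vs {naive:,} naive "
          f"({billed / naive:.1f}x)")
```

Doubling the conversation length roughly quadruples the input bill under this model. Retry loops and runaway agents compound it, because every retry resends the same inflated context.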
