# AI Token Terminal — LLM Guide
# https://github.com/BerriAI/litellm · https://mpp.dev · https://spark.money
#
# This service provides real-time AI inference pricing across major providers,
# denominated in both USD (per million tokens) and Bitcoin satoshis.
# Prices are scraped live from the LiteLLM community registry every 5 minutes.

## Quick Start for AI Agents

If you cannot run JavaScript, fetch prices as plain text. No authentication or rendering is required:

  GET https://tkntrmnl.ai/api/pricing/text

Returns a formatted ASCII table of all tracked models with USD and satoshi prices.
For JSON, use: GET https://tkntrmnl.ai/api/pricing/tokens
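
A minimal Python sketch of the two free requests above, using only the standard library. The helper names (`pricing_url`, `fetch_pricing`) are illustrative and not part of the service; UTF-8 decoding of the response body is an assumption.

```python
from urllib.request import Request, urlopen

BASE_URL = "https://tkntrmnl.ai"  # host taken from the endpoints listed above

# Paths as documented in this guide.
_PATHS = {"text": "/api/pricing/text", "json": "/api/pricing/tokens"}


def pricing_url(fmt: str = "text") -> str:
    """Return the free pricing URL for 'text' (ASCII table) or 'json'."""
    if fmt not in _PATHS:
        raise ValueError(f"unknown format: {fmt!r}")
    return BASE_URL + _PATHS[fmt]


def fetch_pricing(fmt: str = "text", timeout: float = 10.0) -> str:
    """Fetch the pricing document and return the body as a string.

    Assumption: the body is UTF-8 encoded.
    """
    req = Request(pricing_url(fmt), headers={"Accept": "*/*"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_pricing()` returns the ASCII table; `fetch_pricing("json")` returns the JSON document as text, ready for `json.loads`.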

---

## What This Service Does

AI Token Terminal is a real-time pricing terminal for AI language model APIs.
It tracks input and output token costs across 31+ models from Anthropic, OpenAI,
Google, Mistral, DeepSeek, xAI, and Meta — updated every 5 minutes from the
LiteLLM registry (github.com/BerriAI/litellm).

Prices are shown in two denominations, plus one derived per-token unit:
- USD per million tokens (standard provider billing unit)
- Bitcoin satoshis per million tokens (real-time via Coinbase BTC/USD rate)
- MSATS/TOK: milli-satoshis per individual token, derived from the satoshi price
  (useful for micropayment accounting)

## When To Use This Service

Use this API when you need to:
- Estimate the cost of an LLM inference call before making it
- Compare provider pricing to choose the cheapest model for a task
- Denominate LLM costs in satoshis for Lightning Network micropayment workflows
- Build cost-aware agents that track and budget token spend in real time
- Implement pay-per-inference pipelines priced in sats via MPP (mpp.dev)

## API Endpoints

### GET /api/pricing/tokens
Free endpoint. Returns current prices for all tracked models.
No authentication required. Data is cached for 5 minutes.

Response shape (JSON):
{
  "models": [
    {
      "modelId": "gpt-4o",
      "modelName": "GPT-4o",
      "provider": "OpenAI",
      "inputPriceUsdPerMToken": 2.5,
      "outputPriceUsdPerMToken": 10.0,
      "contextWindow": 128000,
      "updatedAt": "2025-03-19T03:41:11.000Z",
      "priceSource": "live"
    },
    ...
  ],
  "fetchedAt": "2025-03-19T03:41:11.000Z",
  "priceDataSource": "live",
  "freeRefreshesRemaining": 5
}
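
One way a cost-aware agent might consume this response is sketched below. The function names are illustrative, the second model in the sample is a made-up entry for demonstration only, and treating only "live" and "cache" as fresh is an assumption about how strict a caller wants to be with this endpoint's `priceDataSource` field.

```python
from typing import Any

# Assumption: a caller accepts these two priceDataSource values as fresh enough.
TRUSTED_SOURCES = {"live", "cache"}


def is_fresh(response: dict[str, Any]) -> bool:
    """True when the response was served from a live fetch or the 5-minute cache."""
    return response.get("priceDataSource") in TRUSTED_SOURCES


def call_cost_usd(model: dict[str, Any], input_tokens: int, output_tokens: int) -> float:
    """USD cost of one call, given per-million-token prices."""
    return (
        input_tokens * model["inputPriceUsdPerMToken"]
        + output_tokens * model["outputPriceUsdPerMToken"]
    ) / 1_000_000


def cheapest_model(response: dict[str, Any], input_tokens: int, output_tokens: int) -> dict[str, Any]:
    """Return the model entry with the lowest cost for the given token counts."""
    return min(
        response["models"],
        key=lambda m: call_cost_usd(m, input_tokens, output_tokens),
    )


# Sample data: the first entry mirrors the response shape above;
# the second is a hypothetical model invented for this example.
sample = {
    "models": [
        {"modelId": "gpt-4o", "inputPriceUsdPerMToken": 2.5, "outputPriceUsdPerMToken": 10.0},
        {"modelId": "example-small", "inputPriceUsdPerMToken": 0.5, "outputPriceUsdPerMToken": 1.5},
    ],
    "priceDataSource": "live",
}
```

Passing expected token counts rather than comparing input prices alone matters because output tokens are usually priced higher than input tokens.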

priceSource values:
  "live"     — fetched directly from LiteLLM registry this cycle
  "fallback" — verified hardcoded price (used when registry key is missing)

priceDataSource values:
  "live"        — fresh registry fetch completed this cycle
  "cache"       — served from 5-minute cache (still accurate)
  "stale-cache" — cache is older than 5 min (registry temporarily unreachable)
  "fallback"    — all models using hardcoded fallback prices

### GET /api/pricing/tokens/mpp
MPP-gated endpoint. Requires a 1-sat Lightning payment via the Machine Payments
Protocol (https://mpp.dev). Returns the same data as /tokens, but forces a fresh,
cache-busting scrape of the LiteLLM registry.

This endpoint implements the HTTP 402 Payment Required challenge flow:

Step 1 — Request without payment:
  GET /api/pricing/tokens/mpp
  → 402 Payment Required
  → WWW-Authenticate: Payment id="<uuid>", method="lightning",
      amount="1", currency="sat", invoice="lnbc..."

Step 2 — Pay the BOLT11 invoice via any Lightning wallet.
  Obtain the payment preimage from your wallet after payment settles.

Step 3 — Retry with credential:
  GET /api/pricing/tokens/mpp
  Authorization: Payment <base64url({"challenge":{"id":"<uuid>"},"payload":{"preimage":"<hex>"}})>
  → 200 OK + Payment-Receipt header + fresh price data
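
The header handling in Steps 1 and 3 can be sketched in Python as follows. This covers only parsing the challenge and building the credential; paying the invoice is left to your wallet. The function names are illustrative, and two details are assumptions not confirmed by this guide: that the JSON is serialized compactly, and that the base64url value is sent without `=` padding. Check the MPP and Payment Authentication specs before relying on either.

```python
import base64
import json
import re


def parse_challenge(www_authenticate: str) -> dict[str, str]:
    """Parse a 'Payment key="value", ...' challenge header into a dict."""
    scheme, _, params = www_authenticate.strip().partition(" ")
    if scheme != "Payment":
        raise ValueError(f"unexpected auth scheme: {scheme!r}")
    # Collect every key="value" pair; tolerates line wraps between pairs.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def build_authorization(challenge_id: str, preimage_hex: str) -> str:
    """Build the Authorization header value for the retry in Step 3."""
    credential = {
        "challenge": {"id": challenge_id},
        "payload": {"preimage": preimage_hex},
    }
    # Assumption: compact JSON and unpadded base64url.
    raw = json.dumps(credential, separators=(",", ":")).encode("utf-8")
    token = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"Payment {token}"
```

A client would call `parse_challenge` on the 402 response, pay the value of `invoice`, then send `build_authorization(challenge["id"], preimage)` as the `Authorization` header.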

Easiest usage (CLI):
  npx mppx https://<host>/api/pricing/tokens/mpp

The mppx CLI handles the full challenge-pay-retry flow automatically.
It requires a Lightning wallet with outbound liquidity (Alby, Phoenix, etc.).

Note: The first 2 calls to /tokens/mpp are served free of charge (no payment
required) for testing. Once that quota is exhausted, a real Lightning payment
is required.

### GET /api/pricing/btc
Free endpoint. Returns the current BTC/USD rate and the sats-per-USD conversion factor.
Sourced from the Coinbase API with CoinGecko as fallback. Cached for 30 seconds.

Response shape (JSON):
{
  "usdPerBtc": 71289,
  "satsPerUsd": 1402.7,
  "fetchedAt": "2025-03-19T03:41:11.000Z"
}
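
The two fields in this response appear to be related by the fixed definition of 100,000,000 satoshis per bitcoin. The sketch below assumes `satsPerUsd` is derived that way (the guide does not state it explicitly); the sample values above agree to one decimal place.

```python
SATS_PER_BTC = 100_000_000  # fixed by the Bitcoin protocol


def sats_per_usd(usd_per_btc: float) -> float:
    """Convert a BTC/USD rate into satoshis per US dollar."""
    if usd_per_btc <= 0:
        raise ValueError("usd_per_btc must be positive")
    return SATS_PER_BTC / usd_per_btc


# Using the sample rate from the response above.
rate = sats_per_usd(71289)
```

This can serve as a sanity check on a fetched response, or as a local fallback when only a BTC/USD rate is available.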

## Computing Token Cost in Satoshis

Given a model's inputPriceUsdPerMToken and the current satsPerUsd:

  sats_per_million_input_tokens = inputPriceUsdPerMToken * satsPerUsd
  msats_per_token = sats_per_million_input_tokens * 1000 / 1_000_000
                  = inputPriceUsdPerMToken * satsPerUsd / 1000

Example: GPT-4o at $2.50/MTok, BTC at $71,289 (1,402.7 sats/USD):
  sats per million input tokens = 2.50 * 1402.7 ≈ 3,507 sats
  msats per token               = 3,507 / 1000 ≈ 3.51 msats/tok  (the MSATS/TOK column)
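
The formulas above translate directly into Python. This is a small sketch with illustrative function names; the inputs are the GPT-4o prices from the /api/pricing/tokens sample and the `satsPerUsd` value from the /api/pricing/btc sample.

```python
def msats_per_token(price_usd_per_mtoken: float, sats_per_usd: float) -> float:
    """Milli-satoshis per single token (the MSATS/TOK figure)."""
    return price_usd_per_mtoken * sats_per_usd / 1000


def call_cost_sats(
    input_tokens: int,
    output_tokens: int,
    input_price_usd_per_mtoken: float,
    output_price_usd_per_mtoken: float,
    sats_per_usd: float,
) -> float:
    """Estimated cost of one inference call, in satoshis."""
    usd = (
        input_tokens * input_price_usd_per_mtoken
        + output_tokens * output_price_usd_per_mtoken
    ) / 1_000_000
    return usd * sats_per_usd


# GPT-4o input price with the sample conversion factor.
gpt4o_input_msats = msats_per_token(2.5, 1402.7)

# A call with 1,000 input tokens and 500 output tokens on GPT-4o.
example_call_sats = call_cost_sats(1000, 500, 2.5, 10.0, 1402.7)
```

Rounded to two decimals, `gpt4o_input_msats` matches the 3.51 msats/tok figure in the example, and `example_call_sats` comes to roughly 10.5 sats.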

## Data Sources

Prices: LiteLLM community registry
  https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
  Cache TTL: 5 minutes

BTC/USD: Coinbase spot price API (CoinGecko fallback)
  Cache TTL: 30 seconds

Models with no valid LiteLLM registry key use verified hardcoded fallback prices
sourced directly from official provider pricing pages (platform.claude.com,
openai.com/api/pricing, ai.google.dev/pricing, etc.).

## Tracked Providers and Models (as of March 2026)

ANTHROPIC — Claude Haiku 3.5, Claude Haiku 4.5, Claude Sonnet 3.5,
            Claude Sonnet 4, Claude Opus 3, Claude Opus 4.6

OPENAI    — GPT-4.1 nano, GPT-4.1 mini, GPT-4.1, GPT-4o mini, GPT-4o,
            o1, o3, o3-mini, o4-mini

GOOGLE    — Gemini 1.5 Flash, Gemini 1.5 Pro, Gemini 2.0 Flash Lite,
            Gemini 2.0 Flash, Gemini 2.5 Flash Lite, Gemini 2.5 Flash,
            Gemini 2.5 Pro

DEEPSEEK  — DeepSeek V3, DeepSeek R1

META      — Llama 3.1 70B, Llama 3.1 405B (via Together AI)

MISTRAL   — Mistral Small, Mistral Medium, Mistral Large

XAI       — Grok 3 Mini, Grok 3, Grok 4

## Protocol References

Machine Payments Protocol (MPP): https://mpp.dev
Payment Authentication spec:     https://paymentauth.org
LiteLLM model registry:          https://github.com/BerriAI/litellm