API Documentation
Calculate LLM token costs programmatically. Perfect for integrating cost estimation into your applications, CI/CD pipelines, or monitoring dashboards. No API key required.
Rate Limits
All endpoints are rate limited to 100 requests per minute per IP address. Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset.
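A client can watch these headers and pause before the limit trips. A minimal sketch, assuming X-RateLimit-Reset is a Unix timestamp (check a live response if the format differs); the helper name is illustrative, not part of the API:

```python
import time

def wait_if_near_limit(headers: dict) -> float:
    """Return how long to sleep (in seconds) based on rate-limit headers.

    If no requests remain in the current window, wait until the
    X-RateLimit-Reset time; otherwise continue immediately.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = float(headers.get("X-RateLimit-Reset", 0))  # assumed Unix timestamp
    if remaining > 0:
        return 0.0
    return max(0.0, reset_at - time.time())

# Example: 0 requests left, window resets 30 seconds from now
delay = wait_if_near_limit({
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": str(time.time() + 30),
})
```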
Base URL
https://tokenbudget.edwardiaz.dev/api/v1
Endpoints
/api/v1/calculate
Calculate the cost for a specific model given input, output, and (optionally) cached tokens.
Request Body
{
"model": "gpt-4o",
"inputTokens": 1000,
"outputTokens": 500,
"cachedTokens": 200 // optional - tokens served from cache, billed at the cached rate instead of the input rate
}
Response
{
"model": "gpt-4o",
"provider": "OpenAI",
"inputTokens": 1000,
"outputTokens": 500,
"cachedTokens": 200,
"costs": {
"input": 0.002,
"output": 0.005,
"cached": 0.00025,
"total": 0.00725
},
"prices": {
"inputPer1M": 2.5,
"outputPer1M": 10.0,
"cachedPer1M": 1.25
}
}
cURL Example
curl -X POST https://tokenbudget.edwardiaz.dev/api/v1/calculate \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o","inputTokens":1000,"outputTokens":500,"cachedTokens":200}'
/api/v1/best
Find the cheapest model for your token usage, with optional filters for capabilities and providers.
Request Body
{
"inputTokens": 1000,
"outputTokens": 500,
"cachedTokens": 0,
"filters": {
"providers": ["OpenAI", "Anthropic"], // optional
"minContext": 128000, // optional
"capabilities": ["vision", "functions"] // optional
}
}
Response
{
"recommendation": {
"model": "gpt-4o-mini",
"provider": "OpenAI",
"costs": {
"input": 0.00015,
"output": 0.0003,
"cached": 0,
"total": 0.00045
},
"contextWindow": 128000,
"capabilities": {
"vision": true,
"functions": true,
"reasoning": false
}
},
"alternatives": [
{
"model": "claude-3-5-haiku",
"provider": "Anthropic",
"totalCost": 0.0028,
"priceDiff": "+522%"
}
]
}
Available Capability Filters
- vision - Models that can analyze images
- functions - Models that support function calling
- reasoning - Models with enhanced reasoning (o1, o3, etc.)
- coding - Models optimized for code generation
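The selection that /api/v1/best performs can be approximated client-side from a price list. A sketch under the assumption that cost is computed the same way as /api/v1/calculate (cached tokens billed at the cached rate in place of the input rate); the model table is abbreviated sample data mirroring the response shapes above, not live prices:

```python
def total_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Per-request cost in USD from per-1M-token prices."""
    billed_input = input_tokens - cached_tokens
    return (billed_input * model["input"]
            + cached_tokens * model.get("cached", 0)
            + output_tokens * model["output"]) / 1_000_000

def cheapest(models, input_tokens, output_tokens, cached_tokens=0,
             min_context=0, capabilities=()):
    """Pick the lowest-cost model satisfying the context and capability filters."""
    eligible = [m for m in models
                if m["context"] >= min_context
                and all(m["capabilities"].get(c) for c in capabilities)]
    return min(eligible,
               key=lambda m: total_cost(m, input_tokens, output_tokens,
                                        cached_tokens))

# Sample data (prices for gpt-4o taken from the examples above)
models = [
    {"id": "gpt-4o", "input": 2.5, "output": 10.0, "cached": 1.25,
     "context": 128000, "capabilities": {"vision": True, "functions": True}},
    {"id": "gpt-4o-mini", "input": 0.15, "output": 0.6, "cached": 0.075,
     "context": 128000, "capabilities": {"vision": True, "functions": True}},
]
best = cheapest(models, 1000, 500, min_context=128000, capabilities=("vision",))
# best["id"] == "gpt-4o-mini"; its total for 1000 in / 500 out is 0.00045,
# matching the recommendation in the example response.
```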
/api/v1/models
List all available models with their pricing. Supports filtering by provider and caching capability.
Query Parameters
- provider - Filter by provider (e.g., openai, anthropic)
- hasCache - Only show models with cached pricing (true/false)
- search - Search by model name
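These parameters are plain query-string fields, so any HTTP client can compose the URL; for example, with Python's standard library:

```python
from urllib.parse import urlencode

BASE = "https://tokenbudget.edwardiaz.dev/api/v1/models"

# Build the same request as the cURL example below
params = {"provider": "openai", "hasCache": "true"}
url = f"{BASE}?{urlencode(params)}"
# https://tokenbudget.edwardiaz.dev/api/v1/models?provider=openai&hasCache=true
```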
Response
{
"count": 65,
"providers": ["OpenAI", "Anthropic", "Google", ...],
"models": [
{
"id": "gpt-4o",
"provider": "OpenAI",
"input": 2.5,
"output": 10.0,
"cached": 1.25,
"context": 128000
},
...
],
"lastUpdated": "2026-03-20T12:00:00.000Z"
}
Example
curl "https://tokenbudget.edwardiaz.dev/api/v1/models?provider=openai&hasCache=true"
Error Handling
All error responses follow a consistent format:
{
"error": "Rate limit exceeded",
"message": "Too many requests. Please try again later.",
"retryAfter": 45 // seconds until rate limit resets
}
HTTP Status Codes
- 200 - Success
- 400 - Bad request (invalid parameters)
- 404 - Model not found
- 429 - Rate limit exceeded
- 500 - Internal server error
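A client can branch on these codes and honour the retryAfter field on 429s. A sketch with a hypothetical send function standing in for the actual HTTP call:

```python
import time

def call_with_retry(send, payload, max_attempts=3):
    """Call the API, sleeping for `retryAfter` seconds on 429 responses.

    `send` is any callable returning (status_code, body_dict); it is a
    placeholder for whatever HTTP client you use.
    """
    for _ in range(max_attempts):
        status, body = send(payload)
        if status == 429:
            time.sleep(body.get("retryAfter", 1))
            continue
        if status == 400:
            raise ValueError(body.get("message", "Bad request"))
        return body
    raise RuntimeError("Rate limited on every attempt")

# Usage with a fake transport: one 429, then success
responses = iter([(429, {"retryAfter": 0}), (200, {"ok": True})])
result = call_with_retry(lambda p: next(responses), {})
# result == {"ok": True}
```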
Performance
- Cached Pricing Data: Model prices are cached and refreshed hourly from LiteLLM, ensuring fast response times (~50ms).
- Edge Deployment: API runs on Vercel Edge Functions for low latency globally.
- Minimal Payload: Responses are optimized to include only essential data, keeping bandwidth low.
Use Cases
CI/CD Integration
Add cost checks to your deployment pipeline to prevent expensive model regressions.
Cost Monitoring
Log token usage and calculate costs in real-time for your dashboard metrics.
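The dashboard math reduces to the same per-1M-token pricing used by /api/v1/calculate. A minimal accumulator sketch, assuming cached tokens are billed at the cached rate in place of the input rate (consistent with the /api/v1/calculate example response); the class is illustrative, not part of the API:

```python
class CostMeter:
    """Accumulate dollar cost across requests from per-1M-token prices."""

    def __init__(self, input_per_1m, output_per_1m, cached_per_1m=0.0):
        self.prices = (input_per_1m, output_per_1m, cached_per_1m)
        self.total = 0.0

    def record(self, input_tokens, output_tokens, cached_tokens=0):
        """Add one request's cost to the running total and return it."""
        inp, outp, cached = self.prices
        cost = ((input_tokens - cached_tokens) * inp
                + output_tokens * outp
                + cached_tokens * cached) / 1_000_000
        self.total += cost
        return cost

# gpt-4o prices from the /api/v1/calculate example above
meter = CostMeter(input_per_1m=2.5, output_per_1m=10.0, cached_per_1m=1.25)
cost = meter.record(1000, 500, cached_tokens=200)
# cost == 0.00725, matching the "total" field in the example response
```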