Real-time spot rates across 26+ providers. InferLane routes your requests to the cheapest option that meets your quality requirements.
- Providers tracked: 23
- Models indexed: 80+
- Local routing: $0.00 (Gemma 4 via Ollama)
Per 1M tokens · Updated in real time
| Model | Provider | Input | Output | Context | Tier | vs Opus |
|---|---|---|---|---|---|---|
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200K | Workhorse | 80% |
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | Frontier | 87% |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M | Frontier | 87% |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M | Speed | 99% |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | 64K | Budget | 99% |
| Gemma 4 27B | Darkbloom | $0.06 | $0.20 | 128K | Decentralized | 100% |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K | Speed | 93% |
| Llama 3.3 70B | Groq | $0.59 | $0.79 | 128K | Budget | 99% |
| Gemma 4 12B | Ollama (local) | Free | Free | 128K | Free / Local | 100% |
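The "vs Opus" column is not defined on the page. The figures are consistent with percentage savings on output price against an Opus output price of $75.00 per 1M tokens; that baseline is an assumption, not a number taken from the page. A minimal sketch of the calculation under that assumption:

```python
# Assumed baseline: Opus output price in USD per 1M tokens.
# This value is NOT stated in the source; it is the figure that
# happens to reproduce the "vs Opus" column.
ASSUMED_OPUS_OUTPUT_PER_1M = 75.00


def savings_vs_opus(output_price_per_1m: float) -> int:
    """Percent saved on output tokens relative to the assumed Opus price."""
    ratio = output_price_per_1m / ASSUMED_OPUS_OUTPUT_PER_1M
    return round((1 - ratio) * 100)


# Output prices copied from the table above (USD per 1M tokens).
for model, output_price in [
    ("Claude Sonnet 4.5", 15.00),
    ("GPT-4o", 10.00),
    ("Claude Haiku 4.5", 5.00),
    ("Llama 3.3 70B", 0.79),
]:
    print(f"{model}: {savings_vs_opus(output_price)}% cheaper than Opus")
```

Because the comparison uses output price only, two models with different input prices (for example GPT-4o and Gemini 2.5 Pro) can show the same percentage.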
Simple tasks (classification, extraction) route to budget models. Complex reasoning stays on frontier models. You get the right quality at the right price, automatically.
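As a rough illustration of tier-based routing, the sketch below maps a task type to a tier and then picks the cheapest model in that tier for the expected token counts. The `TASK_TIER` mapping, the `route` function, and the cost formula are hypothetical and are not InferLane's actual routing logic; only the prices and tiers come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens
    tier: str


# A subset of the pricing table above.
CATALOG = [
    Model("GPT-4o", 2.50, 10.00, "Frontier"),
    Model("Gemini 2.5 Pro", 1.25, 10.00, "Frontier"),
    Model("DeepSeek V3", 0.27, 1.10, "Budget"),
    Model("Llama 3.3 70B", 0.59, 0.79, "Budget"),
]

# Hypothetical mapping from task type to the tier it is allowed to use.
TASK_TIER = {
    "classification": "Budget",
    "extraction": "Budget",
    "reasoning": "Frontier",
}


def request_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed per-1M prices."""
    return (
        input_tokens * model.input_per_1m + output_tokens * model.output_per_1m
    ) / 1_000_000


def route(task: str, input_tokens: int, output_tokens: int) -> Model:
    """Pick the cheapest model in the tier assigned to this task type."""
    tier = TASK_TIER[task]
    candidates = [m for m in CATALOG if m.tier == tier]
    return min(candidates, key=lambda m: request_cost(m, input_tokens, output_tokens))


print(route("classification", 1000, 500).name)
print(route("reasoning", 1000, 500).name)
```

The cheapest model depends on the input/output mix: an output-heavy extraction job can favor a different budget model than an input-heavy one.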
One command installs Ollama and Gemma 4, sized automatically to your hardware. Simple workloads run on your machine for free: zero API cost and no network round trips.
Operators list spare capacity at competitive rates. Dynamic pricing lets you benefit from competition among suppliers rather than paying flat-rate monopoly prices.
InferLane routes across 26 providers. If one raises prices, goes down, or changes terms, your workloads shift automatically. No vendor lock-in. No single point of failure.
Query live spot rates programmatically. Free, no auth required for read-only pricing data.
# Get spot rate for a model
curl "https://inferlane.dev/api/exchange/spot?model=gemma-4-27b&inputTokens=1000&outputTokens=500"
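The same request can be made from Python with the standard library. The endpoint and query parameters below are copied from the curl example; the assumption that the response body is JSON, and the helper names `spot_url` and `fetch_spot`, are mine. The page does not document the response fields, so the sketch returns the parsed body as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SPOT_ENDPOINT = "https://inferlane.dev/api/exchange/spot"


def spot_url(model: str, input_tokens: int, output_tokens: int) -> str:
    """Build the spot-rate URL with the query parameters from the curl example."""
    query = urlencode(
        {"model": model, "inputTokens": input_tokens, "outputTokens": output_tokens}
    )
    return f"{SPOT_ENDPOINT}?{query}"


def fetch_spot(model: str, input_tokens: int, output_tokens: int, timeout: float = 10.0):
    """Fetch the spot rate for a model.

    Assumption: the endpoint returns a JSON body. Its fields are not
    documented in the source, so the parsed value is returned unchanged.
    """
    with urlopen(spot_url(model, input_tokens, output_tokens), timeout=timeout) as resp:
        return json.load(resp)


print(spot_url("gemma-4-27b", 1000, 500))
```

Calling `fetch_spot("gemma-4-27b", 1000, 500)` performs the network request; inspect the returned value before relying on any particular field.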