Live Inference Pricing

Real-time spot rates across 26+ providers. InferLane routes your requests to the cheapest option that meets your quality requirements.

Providers tracked: 23

Models indexed: 80+

Local routing: $0.00 (Gemma 4 via Ollama)

Spot Rates

Per 1M tokens · Updated in real time

Model              Provider        Input   Output   Context  Tier           vs Opus
Claude Sonnet 4.5  Anthropic       $3.00   $15.00   200K     Workhorse      80%
GPT-4o             OpenAI          $2.50   $10.00   128K     Frontier       87%
Gemini 2.5 Pro     Google          $1.25   $10.00   1M       Frontier       87%
Gemini 2.0 Flash   Google          $0.10   $0.40    1M       Speed          99%
DeepSeek V3        DeepSeek        $0.27   $1.10    64K      Budget         99%
Gemma 4 27B        Darkbloom       $0.06   $0.20    128K     Decentralized  100%
Claude Haiku 4.5   Anthropic       $1.00   $5.00    200K     Speed          93%
Llama 3.3 70B      Groq            $0.59   $0.79    128K     Budget         99%
Gemma 4 12B        Ollama (local)  Free    Free     128K     Free / Local   100%
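
Because rates are quoted per 1M tokens, the cost of a single request is the token counts scaled by those rates. A minimal sketch of that arithmetic, using two rows from the table; the function name `request_cost` is illustrative and not part of any InferLane SDK:

```python
def request_cost(input_rate, output_rate, input_tokens, output_tokens):
    """Return the USD cost of one request, given rates quoted per 1M tokens."""
    per_million = 1_000_000
    return (input_rate * input_tokens + output_rate * output_tokens) / per_million


# Gemma 4 27B on Darkbloom: $0.06 input, $0.20 output per 1M tokens.
gemma_cost = request_cost(0.06, 0.20, input_tokens=1_000, output_tokens=500)

# Claude Sonnet 4.5 on Anthropic: $3.00 input, $15.00 output per 1M tokens.
sonnet_cost = request_cost(3.00, 15.00, input_tokens=1_000, output_tokens=500)

print(f"Gemma 4 27B:       ${gemma_cost:.6f}")
print(f"Claude Sonnet 4.5: ${sonnet_cost:.6f}")
```

For a request with 1,000 input tokens and 500 output tokens, this works out to roughly $0.00016 on Gemma 4 27B and $0.0105 on Claude Sonnet 4.5.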

How InferLane saves you money

Smart routing

Simple tasks (classification, extraction) route to budget models. Complex reasoning stays on frontier models. You get the right quality at the right price, automatically.
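
The idea can be sketched as "pick the cheapest model whose tier is acceptable for the task". The example below is an illustration only, not InferLane's actual routing logic: the `ALLOWED_TIERS` mapping and the `pick_model` function are hypothetical, and the rates are copied from the spot-rate table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: str
    input_rate: float   # USD per 1M input tokens
    output_rate: float  # USD per 1M output tokens


# A few rows from the spot-rate table.
MODELS = [
    Model("GPT-4o", "Frontier", 2.50, 10.00),
    Model("Gemini 2.5 Pro", "Frontier", 1.25, 10.00),
    Model("DeepSeek V3", "Budget", 0.27, 1.10),
    Model("Llama 3.3 70B", "Budget", 0.59, 0.79),
]

# Hypothetical policy: which tiers are acceptable for each task type.
ALLOWED_TIERS = {
    "classification": {"Budget"},
    "extraction": {"Budget"},
    "reasoning": {"Frontier"},
}


def pick_model(task, input_tokens, output_tokens):
    """Return the cheapest model whose tier is allowed for this task."""
    candidates = [m for m in MODELS if m.tier in ALLOWED_TIERS[task]]
    return min(
        candidates,
        key=lambda m: m.input_rate * input_tokens + m.output_rate * output_tokens,
    )


print(pick_model("classification", 1_000, 50).name)
print(pick_model("extraction", 100, 2_000).name)
print(pick_model("reasoning", 1_000, 500).name)
```

Note that the cheapest choice depends on the shape of the request: within the Budget tier, an input-heavy job favors DeepSeek V3, while an output-heavy job favors Llama 3.3 70B because of its lower output rate.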

Local-first

One command installs Ollama + Gemma 4, auto-sized to your hardware. Simple workloads run free on your machine, with zero API cost and no network round trip.

Spot exchange

Operators list spare capacity at competitive rates. Dynamic pricing means you benefit from supply competition rather than flat-rate monopoly pricing.

No single provider lock-in

InferLane routes across 26 providers. If one raises prices, goes down, or changes terms, your workloads shift automatically. No vendor lock-in. No single point of failure.

Anthropic, OpenAI, Google, Mistral, DeepSeek, Groq, Together, Fireworks, Cerebras, Ollama (local), Darkbloom, Chutes (Bittensor), Akash, Nosana, and 12 more
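
Shifting workloads when a provider goes down amounts to a failover loop: try providers in preference order and move on when one fails. The sketch below shows the general pattern only; `call_with_failover` and the `send` callback are hypothetical names, and InferLane's real behavior may differ.

```python
def call_with_failover(providers, send):
    """Try each provider in order; return (provider, result) for the first success."""
    errors = {}
    for provider in providers:
        try:
            return provider, send(provider)
        except Exception as exc:  # in real code, catch narrower error types
            errors[provider] = exc
    raise RuntimeError(f"all providers failed: {errors}")


def fake_send(provider):
    """Stand-in for a real request; simulates an outage at one provider."""
    if provider == "Groq":
        raise ConnectionError("provider unavailable")
    return f"response from {provider}"


used, result = call_with_failover(["Groq", "DeepSeek", "Together"], fake_send)
print(used, "->", result)
```

Here the simulated Groq outage is skipped and the request lands on DeepSeek, the next provider in the list.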

Pricing API

Query live spot rates programmatically. Free, no auth required for read-only pricing data.

# Get spot rate for a model

curl "https://inferlane.dev/api/exchange/spot?model=gemma-4-27b&inputTokens=1000&outputTokens=500"
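
The same request can be made from Python with only the standard library. This is a minimal sketch: the endpoint and query parameters are taken from the curl example above, while the helper names are illustrative and the assumption that the response body is JSON is not confirmed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://inferlane.dev/api/exchange/spot"


def build_spot_url(model, input_tokens, output_tokens):
    """Build the spot-rate URL with the same parameters as the curl example."""
    query = urlencode(
        {"model": model, "inputTokens": input_tokens, "outputTokens": output_tokens}
    )
    return f"{BASE_URL}?{query}"


def fetch_spot_rate(model, input_tokens, output_tokens, timeout=10.0):
    """Fetch the spot rate; assumes (unverified) that the API returns JSON."""
    url = build_spot_url(model, input_tokens, output_tokens)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(build_spot_url("gemma-4-27b", 1000, 500))
```

Calling `fetch_spot_rate("gemma-4-27b", 1000, 500)` performs the network request; inspect the returned object before relying on any particular field, since the response schema is not documented in this section.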