PRICING
Join leading companies who trust LiteAPI to power their business

Prices are input + output, per 1M tokens.

| AI Model | Direct Integration | LiteAPI | Discount |
|---|---|---|---|
| Anthropic/Claude-Sonnet-4.5 | $9.00 | $5.40 | 40% |
| Anthropic/Claude-Haiku-4.5 | $6.00 | $3.60 | 40% |
| Anthropic/Claude-Opus-4.1 | $90.00 | $54.00 | 40% |
| Anthropic/Claude-Opus-4 | $90.00 | $54.00 | 40% |
| Anthropic/Claude-Sonnet-4 | $18.00 | $10.80 | 40% |
| Anthropic/Claude-3.7-Sonnet | $18.00 | $10.80 | 40% |
| Anthropic/Claude-3.7-Sonnet:thinking | $18.00 | $10.80 | 40% |
| Anthropic/Claude-3.5-Haiku | $4.80 | $2.88 | 40% |
| Anthropic/Claude-3.5-Sonnet | $18.00 | $10.80 | 40% |
| Anthropic/Claude-3-Haiku | $1.50 | $0.90 | 40% |
| Anthropic/Claude-3-Opus | $90.00 | $54.00 | 40% |
| OpenAI/Gpt-5-Chat | $11.25 | $6.75 | 40% |
| OpenAI/Gpt-5-Mini | $2.25 | $1.35 | 40% |
| OpenAI/Gpt-5-Nano | $0.45 | $0.27 | 40% |
| OpenAI/Gpt-Oss-120b | $0.45 | $0.27 | 40% |
| OpenAI/Gpt-Oss-20b | $0.17 | $0.10 | 40% |
| OpenAI/O1 | $75.00 | $45.00 | 40% |
| OpenAI/O1-Mini | $5.50 | $3.30 | 40% |
| OpenAI/O1-Pro | $750.00 | $450.00 | 40% |
| OpenAI/O3 | $10.00 | $6.00 | 40% |
| OpenAI/O3-Mini | $5.50 | $3.30 | 40% |
| OpenAI/O3-Mini-High | $5.50 | $3.30 | 40% |
| OpenAI/O3-Pro | $100.00 | $60.00 | 40% |
| OpenAI/O4-Mini | $5.50 | $3.30 | 40% |
| OpenAI/O4-Mini-High | $5.50 | $3.30 | 40% |
| OpenAI/Gpt-4.1 | $10.00 | $6.00 | 40% |
| OpenAI/Gpt-4.1-Mini | $2.00 | $1.20 | 40% |
| OpenAI/Gpt-4.1-Nano | $0.50 | $0.30 | 40% |
| OpenAI/Gpt-4o | $12.50 | $7.50 | 40% |
| OpenAI/Gpt-4o-Mini | $0.75 | $0.45 | 40% |
| Google/Gemini-2.5-Flash | $2.80 | $1.68 | 40% |
| Google/Gemini-2.5-Flash-Image | $2.80 | $1.68 | 40% |
| Google/Gemini-2.5-Flash-Lite | $0.50 | $0.30 | 40% |
| Google/Gemini-2.5-Pro | $11.25 | $6.75 | 40% |
| Google/Gemini-2.0-Flash-001 | $0.50 | $0.30 | 40% |
| Google/Gemini-2.0-Flash-Lite-001 | $0.375 | $0.225 | 40% |
| Google/Gemma-2-27b-It | $1.30 | $0.78 | 40% |
| Google/Gemma-2-9b-It | $0.12 | $0.072 | 40% |
| Google/Gemma-3-12b-It | $0.13 | $0.078 | 40% |
| Google/Gemma-3-27b-It | $0.25 | $0.15 | 40% |
| Google/Gemma-3-4b-It | $0.085 | $0.051 | 40% |
| Google/Gemma-3n-E4b-It | $0.06 | $0.036 | 40% |
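The LiteAPI column is simply the direct price with a flat 40% discount applied. As a quick illustration of the arithmetic (`liteapi_price` is a name made up here for the sketch, not part of any LiteAPI SDK):

```python
def liteapi_price(direct_price_per_1m: float, discount: float = 0.40) -> float:
    """Per-1M-token price after applying LiteAPI's flat discount."""
    return round(direct_price_per_1m * (1 - discount), 4)

# Spot-check a few rows from the pricing table:
for direct, expected in [(9.00, 5.40), (18.00, 10.80), (0.12, 0.072)]:
    assert abs(liteapi_price(direct) - expected) < 1e-9
print("all rows check out")
```

Note that some rows in the table round the discounted price to two decimals (e.g. $0.17 to $0.10), so a spot-check against those entries needs a coarser tolerance.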
FAQs
Frequently Asked Questions
Q: How can LiteAPI offer a 40% discount on inference?
A: LiteAPI is an AI aggregation platform first; cost savings are a powerful, but secondary, benefit. When we secure preferred contracts and credits from cloud partners, model providers, and VCs, we pass those inference savings directly to our customers. Because these deals can change over time, discounts are variable and may differ by provider, model, or period. If our underlying cost structure changes, we'll give at least 30 days' notice before adjusting prices.

Q: How is LiteAPI different from OpenRouter?
A: Unlike OpenRouter, LiteAPI focuses solely on production-grade models from OpenAI, Anthropic, and Google. It is an aggregation layer first, with variable discounts on inference (sometimes up to ~50%) when we secure preferred contracts and VC-backed credits.

Q: Is my data used for training?
A: Provider training is opt-out by default: where OpenAI, Anthropic, or Google support it, we set the flags so your data is not used to improve their models. All traffic is encrypted with TLS 1.3, and your keys are stored with AES-256 encryption. LiteAPI never logs or stores your prompts or completions, and never uses them for training or analytics.

Q: Are all model capabilities supported?
A: Yes. All major model capabilities (text, vision, embeddings, and function calling) are supported where available from the provider.

Q: Do you offer enterprise pricing?
A: Yes. Teams spending over $50,000/month on LLM usage can contact us for custom discounts and dedicated support.

Q: How much latency does LiteAPI add?
A: Our edge routing typically adds less than 15 ms on top of model latency. For most workloads, the cost savings far outweigh this overhead.
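Since LiteAPI aggregates OpenAI, Anthropic, and Google behind one API, a client call looks like an ordinary chat-completions request with the model name prefixed by its provider. A minimal sketch, assuming an OpenAI-compatible endpoint; the base URL and the `LITEAPI_KEY` environment variable are hypothetical placeholders, so check the actual docs before wiring this up:

```python
import json
import os

# Hypothetical endpoint -- substitute the real one from LiteAPI's docs.
BASE_URL = "https://api.liteapi.example/v1/chat/completions"

def build_request(model: str, prompt: str) -> tuple[dict, dict]:
    """Build headers and an OpenAI-style chat payload for one LiteAPI call."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('LITEAPI_KEY', 'sk-test')}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,  # provider-prefixed, e.g. "Anthropic/Claude-Sonnet-4.5"
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_request("OpenAI/Gpt-4o-Mini", "Hello!")
print(json.dumps(payload, indent=2))
# To send: wrap BASE_URL, json.dumps(payload).encode(), and headers
# in urllib.request.Request and pass it to urllib.request.urlopen.
```

Swapping providers is then a one-line change to the `model` field, which is the point of routing everything through a single API.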

CUT YOUR LLM SPEND BY 40%

One API. Faster integration. Lower cost.
Redeem $20 API Credit
Get Started – it’s free
