Catalog

Search the API model catalog directly.

Use the catalog when you already know the constraints you care about and just need a fast shortlist.
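Constraint-based shortlisting of the kind described above can be sketched in a few lines. This is an illustrative sketch only, not the catalog's actual API: the `Model` fields, the `shortlist` helper, and the sample entries (taken from the listing below) are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    input_usd_per_1m: float   # USD per 1M input tokens
    output_usd_per_1m: float  # USD per 1M output tokens
    context_tokens: int
    latency_ms: int

# A few entries from the catalog below, for illustration.
CATALOG = [
    Model("DeepSeek Chat", 0.28, 0.42, 128_000, 980),
    Model("Gemini 2.5 Flash", 0.30, 2.50, 1_000_000, 520),
    Model("Llama 3.3 70B Versatile", 0.59, 0.79, 131_100, 290),
    Model("Claude Sonnet 4", 3.00, 15.00, 200_000, 1200),
]

def shortlist(models, max_input_price=None, min_context=None, max_latency_ms=None):
    """Keep only models that satisfy every constraint that was given."""
    out = []
    for m in models:
        if max_input_price is not None and m.input_usd_per_1m > max_input_price:
            continue
        if min_context is not None and m.context_tokens < min_context:
            continue
        if max_latency_ms is not None and m.latency_ms > max_latency_ms:
            continue
        out.append(m)
    return out

# Example: cheap-input models that still answer quickly.
fast_and_cheap = shortlist(CATALOG, max_input_price=1.00, max_latency_ms=600)
print([m.name for m in fast_and_cheap])  # ['Gemini 2.5 Flash', 'Llama 3.3 70B Versatile']
```

Unset constraints are simply skipped, so the same helper covers a single-filter query and a fully constrained one.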

11 models matched. Showing page 1 of 1.
DeepSeek · DeepSeek Chat — Cheapest

Low-cost reasoning-oriented API model with strong price-performance for coding workloads.

Pricing: Official (live) · Benchmarks: Third-Party · Synced within 7d · Coverage: full (applies to every entry below)

Input: $0.28/1M · Output: $0.42/1M · Context: 128K tokens · Latency: 980 ms
Google Gemini · Gemini 2.5 Flash — Cheapest

Google’s high-speed API model for lower-latency product flows and agents.

Input: $0.30/1M · Output: $2.50/1M · Context: 1M tokens · Latency: 520 ms
xAI · Grok 3 Mini — Cheapest

Cost-efficient xAI reasoning model positioned for speed and general agent tasks.

Input: $0.30/1M · Output: $0.50/1M · Context: 131.1K tokens · Latency: 610 ms
OpenAI · GPT-4.1 mini — Cheapest

Cost-efficient OpenAI model for high-throughput agent and app workloads.

Input: $0.40/1M · Output: $1.60/1M · Context: 1M tokens · Latency: 780 ms
Mistral · Mistral Large — Cheapest

High-capability Mistral model for enterprise-grade reasoning and multilingual output.

Input: $0.50/1M · Output: $1.50/1M · Context: 128K tokens · Latency: 920 ms
Groq · Llama 3.3 70B Versatile — Fastest

Groq-hosted fast serving tier for Meta’s large open model family.

Input: $0.59/1M · Output: $0.79/1M · Context: 131.1K tokens · Latency: 290 ms
Anthropic · Claude Haiku 3.5 — Cheapest

Fast Claude model for lightweight app flows and retrieval-heavy workloads.

Input: $0.80/1M · Output: $4.00/1M · Context: 200K tokens · Latency: 640 ms
Google Gemini · Gemini 2.5 Pro — Smartest

Google’s top-tier reasoning and multimodal-capable API model for complex tasks.

Input: $1.25/1M · Output: $10.00/1M · Context: 1M tokens · Latency: 1500 ms
OpenAI · GPT-4.1 — Smartest

Flagship general-purpose model tuned for production-grade reasoning and tool use.

Input: $2.00/1M · Output: $8.00/1M · Context: 1M tokens · Latency: 1400 ms
Cohere · Command A — Fastest

Cohere’s enterprise-oriented generation model tuned for agentic workflows and RAG.

Input: $2.50/1M · Output: $10.00/1M · Context: 128K tokens · Latency: 1010 ms
Anthropic · Claude Sonnet 4 — Coding

Balanced Claude model optimized for production coding and long-context tasks.

Input: $3.00/1M · Output: $15.00/1M · Context: 200K tokens · Latency: 1200 ms
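All prices in the catalog are quoted per 1M tokens, so the cost of a single request is simple arithmetic over its input and output token counts. A minimal sketch, using the listed GPT-4.1 mini rates ($0.40 in / $1.60 out); the token counts and the `request_cost` helper are illustrative assumptions, not part of the catalog.

```python
def request_cost(input_tokens, output_tokens, in_price_per_1m, out_price_per_1m):
    """Cost in USD for one request at the given per-1M-token rates."""
    return (input_tokens * in_price_per_1m + output_tokens * out_price_per_1m) / 1_000_000

# Hypothetical request: 20K prompt tokens in, 1K completion tokens out,
# at GPT-4.1 mini's listed rates of $0.40/1M input and $1.60/1M output.
cost = request_cost(20_000, 1_000, 0.40, 1.60)
print(f"${cost:.4f}")  # $0.0096
```

Note how input volume dominates here despite the 4x higher output rate, which is why prompt-heavy workloads tend to sort the catalog by input price first.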