Glossary · Claude Models

Claude Model Tiers

Claude Models 新手

30-Second Version · For the impatient

<a href="/en/glossary/core-concepts/anthropic/">Anthropic</a> designs Claude in three tiers: Haiku (fastest and cheapest), Sonnet (best balance of capability and cost), and Opus (most capable but most expensive). Each suits different task types — it's not 'more expensive is always better.' Sonnet handles 90% of everyday tasks; Haiku suits high-frequency simple tasks; only a small number of tasks requiring extreme deep reasoning need Opus.

Full Explanation +

01 · What is this?

Why did Anthropic design three models instead of one strongest all-purpose model?

Three-tier design isn't a marketing strategy — it reflects fundamental trade-offs in AI system design: balancing compute resources, response speed, and output quality.

More capable models need more compute: Opus 4's reasoning is significantly stronger than Haiku's, but it consumes much more compute — longer response time (2-4 seconds vs 200-400ms), higher cost (~20×). Using "maximum capability" on "tasks not requiring maximum capability" wastes compute resources.

Most tasks don't need maximum capability: "translate this text to English" doesn't need Opus 4's deep reasoning; "classify these 100 emails into technical/business/personal" achieves 95% accuracy with Haiku.

Three-tier structure lets you configure by need: with the same budget, all-Sonnet serves X users; shifting 60% of simple requests to Haiku and upgrading 5% of hard requests to Opus might serve 3× the users with output quality better matched to each task type.

02 · Why does it exist?

What are the differences between Claude 3 series and Claude 4 series tier gradients? What to watch for when upgrading?

Claude 3 era gradient: Claude 3 Haiku, Sonnet, Opus each had clear capability gaps. Claude 3 Opus was then the most capable; many complex tasks required Opus.

Most important Claude 4 era change: Sonnet 4.5's capability has surpassed Claude 3 Opus. If you're still using Claude 3 Opus, switching to Sonnet 4.5 may give better results at lower cost.

Practical upgrade recommendations: if using claude.ai, you're already on the latest version automatically. If using API, swap claude-3-opus-20240229 for claude-sonnet-4-5 and test for three days. For your most common task types, compare output quality — Sonnet 4.5 is likely sufficient at 1/5 the cost.

Common trap: many developers formed the habit of "Opus for important tasks" in the Claude 3 era. In the Claude 4 era, "try Sonnet first" should be the new default habit.

03 · How does it affect your decisions?

Is there a simple decision framework for quickly choosing between Haiku, Sonnet, and Opus?

Three questions, asked in order:

Question 1: Does this task require reasoning? 'Requires reasoning' means the task isn't just lookup or transformation — it needs analysis, judgment, or multi-step logical derivation. If no reasoning needed (classification, format conversion, keyword extraction) → choose Haiku. If reasoning needed → continue to Question 2.

Question 2: Have you tried Sonnet? Were you satisfied with the result? For most reasoning tasks, try Sonnet 4.5 first. If output quality is satisfying → stay with Sonnet. If Sonnet's output is unsatisfying → continue to Question 3.

Question 3: How high is the cost of getting this wrong? If high-stakes (high-risk legal analysis, critical system architecture design) AND Sonnet genuinely insufficient → upgrade to Opus 4. If stakes are lower → continue optimizing prompts on Sonnet, usually more efficient than upgrading models.

Memory shortcut: simple tasks → Haiku; most tasks → Sonnet; high-stakes + Sonnet insufficient → Opus.

04 · What should you do?

How do you use multiple models in one application (tiered routing)? What are the concrete benefits?

Tiered routing is one of the most effective cost optimization strategies in production: instead of all requests using one model, dynamically route requests to the appropriate model based on complexity.

Basic architecture: Step 1 (Haiku): classify each incoming request using Haiku — is this "simple," "medium," or "high difficulty"? Haiku classification is fast (<500ms) and cheap ($0.0008/call). Step 2 (routing): route based on classification — simple requests handled directly by Haiku; medium to Sonnet; high difficulty to Opus.

Actual benefits: assuming 10,000 daily requests split 60% simple / 35% medium / 5% high difficulty. All-Sonnet cost: ~$30/day. Tiered routing: Haiku handles 60% ≈ $4.8, Sonnet handles 35% ≈ $10.5, Opus handles 5% ≈ $7.5, total ≈ $22.8/day. 24% cost reduction, AND complex tasks get a stronger model.

Implementation notes: classification prompt design matters (avoid misclassifying medium-difficulty tasks as simple); requires additional engineering maintenance (three-model routing logic, error handling); for developers just starting, build features well with one model first, then consider adding tiered routing.

Real-World Example +

A legal tech company evaluating how to use a three-tier model architecture for their contract analysis product:

Their product processes ~500 contracts daily, each following: extract key clauses (classification task) → assess legal risk of each clause (reasoning task) → generate complete risk analysis report (long-form generation task).

Model allocation decisions: "Extract key clauses" — pattern recognition (finding parties, amounts, dates, specific clause types), no legal reasoning needed → Haiku 4.5: fast, cheap, sufficient accuracy. "Assess legal risk" — requires legal knowledge and reasoning (jurisdictional impact of clauses, conflicts between clauses) → Sonnet 4.5 + Extended Thinking first; upgrade to Opus 4 only for cross-border contracts or unusually complex clauses. "Generate analysis report" — long-form generation integrating previous analysis → Sonnet 4.5: good long-form writing quality at reasonable cost.

Results: per-contract processing cost drops from all-Opus $2.40 to ~$0.85, 65% cost reduction, while the step requiring highest quality reasoning (legal risk assessment) actually uses the strongest model.

Diagram

Feel free to share. Please credit the source.

Common Misconceptions +

✕ Misconception 1

× Misconception 1: More expensive models are better at all tasks; always use the most expensive. This is the most outdated belief in the Claude 4 era. Sonnet 4.5's performance gap vs Opus 4 on most everyday tasks is minimal — writing, analysis, code, Q&A, Sonnet 4.5 is usually sufficient. Opus 4's genuine advantages concentrate in the small number of tasks needing very long reasoning chains or extreme precision. 'Use the most expensive model to guarantee the best results' wastes costs without guaranteeing quality.

✕ Misconception 2

× Misconception 2: Haiku is a 'feature-limited version' only suitable for low-capability requirements. Haiku isn't 'discounted Claude' — it's a specialized tool optimized for specific task types. On its designed task types (classification, extraction, format conversion), Haiku's performance gap vs Sonnet is minimal, while being 3-5× faster and 1/4 the cost. Saying Haiku has 'low capability' is like saying a sprinter 'isn't good at weightlifting' — not a capability issue but a task-fit issue.

The Missing Link +

Direct Impact

Three-tier model gradient's core trade-off: flexibility vs complexity. Using one model (all Sonnet) is simplest — no need to think about task classification or maintain routing logic. Using three models is most efficient — lower costs, best model for each task type — but requires more design and engineering investment. For developers just starting, one model is easier; for cost-sensitive production environments, tiered routing savings usually justify engineering investment. Which strategy to choose depends on your priorities: fast launch vs cost optimization.

Ask a Question

Related Terms

Useful Resources

Claude API Status → Model Pricing → Prompt Playground → Token Counter → MCP Servers → LLM Benchmarks → Model Comparison →