Extended Thinking

30-Second Version · For the impatient

<a href="/en/glossary/workspace-basics/extended-thinking/" target="_blank">Extended Thinking</a> is Claude's ability to work through reasoning in a hidden scratchpad before delivering its final answer. You set a thinking budget (a <a href="/en/glossary/core-concepts/token/">Token</a> ceiling), and Claude reasons freely within it, exploring and backtracking, before producing the output you see. The thinking itself is usually not visible, but it lets Claude handle complex problems with deeper, more structured reasoning, much like working a problem out on scratch paper before speaking.

Full Explanation

01 · What is this?

What goes wrong if I set the thinking budget too high?

Two things: time and money. Thinking tokens are billed like output tokens, so a higher budget can mean higher cost, though the budget is a ceiling, not a guaranteed spend. The more immediate effect is latency: Claude begins its final answer only after it finishes thinking, so the more of the budget it uses, the longer you wait. A complex task with a 100K-token budget can take several times longer than the same question without Extended Thinking.

Size the budget to difficulty: quick questions need no extended thinking; moderate analysis or coding tasks get a few thousand tokens; true multi-step math or long reasoning chains get tens of thousands. Maxing the budget for everything slows fast tasks and costs more unnecessarily.
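The sizing guidance above can be sketched as a small lookup that builds the thinking part of a request. This is an illustration under stated assumptions, not an official recipe: the tier names and token counts are hypothetical, picked only to match "a few thousand" and "tens of thousands", and the dictionary shape (`type` plus `budget_tokens`, with the budget counted inside `max_tokens`) is assumed to follow the Messages API `thinking` parameter. Check the current API reference before relying on it.

```python
from typing import Optional

# Hypothetical tiers and token counts, chosen only to mirror the guidance above.
BUDGET_BY_TIER = {
    "quick": None,      # no extended thinking at all
    "moderate": 4_000,  # "a few thousand tokens"
    "hard": 32_000,     # "tens of thousands"
}


def thinking_config(tier: str) -> Optional[dict]:
    """Return a thinking config for the tier, or None to leave thinking off."""
    budget = BUDGET_BY_TIER[tier]
    if budget is None:
        return None
    # Assumed shape of the API's `thinking` parameter.
    return {"type": "enabled", "budget_tokens": budget}


def request_kwargs(tier: str, answer_tokens: int = 2_000) -> dict:
    """Build request keyword arguments that leave room for the final answer."""
    kwargs = {"max_tokens": answer_tokens}
    config = thinking_config(tier)
    if config is not None:
        kwargs["thinking"] = config
        # Assumption: the budget counts toward max_tokens, so raise the cap.
        kwargs["max_tokens"] = config["budget_tokens"] + answer_tokens
    return kwargs
```

Routing each request through a helper like `request_kwargs("moderate")` keeps the budget decision in one place, so quick questions never pick up a large ceiling by accident.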

02 · Why does it exist?

What observable differences does Extended Thinking actually make?

The most direct effect is more complete structure and layered reasoning. With Extended Thinking on, Claude is more likely to decompose a complex question into sub-problems, reason through each, and then integrate the results, rather than jumping straight to an answer or glossing over a step that sounds plausible but was never actually justified.

Another difference shows up on problems with traps. If a question has a non-obvious twist, a model without thinking time is more likely to hit the common error; a model with scratch space is more likely to catch the wrong path during reasoning and self-correct before the final answer. The difference grows with problem complexity and number of steps.

03 · How does it affect your decisions?

If I can't see the thinking, how do I know Claude is actually reasoning?

Even when the thinking isn't shown to you, there are indirect signals. First, the structure and depth of the final answer: if the output has clear step decomposition, handles edge cases, and explains why certain approaches don't work, the thinking was usually doing something useful. Second, some API configurations allow streaming the thinking or showing a summary, which you can use for spot-checking.

A more practical method: run the same complex question with and without Extended Thinking and compare output quality and error rate. If your task type is consistent, a few comparisons build intuition about how much budget pays off for that type.
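The with-and-without comparison can be sketched as a tiny harness. Everything here is hypothetical scaffolding: `ask_plain` and `ask_thinking` stand for whatever functions you write to call the model with thinking off and on, and exact string matching is only a stand-in for your own grading logic.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# (prompt, expected answer) pairs; supply your own graded cases.
Case = Tuple[str, str]


def error_rate(ask: Callable[[str], str], cases: Iterable[Case]) -> float:
    """Fraction of cases where the answer does not match the expected one."""
    case_list: List[Case] = list(cases)
    if not case_list:
        raise ValueError("need at least one case")
    wrong = sum(1 for prompt, expected in case_list
                if ask(prompt).strip() != expected)
    return wrong / len(case_list)


def compare(ask_plain: Callable[[str], str],
            ask_thinking: Callable[[str], str],
            cases: Iterable[Case]) -> Dict[str, float]:
    """Run the same cases through both callables and report error rates."""
    case_list = list(cases)
    plain = error_rate(ask_plain, case_list)
    thinking = error_rate(ask_thinking, case_list)
    return {"plain": plain, "thinking": thinking,
            "improvement": plain - thinking}
```

A positive `improvement` on a consistent task type suggests the budget is paying off. A value near zero suggests the task does not need it.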

04 · What should you do?

Advanced: when does Extended Thinking actually not help?

A few genuine edge cases. First, the task is knowledge retrieval rather than reasoning: if you're asking for a fact that can simply be looked up, thinking space doesn't help Claude know things outside its training; it only delays the output. Second, the task needs current information: Extended Thinking enhances reasoning, not knowledge recency, so if the answer postdates the Knowledge Cutoff, no amount of thinking tokens will help.

A subtler third case: very large budgets on certain tasks can cause over-deliberation — the model, having reached the right answer, continues exploring alternatives and occasionally talks itself into a wrong one. For problems with a clear optimal answer, a medium budget tends to be more stable than the maximum.

Real-World Example

To see the difference with and without thinking, ask Claude to solve an algorithm problem: find the longest increasing path in a matrix.

Without Extended Thinking, Claude might give a BFS solution that looks correct but fails on an edge case such as all elements being equal.

With a 10K-token thinking budget, Claude works through DFS+memoization in its scratchpad, confirms which direction counts as increasing, tests edge cases internally, and outputs a solution with complete boundary handling. The final code is not shorter, but the edge-case bug that would have failed a test is gone.
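For reference, here is one common DFS-with-memoization formulation of the problem. It is a sketch, not a transcript of Claude's output, and it assumes details the example leaves open: moves are up, down, left, and right, and "increasing" means strictly increasing.

```python
from functools import lru_cache
from typing import List


def longest_increasing_path(matrix: List[List[int]]) -> int:
    """Length of the longest strictly increasing path using 4-directional moves."""
    if not matrix or not matrix[0]:
        return 0
    rows, cols = len(matrix), len(matrix[0])

    @lru_cache(maxsize=None)
    def dfs(r: int, c: int) -> int:
        best = 1  # the cell itself is a path of length 1
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # Strict ">" means equal neighbours never extend a path,
            # so an all-equal matrix cannot cause infinite recursion.
            if (0 <= nr < rows and 0 <= nc < cols
                    and matrix[nr][nc] > matrix[r][c]):
                best = max(best, 1 + dfs(nr, nc))
        return best

    return max(dfs(r, c) for r in range(rows) for c in range(cols))
```

The strict comparison is the boundary handling that matters for the all-equal case: every cell then has a longest path of exactly 1. For very large matrices the recursion depth may exceed Python's default limit, so an iterative variant would be safer there.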

Common Misconceptions

✕ Misconception 1

Myth: Extended Thinking makes Claude smarter, so use it for everything. Extended Thinking deepens reasoning but doesn't give Claude more facts or newer data. Asking about today's weather with a large thinking budget just slows the answer and costs more. Its value shows up only on reasoning-intensive tasks.

✕ Misconception 2

Myth: The thinking budget is a fixed charge you always pay in full. The budget is a ceiling, not a guaranteed spend. A simple question given a 20K-token budget may finish in 2K tokens. You pay for actual usage, not the cap, so setting a generous ceiling for complex tasks doesn't mean overpaying on simple ones.
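The ceiling-versus-spend point can be checked with simple arithmetic. In this sketch the price is a made-up placeholder rather than a real rate, and the function covers only the thinking portion of a bill.

```python
def thinking_cost_usd(tokens_used: int, budget: int,
                      usd_per_million: float) -> float:
    """Cost of the thinking portion: billed on tokens used, capped by the budget."""
    if tokens_used < 0 or budget < 0:
        raise ValueError("token counts must be non-negative")
    billed = min(tokens_used, budget)
    return billed * usd_per_million / 1_000_000


PRICE = 10.0  # hypothetical USD per million output tokens, not a real price

# A simple question that used 2K tokens of a 20K-token budget.
actual = thinking_cost_usd(2_000, 20_000, PRICE)
# The worst case, if the full budget had been consumed.
ceiling = thinking_cost_usd(20_000, 20_000, PRICE)
```

Under these assumed numbers the simple question costs a tenth of the worst case, even though both requests were sent with the same 20K-token budget.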

✕ Misconception 3

Myth: Not seeing the thinking means Claude might be faking it. The thinking tokens are genuinely computed; they just aren't shown by default. The invisibility is a design choice: the scratchpad contains half-formed ideas and dead ends that aren't meaningful to end users. You need the final organized answer, not a transcript of Claude rejecting its first three attempts.

The Missing Link

Direct Impact

Extended Thinking's core trade-off is reasoning quality vs speed and cost.

With it on, Claude can work through complex problems before answering, improving quality and reliability for multi-step logic, math, and deep analysis. The cost is higher latency and higher spend.

Without it, responses are fast and cheap — sufficient for direct Q&A, knowledge lookups, and creative tasks that don't require deep reasoning chains. The test: would an expert need a whiteboard to give a reliable answer to this? If yes, Extended Thinking has value; if no, it just makes you wait longer.
