Independent Media
Not affiliated with any project
Exploring the Frontier of AI Intelligence
claude-me.com

Token

Core Concepts · Beginner

30-Second Version · For the impatient
The basic unit AI uses to process text — roughly 3/4 of an English word, or about 1-2 tokens per Chinese character.
Full Explanation
01 · What is this?
A token is the smallest unit a language model uses to process text. It doesn't equal a character, a word, or a sentence. English tokenization roughly follows this logic: common words are one token (cat = 1 token), while uncommon words get split (tokenization = 4 tokens). Chinese characters are typically 1-2 tokens each, because tokenizers were historically optimized for English.

The most practical reason to understand tokens is cost: Anthropic API billing is by token, not by character count and not by message count. The tokens in your prompt plus the tokens in Claude's response determine the cost of that API call. A 1,000-character Chinese article is roughly 1,500-2,000 tokens; the same information in English is about 1,300 tokens, so Chinese costs roughly 20-50% more than English.
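The rules of thumb above can be turned into a rough estimator. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens`, the regular expressions, and the decision to count only the basic CJK ideograph block are all assumptions made for illustration, and actual counts will differ.

```python
import re

# Rules of thumb from the text; real counts depend on the model's tokenizer.
TOKENS_PER_ENGLISH_WORD = 4 / 3      # one token is roughly 3/4 of an English word
TOKENS_PER_CJK_CHAR = (1.0, 2.0)     # low and high estimate per Chinese character

CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")   # basic CJK Unified Ideographs block only
WORD_PATTERN = re.compile(r"[A-Za-z0-9']+")    # crude English word matcher


def estimate_tokens(text: str) -> tuple[int, int]:
    """Return a (low, high) token estimate for mixed English/Chinese text."""
    cjk_chars = len(CJK_PATTERN.findall(text))
    words = len(WORD_PATTERN.findall(text))
    english = words * TOKENS_PER_ENGLISH_WORD
    low = english + cjk_chars * TOKENS_PER_CJK_CHAR[0]
    high = english + cjk_chars * TOKENS_PER_CJK_CHAR[1]
    return round(low), round(high)
```

With these assumptions, 75 English words come out at about 100 tokens, and a four-character Chinese string at 4 to 8 tokens. Use it only for ballpark planning; an official token counter is the place to get real numbers.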
02 · Why does it exist?
At its core, a language model operates on vectors (embeddings), and the input unit is a token, not a character. If whole words were used as units, the vocabulary would explode in size; if single characters were used, every text would become a much longer sequence, making it harder for the model to learn effectively. Tokenization is an engineering trade-off: using common subword units allows processing of all languages while keeping the vocabulary manageable (typically 50,000-100,000 tokens). English has the most training data, so tokenizers are most efficient for English; CJK languages have lower token efficiency, which directly affects API costs.
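To make the subword idea concrete, here is a toy greedy longest-match splitter. It is a sketch under stated assumptions: the vocabulary `TOY_VOCAB` is hand-picked for illustration, and the resulting pieces are not the split any real tokenizer produces. Production tokenizers typically learn their vocabulary from data (for example with byte-pair encoding) rather than using a fixed list like this.

```python
def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split; unknown characters become single-character tokens."""
    tokens = []
    i = 0
    max_len = max(len(piece) for piece in vocab)
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocabulary entry matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabulary: one common whole word plus a few subword pieces.
TOY_VOCAB = {"cat", "token", "iz", "at", "ion"}

print(tokenize("cat", TOY_VOCAB))           # common word stays whole
print(tokenize("tokenization", TOY_VOCAB))  # rarer word is split into pieces
```

The design point is the trade-off itself: a small vocabulary of reusable pieces can cover words it has never seen as a whole, at the price of spending more tokens on them.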
03 · How does it affect your decisions?
Token counts affect three key decisions:
- Cost estimation: Before building an API application, estimate the tokens in your prompt and in Claude's expected response, then multiply by the API unit price. Don't use raw character counts. For Chinese text, character count times 1.5-2 gives a closer approximation.
- Context Window allocation: Every token in your System Prompt is billed again on every API call. A 2,000-token System Prompt times 10,000 calls per day = 20 million input tokens per day just from the System Prompt. Streamlining it is the most direct cost optimization lever.
- Language selection: If your application is bilingual, the same content in English uses 20-50% fewer tokens than in Chinese. Many developers write System Prompts in English even for Chinese-facing products, which is a reasonable cost-efficiency decision.
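The System Prompt arithmetic above can be checked with a few lines. This is a minimal sketch: the function names are made up for illustration, and `ASSUMED_INPUT_USD_PER_MTOK` is a placeholder price, not a quoted rate, so substitute the current price for your model.

```python
ASSUMED_INPUT_USD_PER_MTOK = 3.00  # assumption: placeholder price per million input tokens


def daily_system_prompt_tokens(system_prompt_tokens: int, calls_per_day: int) -> int:
    """Input tokens spent per day on the System Prompt alone."""
    return system_prompt_tokens * calls_per_day


def daily_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a daily token volume into dollars at a given per-million price."""
    return tokens / 1_000_000 * usd_per_million_tokens


before = daily_system_prompt_tokens(2_000, 10_000)   # the 20 million figure from the text
after = daily_system_prompt_tokens(800, 10_000)      # hypothetical trimmed prompt
print(before, daily_cost_usd(before, ASSUMED_INPUT_USD_PER_MTOK))
print(after, daily_cost_usd(after, ASSUMED_INPUT_USD_PER_MTOK))
```

Because the overhead scales linearly with both prompt length and call volume, trimming the prompt has the same effect on this line item as cutting traffic by the same proportion.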
04 · What should you do?
Three immediately actionable steps:
- Use Anthropic's token counting tool: In the Console Workbench, entering your prompt directly shows the token count. Calculate your cost structure before deploying, so you don't discover unexpected costs after launch.
- Compress your System Prompt: Remove redundant explanations and examples. Replace descriptive text with directive text. A short directive such as "Reply in Traditional Chinese" uses less than half the tokens of a verbose equivalent instruction.
- Track token usage: Log the input/output token counts from the usage field in every API response. When a conversation's input tokens exceed a preset threshold, prompt the user to start a new conversation, or automatically compress the conversation history.
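The tracking step could look roughly like the sketch below. `UsageTracker` and its threshold are hypothetical names introduced here; the only thing taken from the API is that each response reports input and output token counts in its usage field, which you would pass in yourself.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Logs per-call token usage and flags conversations that grow too large."""
    input_threshold: int
    history: list[tuple[int, int]] = field(default_factory=list)

    def record(self, input_tokens: int, output_tokens: int) -> bool:
        """Log one call; return True when this call's input exceeds the threshold."""
        self.history.append((input_tokens, output_tokens))
        return input_tokens > self.input_threshold

    def totals(self) -> tuple[int, int]:
        """Return (total input tokens, total output tokens) across logged calls."""
        return (
            sum(inp for inp, _ in self.history),
            sum(out for _, out in self.history),
        )


tracker = UsageTracker(input_threshold=4_000)
# In a real application these numbers would come from the usage field of each response.
for input_tokens, output_tokens in [(3_250, 200), (4_500, 200)]:
    if tracker.record(input_tokens, output_tokens):
        print("Threshold exceeded: suggest a new conversation or compress history")
```

Checking the latest call's input rather than the running total is a deliberate choice here: input size per call is what reflects how much history is being re-sent.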
Real-World Example
Scenario: Building an AI customer service chatbot processing 10,000 conversations per day.
Initial setup: System Prompt 3,000 tokens, average conversation 5 turns (50 user + 200 Claude tokens per turn), using claude-sonnet-4-5.
Cost calculation:
- Input tokens per conversation: System Prompt (3,000) + 5-turn history (1,250) = 4,250 tokens
- Output tokens per conversation: 5 replies × 200 = 1,000 tokens
- Cost per conversation: ~$0.028; daily: ~$280; monthly: ~$8,400
After System Prompt optimization (compressed from 3,000 to 800 tokens):
- Monthly cost: ~$6,300
- Monthly savings: ~$2,100 (~25% reduction)
These figures are consistent with rates of $3 per million input tokens and $15 per million output tokens. Note that the estimate counts the System Prompt once per conversation. If each turn is sent as a separate API call, the System Prompt and the earlier turns are re-sent every time, so both the real input total and the savings from compression are larger.
Key insight: The System Prompt is a cost multiplier, because it gets billed again on every single call. Compressing it is the most direct cost optimization and requires no architectural changes.
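The scenario's arithmetic can be reproduced as follows. This is a sketch under assumptions: the two price constants are values that happen to reproduce the figures above, not rates quoted by the source, and `resend_conversation` is an added variant that models one API call per turn with the full history re-sent each time.

```python
INPUT_USD_PER_MTOK = 3.00    # assumption: price per million input tokens
OUTPUT_USD_PER_MTOK = 15.00  # assumption: price per million output tokens


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost for a given number of input and output tokens."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


def simplified_conversation(system_tokens, turns=5, user_tokens=50, reply_tokens=200):
    """The scenario's estimate: System Prompt counted once, plus all turn text as input."""
    input_tokens = system_tokens + turns * (user_tokens + reply_tokens)
    return input_tokens, turns * reply_tokens


def resend_conversation(system_tokens, turns=5, user_tokens=50, reply_tokens=200):
    """One API call per turn, each re-sending the System Prompt and all earlier turns."""
    input_tokens = 0
    history = 0
    for _ in range(turns):
        input_tokens += system_tokens + history + user_tokens
        history += user_tokens + reply_tokens
    return input_tokens, turns * reply_tokens


for system_tokens in (3_000, 800):
    for model in (simplified_conversation, resend_conversation):
        per_conversation = cost_usd(*model(system_tokens))
        monthly = per_conversation * 10_000 * 30   # 10,000 conversations/day, 30 days
        print(system_tokens, model.__name__, round(per_conversation, 5), round(monthly))
```

Under the per-turn model the System Prompt is paid for five times per conversation, which is why compression saves more there than the simplified estimate suggests.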
Diagram
Token as a billing unit: how text is split before processing. English: "Hello, world!" = 4 tokens; 100 words ≈ 130 tokens; cost baseline $1.00. CJK: a short sample = 5-7 tokens; 100 characters ≈ 100-200 tokens; cost $1.30-2.00 relative to English. CJK uses roughly 30-50% more tokens than English for the same information. API cost = input tokens × input price + output tokens × output price. Source: Claude Me · claude-me.com
Feel free to share. Please credit the source.
Common Misconceptions
✕ Misconception 1
Myth: Tokens equal word/character count, so you can estimate costs using character counts.
Reality: 100 English words is about 130 tokens (one token is roughly 3/4 of a word); 100 Chinese characters is about 100-200 tokens. If you treat one Chinese character as one token, the real cost can come out 50-100% above your estimate.
✕ Misconception 2
Myth: Input tokens are cheaper than output tokens, so a long System Prompt is fine.
Reality: Every token in your System Prompt is billed again on every API call. A 3,000-token System Prompt times 10,000 calls per day = 30 million input tokens per day just from the System Prompt.
The Missing Link
Direct Impact
Token efficiency vs. expression completeness: Compressing prompts saves cost but may lose important context. Every compression requires testing. The goal is not to be as short as possible, but to achieve the required behavior with the minimum tokens.

Chinese vs. English prompts: English costs less, but a Chinese System Prompt lets Claude respond more naturally in Chinese, typically producing better output quality. The cost-quality trade-off requires evaluation based on your specific scenario.