Glossary · core-concepts

LLM (Large Language Model)

core-concepts · Beginner

30-Second Version · For the impatient

An AI model trained on massive text data with the ability to understand and generate human language. Claude, GPT-4, and Gemini are all LLMs. "Large" refers to the enormous number of model parameters, not the length of text output.

Full Explanation

01 · What is this?

An LLM (Large Language Model) is a type of AI model trained on massive amounts of text and built primarily to understand and generate human language. Claude, GPT-4o, Google's Gemini, and Meta's LLaMA are all LLMs.

"Large" refers to the model's parameter count: billions or even hundreds of billions of parameters (think of them as the model's "adjustable dials"), all tuned during training so the model can learn language patterns.
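To give a feel for what "billions of parameters" means in practice, here is a rough back-of-the-envelope sketch. It assumes each parameter is stored as a 16-bit number (2 bytes), which is a common choice but not a universal one, and the parameter counts are illustrative round figures rather than the size of any particular model.

```python
def parameter_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate memory needed just to hold the weights, in gigabytes (10**9 bytes).

    bytes_per_parameter=2 is an assumption (16-bit storage), not a fixed rule.
    """
    return num_parameters * bytes_per_parameter / 1e9


# Illustrative sizes only; these are not the sizes of any specific product.
for label, count in [("7 billion", 7_000_000_000), ("70 billion", 70_000_000_000)]:
    print(f"{label} parameters: about {parameter_memory_gb(count):.0f} GB of weights")
```

Under these assumptions, tens of billions of "dials" already take more memory than a typical laptop has, which is one reason the source lists high resource consumption among the limitations.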

An LLM's core operating logic is "predicting the next token": given the preceding text, the model calculates a probability for every possible next token, picks one (the most probable, or a sample drawn from that distribution), appends it, and repeats until the response is complete. This mechanism explains many LLM behaviors, including why it is sometimes accurate and sometimes "hallucinates."
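The next-token loop can be sketched with a toy example. Everything here is made up for illustration: the vocabulary, the scores in `TOY_LOGITS`, and the simplification that the "model" only looks at the previous token (a real LLM conditions on the whole preceding text). The sketch uses greedy decoding, meaning it always takes the most probable token.

```python
import math

# Hypothetical toy "model": for each previous token, made-up scores for the next token.
TOY_LOGITS = {
    "<start>": {"the": 2.0, "a": 1.0},
    "the": {"cat": 1.5, "dog": 1.2, "weather": 0.3},
    "a": {"cat": 1.0, "dog": 1.1},
    "cat": {"sleeps": 2.0, "runs": 0.5},
    "dog": {"runs": 1.8, "sleeps": 0.7},
    "weather": {"changes": 1.0},
    "sleeps": {"<end>": 1.0},
    "runs": {"<end>": 1.0},
    "changes": {"<end>": 1.0},
}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def generate_greedy(max_tokens=10):
    """Build a response one token at a time, always taking the most probable token."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = softmax(TOY_LOGITS[tokens[-1]])
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the <start> marker


print(" ".join(generate_greedy()))
```

Note that nothing in the loop checks whether the output is true. It only follows the scores, which is why the same mechanism can produce both accurate text and confident errors.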

02 · Why does it exist?

LLMs trace their origins to language modeling research. The 2017 paper "Attention Is All You Need" introduced the Transformer architecture, whose self-attention mechanism lets the model weigh every part of the input sequence at once, producing a leap in capability. Researchers then observed that making Transformers larger and feeding them more data can produce "emergence": at sufficient scale, the model begins doing things its smaller versions couldn't do well, such as multi-step logical reasoning and code writing. Claude was developed by Anthropic in this context, with a focus on building powerful models that remain safe and aligned with human values.

03 · How does it affect your decisions?

Understanding what an LLM is directly affects how you use Claude:

It explains where hallucination comes from: LLMs output "a statistically probable text sequence," not "the correct answer." When they lack sufficient information, they often don't say "I don't know." Instead they output the most plausible-sounding answer, even if it is wrong.

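It explains why "same question, different answers" is normal: because each token is sampled from a probability distribution, outputs can vary across runs. Setting Temperature to 0 makes the output far more consistent, though small differences can still occur.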
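A minimal sketch of how Temperature changes the picture, using made-up scores for three candidate tokens. Treating Temperature 0 as "always take the top token" is a common convention and an assumption here; how a given API handles it internally may differ.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Divide scores by the temperature, then convert them to probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; 0 is handled as argmax below")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_index(logits, temperature, rng):
    """Pick a token index; temperature 0 is treated as greedy (assumed convention)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


tokens = ["park", "walk", "hiking"]  # hypothetical candidate completions
logits = [2.0, 1.5, 0.5]             # made-up scores
rng = random.Random(0)               # fixed seed so the demo is repeatable

for temperature in (0, 0.5, 1.0):
    picks = [tokens[sample_index(logits, temperature, rng)] for _ in range(8)]
    print(temperature, picks)
```

Lower temperatures concentrate probability on the top token, so repeated runs agree more often. Higher temperatures spread it out, so repeated runs differ more.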

Most important understanding: LLMs are language generation tools, not "truth machines." Leverage their reasoning and generation capabilities — but always verify important information.

04 · What should you do?

Once you understand LLMs, a few concrete practices follow for daily Claude use:

Always verify important information: Claude outputs the highest-probability answer, not a guaranteed-correct one. Medical, legal, financial decisions — always verify through other channels.
Treat Claude as a "thinking partner," not an encyclopedia: LLMs excel at reasoning, analysis, generation, and rewriting — not storing correct facts.
Understand that "same question, different answers" is normal: for more consistent output, add explicit format requirements or set Temperature to 0 (API users).
Asking again can improve quality: if you are unsatisfied, try rephrasing the question or adding a role definition.
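One way to act on the "different answers" point is to ask the same question several times and see how often the answers agree. The sketch below is an illustration only: `ask` stands for whatever function you use to call a model, and `fake_ask_factory` is a hypothetical stand-in with canned replies so the example runs without any API.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def most_common_answer(ask: Callable[[str], str], prompt: str, attempts: int = 5) -> Tuple[str, float]:
    """Ask the same prompt several times; return the top answer and its share of attempts."""
    answers = [ask(prompt).strip().lower() for _ in range(attempts)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / attempts


def fake_ask_factory(canned_replies: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model call: returns canned replies in order."""
    replies = iter(canned_replies)
    return lambda prompt: next(replies)


ask = fake_ask_factory(["Paris", "paris ", "Lyon", "Paris", "Paris"])
answer, agreement = most_common_answer(ask, "What is the capital of France?")
print(answer, agreement)
```

Low agreement is a useful warning sign. High agreement is not proof of correctness, since a model can be consistently wrong, so important information still needs checking through other channels.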

Real-World Example

A thought experiment for understanding the "LLM predicts the next token" mechanism: imagine playing a fill-in-the-blank game. Prompt: "The weather is beautiful today, so I've decided to go ___". The human brain fills in: "to the park," "for a walk," "hiking" — reasonable answers that fit the context. This is essentially what an LLM does: calculate the probability distribution across possible completions and output high-probability options.

This mechanism is powerful, but it also explains the limitations: if you ask "who won the 2026 Nobel Prize in Physics" and this isn't in Claude's training data, it may not say "I don't know." Instead it can generate a plausible-sounding but completely wrong name. This is the fundamental source of hallucination: the model is not lying, it is predicting the most likely answer, and that answer happens to be wrong.
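The point that a prediction mechanism always produces something can be shown with a small sketch. The candidate names and scores are invented, and the nearly flat scores stand for a situation where the "model" has no real evidence for any option.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Placeholder names and made-up, nearly flat scores: no option has real support.
candidates = ["Name A", "Name B", "Name C", "Name D"]
logits = [0.30, 0.25, 0.20, 0.25]

probs = softmax(logits)
best = max(range(len(candidates)), key=lambda i: probs[i])

# Greedy decoding still returns a name, even though its probability is only about a quarter.
print(f"Output: {candidates[best]} (probability {probs[best]:.2f})")
```

The probabilities always sum to 1, so there is always a "most likely" option to output. Unless the model has learned to express uncertainty, the answer looks equally confident whether it was backed by strong evidence or by almost none.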

Common Misconceptions

× Misconception 1: LLMs "think" or "understand" in the same way humans do. At their core, LLMs perform probabilistic token prediction, which is not thinking or understanding in the human sense. When Claude gives you a logically coherent answer, it is producing a statistically probable token sequence, not reasoning through the problem as a human would.

× Misconception 2: LLMs are all of AI — all AI is language models. AI also includes: computer vision AI (identifying objects in images), image generation AI (Midjourney, DALL-E), reinforcement learning AI (chess AI, game AI), and more. Equating "AI" with "LLM/ChatGPT-type tools" is one of the most common misconceptions today.

The Missing Link

Direct Impact

LLMs are the most general-purpose, flexible form of AI technology today, but they have fundamental limitations.

Advantages: extremely versatile across nearly all language tasks; a conversational interface that keeps barriers low; capable of reasoning and analysis; continuously evolving.

Fundamental limitations: outputs are probabilistic, so complete correctness cannot be guaranteed; knowledge has a cutoff; hallucination is an architectural characteristic; resource consumption is high.

Most important understanding: LLMs are language generation tools, not "truth machines."
