# AI vocabulary

> Core terms for talking to and about AI. Models, tokens, prompting, behavior, RAG, agents, controls, safety, image generation.

Source: https://www.juanmnl.com/notes/ai-vocabulary.html

## Core Concepts

The foundations. Get these and the rest of the vocabulary clicks into place.

### LLM / large language model

A model trained on vast amounts of text that generates language by predicting the next token. Claude, GPT, and Gemini are LLMs.

_Say:_ "Which LLM is best for this task?"

### Token

The chunks a model reads and writes; in English, one token is roughly ¾ of a word. Pricing and limits are counted in tokens.

_Say:_ "How many tokens will this prompt use?"
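
A rough sketch of the ¾-of-a-word rule of thumb. `estimate_tokens` is a hypothetical helper, not a real tokenizer; actual counts depend on the model's tokenizer and the language of the text.

```python
import math

def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Estimate tokens from a word count (heuristic only, not a tokenizer)."""
    words = len(text.split())
    return math.ceil(words / words_per_token)

# 7 words at ~0.75 words per token rounds up to 10 tokens.
print(estimate_tokens("How many tokens will this prompt use?"))
```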

### Parameters / weights

The learned numbers that store what a model "knows." More isn't always better.

_Say:_ "How many parameters does this model have?"

### Neural network

Layers of interconnected "neurons" that transform inputs into outputs; the underlying structure of modern AI models.

_Say:_ "It's a neural network under the hood."

### Transformer

The architecture behind modern LLMs; its "attention" lets it weigh relationships across text.

_Say:_ "Most LLMs use a transformer architecture."

### Training vs inference

Training = learning from data (slow, expensive, done up front); inference = using the trained model to answer (a cost paid on each request).

_Say:_ "This is inference cost, not training."

### Pre-training / fine-tuning

Pre-training builds broad ability; fine-tuning adapts a model to a specific domain or style.

_Say:_ "Should we fine-tune or just prompt well?"

### Knowledge cutoff

The date past which the model has no built-in knowledge; it needs search or supplied context for newer facts.

_Say:_ "What's the knowledge cutoff? Search for current info."

## Prompting

How you talk to a model. Small wording changes produce big quality changes.

### Prompt

The text you send the model: instruction, question, and any context.

_Say:_ "Let me refine the prompt."

### System prompt

A behind-the-scenes instruction setting the model's role, rules, and persona for the whole chat.

_Say:_ "Set a system prompt that defines the role."

### Zero / few-shot

Zero-shot = no examples; few-shot = a few examples to show the pattern you want.

_Say:_ "Give it a few-shot example of the format."
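
One way a few-shot prompt might be assembled. The `Input:`/`Output:` labels and the `build_few_shot_prompt` helper are illustrative assumptions; any consistent format works.

```python
def build_few_shot_prompt(examples, query):
    """Format (input, output) example pairs, then the new input to complete."""
    lines = []
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")  # blank line between examples
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

examples = [("great movie", "positive"), ("waste of time", "negative")]
prompt = build_few_shot_prompt(examples, "loved it")
print(prompt)
```

Passing an empty `examples` list yields the zero-shot version of the same prompt.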

### Chain-of-thought

Asking the model to reason in steps before answering, which tends to improve results on hard or logic-heavy problems.

_Say:_ "Think step by step before answering."

### Role / persona

Telling the model who to be, which shifts its vocabulary, depth, and defaults.

_Say:_ "Act as an experienced product manager."

### Context window

How much text (prompt + history + output) the model can consider at once. Anything that doesn't fit is truncated or rejected, depending on the application.

_Say:_ "Is this within the context window?"
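
A minimal sketch of one common way to stay inside the window: keep only the most recent messages that fit a budget. Word counts stand in for token counts here, and `fit_to_window` is a hypothetical helper.

```python
def fit_to_window(messages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = ["first message here", "second one", "the latest question"]
print(fit_to_window(history, max_tokens=5))
```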

### Structured output

Asking for a specific machine-readable shape (JSON, table, XML) so output is parseable.

_Say:_ "Return the result as JSON with these fields."
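
Structured output is only useful if you check it. A small sketch of parsing and validating a JSON reply; the field names (`title`, `priority`) are made up for illustration.

```python
import json

# Hypothetical schema: field name -> expected Python type.
REQUIRED_FIELDS = {"title": str, "priority": int}

def parse_structured(raw: str) -> dict:
    """Parse a model reply as JSON and check the expected fields and types."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data

reply = '{"title": "Fix login bug", "priority": 2}'
print(parse_structured(reply))
```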

### Prompt template

A reusable prompt with variable slots filled in per request.

_Say:_ "Make this a reusable prompt template."
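
A sketch using Python's standard `string.Template`. The slot names (`doc_type`, `max_sentences`, `text`) are assumptions chosen for the example.

```python
from string import Template

SUMMARY_TEMPLATE = Template(
    "Summarize the following $doc_type in $max_sentences sentences:\n\n$text"
)

def render(template: Template, **slots) -> str:
    """Fill every slot; substitute() raises KeyError if one is missing."""
    return template.substitute(**slots)

prompt = render(
    SUMMARY_TEMPLATE,
    doc_type="meeting notes",
    max_sentences=3,
    text="We agreed to ship on Friday.",
)
print(prompt)
```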

## Capabilities & Behavior

What models can do, and the quirks to watch for.

### Hallucination

When a model states false information fluently and confidently. Always verify facts.

_Say:_ "Could this be a hallucination? Cite sources."

### Grounding

Tying answers to provided/retrieved evidence so they're factual, not invented.

_Say:_ "Ground the answer in the attached docs only."

### Reasoning / thinking

A model working through a problem internally before answering, which tends to help on complex tasks.

_Say:_ "Use extended reasoning for this hard problem."

### Multimodal

Handling more than text (images, audio, video) as input and/or output.

_Say:_ "It's multimodal; paste the screenshot in."

### Alignment

Shaping a model to behave according to human intentions and values.

_Say:_ "That refusal is the alignment training."

### Emergent ability

Capabilities that show up in large models but not small ones, without being explicitly programmed.

_Say:_ "That's an emergent capability of larger models."

### Non-determinism

The same prompt can give different outputs each time. Setting temperature to 0 makes output far more repeatable, though many systems still don't guarantee identical results.

_Say:_ "Set temperature to 0 for reproducible output."

### Latency / throughput

Latency is how long a response takes to arrive; throughput is how many tokens the model produces per second.

_Say:_ "Pick a faster model for low latency."

## Context, RAG & Memory

How models get information beyond their training: the basis of most real AI apps.

### RAG / retrieval-augmented generation

Fetching relevant documents and feeding them to the model so answers use your data.

_Say:_ "Use RAG over our docs to answer support questions."
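
The retrieve-then-prompt shape, sketched with the standard library only. Word overlap is a crude stand-in for real retrieval (which would normally use embeddings), and `retrieve` and `build_rag_prompt` are hypothetical helpers.

```python
def retrieve(question, documents, k=2):
    """Rank documents by words shared with the question (stand-in retriever)."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(question, documents, k=2):
    """Put the top-k retrieved documents into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

docs = [
    "Refunds are issued within 14 days.",
    "Our office is in Lisbon.",
    "Refunds require a receipt.",
]
print(build_rag_prompt("how are refunds issued", docs))
```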

### Embedding

A list of numbers representing text's meaning, so similar ideas sit close together.

_Say:_ "Embed the documents for semantic search."

### Vector database

A store optimized to find the embeddings closest to a query (Pinecone, pgvector, Weaviate).

_Say:_ "Store the embeddings in a vector database."

### Semantic search

Finding results by meaning rather than exact words: "car" matches "automobile."

_Say:_ "Use semantic search, not keyword matching."
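
A toy illustration of why "car" can match "automobile": rank items by cosine similarity between their embeddings. The three-number vectors are hand-made assumptions; a real system would get much longer vectors from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy embeddings, for illustration only.
TOY_EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.9],
}

def search(query, k=2):
    """Return the k stored words closest in meaning to the query word."""
    query_vec = TOY_EMBEDDINGS[query]
    scored = [
        (cosine_similarity(query_vec, vec), word)
        for word, vec in TOY_EMBEDDINGS.items()
        if word != query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]

print(search("car"))
```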

### Chunking

Splitting long documents into smaller pieces that fit the context and retrieve precisely.

_Say:_ "Chunk the docs into ~500-token passages."
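
A minimal chunker with overlap, so a sentence cut at a boundary still appears whole in one chunk. It counts words as a stand-in for tokens; the sizes are assumptions to tune per project.

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into word-based chunks that overlap by `overlap` words."""
    if chunk_size <= 0 or overlap < 0 or overlap >= chunk_size:
        raise ValueError("need chunk_size > overlap >= 0")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = " ".join(str(n) for n in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
```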

### Memory

Storing facts/preferences so the AI recalls them in future conversations.

_Say:_ "Remember my preferences across sessions."

### Context engineering

Deliberately assembling the right instructions, examples, and data into the prompt.

_Say:_ "This is a context-engineering problem."

### Reranking

A second pass that re-scores retrieved results so the most relevant reach the model.

_Say:_ "Add a reranker to improve retrieval quality."

## Agents & Tools

When a model doesn't just talk but acts: calling tools and taking steps toward a goal.

### Agent

An AI that loops (plan, take actions, check results) to complete a multi-step goal autonomously.

_Say:_ "Have an agent handle the whole workflow."

### Tool use / function calling

Letting a model call external functions/APIs (search, code, databases) to do real work.

_Say:_ "Give it tools to search and send email."
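
The application side of tool use, sketched: the model emits a request naming a tool and its arguments, and your code runs it. The JSON shape and the `get_weather` tool are assumptions for illustration; each provider defines its own format.

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool that returns canned data instead of calling an API."""
    return f"Sunny in {city}"

# Registry of tools the model is allowed to call.
TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call_json: str) -> str:
    """Look up the requested tool and call it with the supplied arguments."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

model_output = '{"name": "get_weather", "arguments": {"city": "Quito"}}'
print(run_tool_call(model_output))
```

The tool's result would normally be sent back to the model so it can write the final answer.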

### MCP / model context protocol

An open standard for connecting AI to tools and data sources (Slack, GitHub, files) in a uniform way.

_Say:_ "Connect it via an MCP server."

### Orchestration

Coordinating multiple steps, tools, or sub-agents into a reliable workflow.

_Say:_ "Orchestrate the steps in a defined order."

### Subagent

A separate agent spun up to handle a focused subtask (research, verification) in parallel.

_Say:_ "Use a subagent to verify the results."

### Autonomy / human-in-the-loop

How much an agent does alone vs pausing for human approval at key steps.

_Say:_ "Keep a human-in-the-loop before it sends anything."
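
A sketch of an approval gate. `run_with_approval` is a hypothetical helper, and the lambdas stand in for a real person clicking approve or reject.

```python
def run_with_approval(action, description, approve):
    """Run `action` only when the `approve` callback returns True."""
    if not approve(description):
        return "skipped: not approved"
    return action()

def send_email():
    """Stand-in for a side effect that should not happen unreviewed."""
    return "email sent"

print(run_with_approval(send_email, "Send email to customer", approve=lambda d: True))
print(run_with_approval(send_email, "Send email to customer", approve=lambda d: False))
```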

### Workflow vs agent

A workflow follows predefined steps; an agent decides its own steps dynamically.

_Say:_ "A fixed workflow is safer here than a free agent."

### Guardrails

Constraints (allowed tools, scopes, checks) that keep an agent safe and on-task.

_Say:_ "Add guardrails so it can't delete files."
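
One of the simplest guardrails is an allowlist checked before any tool runs. The tool names here are made up for illustration.

```python
# Only these tools may run; anything else (e.g. deleting files) is refused.
ALLOWED_TOOLS = {"read_file", "search"}

def check_tool_call(tool_name: str) -> None:
    """Raise PermissionError unless the tool is on the allowlist."""
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not allowed: {tool_name}")

check_tool_call("search")  # passes silently
try:
    check_tool_call("delete_file")
except PermissionError as error:
    print(error)
```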

## Generation Controls

The knobs that tune how a model generates. Naming these gives you fine control.

### Temperature

Randomness. Low = focused/repeatable; high = varied/creative.

_Say:_ "Lower the temperature for consistent answers."
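
Under the hood, temperature typically divides the model's raw scores (logits) before they become probabilities. A sketch with made-up logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the peak."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 (0 is usually treated as 'pick the top token')")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

logits = [2.0, 1.0, 0.1]  # illustrative scores for three candidate tokens
low = softmax_with_temperature(logits, 0.5)
high = softmax_with_temperature(logits, 2.0)
print([round(p, 2) for p in low])
print([round(p, 2) for p in high])
```

At low temperature the top token takes most of the probability; at high temperature the distribution flattens and less likely tokens get picked more often.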

### top-p / nucleus sampling

Limits choices to the smallest set of most probable tokens whose probabilities sum to at least p; another diversity control.

_Say:_ "Tune top-p instead of temperature."
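
A sketch of the filtering step in nucleus sampling: sort tokens by probability and keep them until the running total reaches p. The probabilities are made up; a real sampler would then renormalize and draw from the kept set.

```python
def top_p_filter(probs, p):
    """Return indices of the smallest top-ranked set whose total probability >= p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= p:
            break
    return kept

# 0.5 + 0.3 is below 0.9, so the third token is needed to reach the threshold.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], p=0.9))
```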

### max tokens

A ceiling on how long the response can be.

_Say:_ "Cap the output at 300 tokens."

### Stop sequence

Text that, when generated, halts the output; useful for clean formatting.

_Say:_ "Add a stop sequence so it ends cleanly."
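
Model APIs usually apply stop sequences during generation. The effect can be sketched after the fact by cutting text at the earliest stop sequence; `apply_stop_sequences` is a hypothetical helper.

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut text just before the earliest stop sequence it contains."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty strings, which would match at position 0
        position = text.find(stop)
        if position != -1:
            cut = min(cut, position)
    return text[:cut]

raw = "Answer: 42\n###\nUnwanted extra text"
print(repr(apply_stop_sequences(raw, ["###"])))
```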

### Streaming

Sending the response token-by-token as it's produced, so the user sees it type out.

_Say:_ "Stream the response for a snappier feel."

### Seed

A fixed starting value that makes random generation repeatable, where the platform supports it.

_Say:_ "Set a seed so we can reproduce this."
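
The idea, shown with Python's own random number generator rather than a model: the same seed gives the same "random" picks. The vocabulary and weights are made up.

```python
import random

def sample_tokens(vocabulary, weights, count, seed):
    """Draw `count` weighted random tokens using a seeded generator."""
    rng = random.Random(seed)  # private generator, so the global state is untouched
    return rng.choices(vocabulary, weights=weights, k=count)

vocabulary = ["red", "green", "blue"]
weights = [0.5, 0.3, 0.2]

first = sample_tokens(vocabulary, weights, count=5, seed=42)
second = sample_tokens(vocabulary, weights, count=5, seed=42)
print(first == second)  # same seed, same sequence
```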

## Quality & Safety

Measuring whether AI works, and keeping it from misbehaving.

### Evals / benchmarks

Tests that measure model quality on a task. Build your own to know if changes help.

_Say:_ "Write evals to measure if this prompt is better."
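
A tiny eval harness, sketched: run each test case through the model and report the share of exact matches. `fake_model` is a canned stand-in for a real model call, and exact match is only one of many possible scoring rules.

```python
def exact_match_accuracy(predict, cases):
    """Fraction of cases where the prediction equals the expected answer."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(
        1
        for prompt, expected in cases
        if predict(prompt).strip().lower() == expected.strip().lower()
    )
    return hits / len(cases)

# Canned answers standing in for a model; one of them is deliberately wrong.
CANNED = {"2+2": "4", "capital of France": "Paris", "3*3": "6", "opposite of hot": "cold"}

def fake_model(prompt):
    return CANNED.get(prompt, "")

cases = [("2+2", "4"), ("capital of France", "paris"), ("3*3", "9"), ("opposite of hot", "cold")]
print(exact_match_accuracy(fake_model, cases))
```

Run the same cases before and after a prompt change to see whether the score moves.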

### Bias

Systematic skew inherited from training data that can produce unfair outputs.

_Say:_ "Check the outputs for bias across groups."

### Prompt injection

Hidden instructions in input data that trick a model into ignoring its real task.

_Say:_ "Harden it against prompt injection."

### Jailbreak

Crafted prompts that try to bypass a model's safety guidelines.

_Say:_ "Test whether it resists jailbreak attempts."

### Red-teaming

Deliberately probing a system for failures and harms before users find them.

_Say:_ "Red-team the assistant before launch."

### Guardrails / moderation

Checks that block unsafe inputs/outputs around the model.

_Say:_ "Add a moderation guardrail on outputs."

### Human feedback / RLHF

Training a model using human ratings of its responses to align behavior.

_Say:_ "Those preferences come from RLHF."

### Confidence / calibration

Whether a model's stated certainty matches how often it's actually right.

_Say:_ "Ask it to rate its confidence."
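
A crude calibration check, assuming you have logged each answer's stated confidence and whether it was right: compare average confidence to actual accuracy. The records are made up, and fuller methods bucket answers by confidence level.

```python
def calibration_gap(records):
    """Mean stated confidence minus accuracy; positive means overconfident."""
    if not records:
        raise ValueError("need at least one record")
    mean_confidence = sum(confidence for confidence, _ in records) / len(records)
    accuracy = sum(1 for _, correct in records if correct) / len(records)
    return mean_confidence - accuracy

# (stated confidence, was the answer correct?)
records = [(0.9, True), (0.9, False), (0.9, True), (0.9, True)]
print(round(calibration_gap(records), 2))
```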

## Image & Media Generation

The vocabulary for generating images, audio, and video.

### Diffusion model

Generates images by starting from noise and progressively refining (Stable Diffusion, Midjourney, DALL·E).

_Say:_ "Generate it with a diffusion model."

### Text-to-image

Creating an image from a text description (the prompt).

_Say:_ "Text-to-image: a minimalist fox logo."

### Prompt weighting

Emphasizing or de-emphasizing parts of an image prompt with weights.

_Say:_ "Weight the prompt toward 'minimalist'."

### Negative prompt

Telling the model what to avoid in the image.

_Say:_ "Add a negative prompt to remove text."

### Seed (image)

Fixes the starting randomness so you can reproduce or vary an image consistently.

_Say:_ "Keep the seed and just change the color."

### Inpainting / outpainting

Regenerating a masked region (inpaint) or extending the canvas (outpaint).

_Say:_ "Inpaint to replace just the background."

### Upscaling

Increasing an image's resolution while adding plausible detail.

_Say:_ "Upscale it to 4K."

### ControlNet / conditioning

Steering generation with an extra input (a sketch, pose, or depth map) for precise control.

_Say:_ "Condition it on this pose sketch."