Context Window

Token Window

Lukas Horvath, co-founder of Roelu Studio

What is a Context Window?

A context window is the maximum number of tokens an AI model can process in a single interaction, including both the input prompt and the generated response. A token is roughly three-quarters of a word in English. Modern models in 2026 range from 128,000 tokens (GPT-4o) through 200,000 (Claude Haiku) to 1 million or more (Claude Opus, Gemini). The context window determines how much background, source material, or conversation history the model can hold at once.
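As a rough illustration of the arithmetic, the sketch below turns the three-quarters-of-a-word rule of thumb into a token estimate and checks whether a prompt plus a reserved response budget fits in a given window. The function names and the `reserved_for_response` parameter are hypothetical, and counting whitespace-separated words is only an approximation; real tokenizers differ by model and language.

```python
import math

# Rule of thumb from the text: one token is about 0.75 English words.
# This is an approximation, not how any real tokenizer counts.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the whitespace-separated word count."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_window(prompt: str, context_window: int, reserved_for_response: int) -> bool:
    """Check that the prompt plus the response budget stays within the window.

    The window covers both input and output, so room for the response
    is reserved up front (hypothetical parameter for illustration).
    """
    return estimate_tokens(prompt) + reserved_for_response <= context_window


if __name__ == "__main__":
    doc = "word " * 300  # a 300-word stand-in document
    print(estimate_tokens(doc))
    print(fits_in_window(doc, context_window=500, reserved_for_response=100))
```

For anything that affects billing or hard limits, the provider's own tokenizer gives the accurate count; this estimate is only useful for quick sizing.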

Why it matters

Context window size sounds technical, but it changes what AI tools can actually do for your team. A 4,000-token model could not fit a full strategy doc. A 200,000-token model reads the whole doc, three competitor sites, and last quarter's analytics in one shot, then writes a briefing across all of it. The leap from short to long context is what made AI useful for serious knowledge work. When evaluating an AI vendor, ask what context window the underlying model uses and whether the product takes advantage of it. Many wrappers send only a tiny prompt and leave most of a large context window unused.

How it works

Every prompt and response is tokenized, meaning it is broken into chunks the model can process. The model has a fixed limit on the total tokens it can hold in its working memory. Send more than the limit and one of three things happens, depending on the system: the older tokens get dropped, the request fails, or the system summarizes the earlier content and re-sends it. Longer contexts require more compute per query, which is why pricing scales with the number of tokens processed. Some models also exhibit a 'lost in the middle' effect, where they pay more attention to the start and end of a long context than to the middle. Good prompt design accounts for that: put the most important information first.
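The "older tokens get dropped" behavior can be sketched as a simple history trimmer: walk the conversation from newest to oldest and keep messages until the token budget runs out. This is a minimal sketch, not any vendor's implementation; `rough_token_count` and `trim_history` are hypothetical names, and the token count reuses the three-quarters-of-a-word approximation rather than a real tokenizer.

```python
def rough_token_count(text: str) -> int:
    """Approximate tokens as words / 0.75, rounded up (integer math)."""
    words = len(text.split())
    return -(-words * 4 // 3)  # ceiling of words * 4 / 3


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit in max_tokens.

    Walks from newest to oldest and stops at the first message that
    would exceed the budget, so everything older is dropped.
    """
    kept: list[str] = []
    total = 0
    for message in reversed(messages):
        cost = rough_token_count(message)
        if total + cost > max_tokens:
            break
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    history = ["first message here", "second message here", "third message here"]
    # Each message is 3 words, about 4 tokens, so a budget of 8 keeps the last two.
    print(trim_history(history, max_tokens=8))
```

Dropping the oldest messages is only one of the strategies mentioned above; a system that summarizes and re-sends would replace the dropped messages with a shorter summary instead of discarding them.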