The context window is the maximum amount of information an AI model can consider at once in a conversation or task. It determines how much prior information the model can keep in view when generating coherent, contextually relevant responses.
Think of the context window as the AI's workspace. If you have a small desk, you can only have a few documents open at once; if it's large, you can work with many more. When the window fills up, the AI must discard the oldest information to make room for the new.
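That oldest-first eviction can be sketched in a few lines. The snippet below is a simplified illustration, not any particular model's implementation: it walks a message history from newest to oldest and keeps only what fits in a token budget, using a crude one-token-per-word stand-in for a real tokenizer.

```python
from collections import deque

def trim_to_window(messages, max_tokens, count_tokens):
    """Keep only the most recent messages that fit in the token budget.

    Oldest messages are evicted first, mirroring how a full context
    window forces the model to drop early conversation turns.
    """
    window = deque()
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break  # everything older than this point is discarded
        window.appendleft(msg)
        used += cost
    return list(window)

# Crude stand-in tokenizer: one token per whitespace-separated word.
naive_count = lambda text: len(text.split())

history = ["hello there", "how are you today", "fine thanks", "tell me a story"]
print(trim_to_window(history, max_tokens=8, count_tokens=naive_count))
# → ['fine thanks', 'tell me a story']
```

With a budget of 8 "tokens", the two earliest turns no longer fit and are dropped, which is exactly why long conversations with a small window gradually forget their beginnings.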
This capacity is measured in tokens, the basic pieces into which an AI model divides text: words, parts of words, spaces, and punctuation marks. Each token takes up space in the window. Depending on a model's design and purpose, some handle 4,000 tokens (about 3,000 words, or a short article), while others reach 128,000 tokens or more (about 96,000 words, or an entire book).
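The figures above imply a rough rule of thumb of about 0.75 English words per token (4,000 tokens ≈ 3,000 words). A minimal back-of-the-envelope estimator based on that ratio might look like this; real tokenizers are model-specific and vary with language and punctuation, so treat this as an approximation only:

```python
def estimate_tokens(text, tokens_per_word=4 / 3):
    """Rough token estimate using the ~0.75 words-per-token rule of
    thumb (4,000 tokens ~ 3,000 English words). Actual BPE tokenizers
    differ by model, language, and punctuation."""
    return round(len(text.split()) * tokens_per_word)

# 3,000 words comes out to roughly the 4,000-token budget mentioned above.
print(estimate_tokens("word " * 3000))  # → 4000
```

Such an estimate is handy for a quick sanity check ("will this report fit?") before handing text to a model, even though the true count can only come from the model's own tokenizer.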
In practice, this capacity difference determines what type of tasks you can perform. A large window means you can analyze complete reports, maintain long conversations without losing track, work with extensive code, or have the AI compare multiple documents simultaneously. A small window forces you to divide large tasks into smaller fragments, as the model loses sight of the initial information as processing advances.
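Dividing a large task for a small window is usually done by chunking the text, often with some overlap so each fragment retains a little of the previous one for continuity. A simple word-based sketch (chunk sizes and overlap values here are illustrative, not tied to any specific model):

```python
def chunk_words(text, chunk_size, overlap=0):
    """Split a long text into word chunks that each fit a small context
    window; 'overlap' repeats trailing words of one chunk at the start
    of the next so the model keeps some continuity between fragments."""
    words = text.split()
    step = chunk_size - overlap  # overlap must be smaller than chunk_size
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), step)]

print(chunk_words("a b c d e f g", chunk_size=3, overlap=1))
# → ['a b c', 'c d e', 'e f g', 'g']
```

Each chunk is then processed (summarized, analyzed, translated) separately and the partial results are stitched together, which is the standard workaround when a document exceeds the window.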