Context Windows Will Keep Growing (But That Won't Solve the Problem)

Context windows will keep growing, but increasing the volume of context won't resolve the core issues of vibe coding.

[Illustration: the difference between context and attention in AI agents]

The industry is racing to make context windows bigger. 100k tokens, 200k, now we're pushing toward millions. Every model release touts a larger window like it's the breakthrough we've been waiting for.

But bigger windows just create more noise.

The Information Problem

I teach a class on product management, and one concept always catches people off guard: PMs don't suffer from a lack of information. They're drowning in it. Customer feedback, analytics, competitive intel, team opinions, stakeholder requests. The volume of information available is overwhelming.

The problem isn't gathering more information. It's knowing what to pay attention to.

A PM who listens to every piece of feedback equally will build a Frankenstein product that satisfies no one. A good PM curates attention: they decide which signals matter and which are noise.

AI agents have the same problem.

Throwing a 300-page printer manual into an agent's context window doesn't help it fix a paper jam. It needs the 2 pages on troubleshooting jams, not the other 298 pages on network setup, toner replacement, and driver installation.

Access to information is different from loading all of it into working memory. Humans understand this instinctively. We keep reference materials available but don't try to hold everything in our heads at once. We pull in what's relevant when it's relevant.

Agents need the same capability.
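To make the analogy concrete, here's a minimal sketch of the difference between dumping the whole manual into the prompt and pulling in only the sections that matter for the task. The section names and the keyword-overlap scoring are invented for illustration; this isn't how Brief or any particular agent actually retrieves context.

```typescript
// A minimal sketch of "pull in what's relevant when it's relevant".
// Section names and the naive scoring below are illustrative only.

interface ManualSection {
  title: string;
  text: string;
}

// Naive relevance score: count task keywords that appear in a section.
function relevance(task: string, section: ManualSection): number {
  const keywords = task.toLowerCase().split(/\s+/);
  const body = (section.title + " " + section.text).toLowerCase();
  return keywords.filter((word) => body.includes(word)).length;
}

// Instead of concatenating all 300 pages, keep only the top few sections.
function selectContext(task: string, manual: ManualSection[], limit = 2): ManualSection[] {
  return [...manual]
    .sort((a, b) => relevance(task, b) - relevance(task, a))
    .slice(0, limit);
}

const manual: ManualSection[] = [
  { title: "Network setup", text: "Connect the printer to Wi-Fi and install drivers..." },
  { title: "Troubleshooting paper jams", text: "Open the rear tray and remove jammed paper..." },
  { title: "Toner replacement", text: "Replace the cartridge when print quality drops..." },
];

console.log(selectContext("fix a paper jam", manual).map((s) => s.title));
// The jam troubleshooting section ranks first; the other 298 pages stay on the shelf.
```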

When Attention Fails

One of our customers is building a new product from scratch with a small team. They're moving fast, iterating, figuring things out as they go. Because of this, they haven't established formal conventions around tooling yet. No documented decisions about testing frameworks (Vitest vs Jest), styling approaches (Tailwind vs CSS-in-JS), or architecture patterns.

Their coding agent had a different interpretation of "moving fast."

Every new feature came with a new testing framework. One feature used Jest. The next used Vitest. Then back to Jest. Then something else entirely. Same with component patterns: the agent would scaffold React components one day and suggest a completely different approach the next.

The agent wasn't broken. It had access to knowledge about all these frameworks. The problem was it had no way to know which choices had already been made. No sense of "we decided on Jest last week, stick with that."

Without directed attention, the agent treated every task as a blank slate. Maximum flexibility, zero consistency.

We had to build a concept we call "decisions" into Brief: explicit declarations like "We use Jest for testing" or "We build components in React" that constrain the agent's attention. Not because the other options are bad, but because consistency matters more than theoretical perfection.
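To make that concrete, here's a hypothetical sketch of what such decisions might look like as data an agent consults before scaffolding anything. The shape and field names are invented for illustration; this is not Brief's actual schema or API.

```typescript
// Hypothetical shape for team "decisions" an agent checks before acting.
// Field names are invented for illustration, not Brief's actual format.

interface Decision {
  topic: string;       // the area the decision constrains
  choice: string;      // what the team settled on
  rationale?: string;  // optional: why, so the agent doesn't relitigate it
}

const decisions: Decision[] = [
  { topic: "testing", choice: "Jest", rationale: "Decided last week; consistency over novelty." },
  { topic: "components", choice: "React function components" },
  { topic: "styling", choice: "Tailwind" },
];

// Before scaffolding a test, look up the existing decision
// instead of treating the task as a blank slate.
function choiceFor(topic: string): string | undefined {
  return decisions.find((d) => d.topic === topic)?.choice;
}

console.log(choiceFor("testing")); // -> "Jest"
```

The point isn't the data structure; it's that "we decided on Jest last week" becomes something the agent can consult rather than something it has to rediscover or ignore.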

Why This Happens

Transformer models don't inherently know what's important for your task and what's not. Everything in the context window gets processed: the model attends to all of it, looking for patterns and relationships, whether or not it bears on the change at hand.

This is powerful for tasks where everything is relevant. But for complex, ongoing work like building software, most of the available information is noise for any given task.

Change a button color, and the agent sees the entire frontend codebase. Every component, every pattern, every architectural decision. Without guidance about what matters for this specific change, the agent might decide this is the perfect time to refactor your component library.

Infinite context windows won't fix this. They'll make it worse. More information means more potential distractions, more opportunities to optimize for the wrong thing, more ways to miss the actual signal.

The Solution Isn't More Context

The industry is solving for the wrong constraint.

Bigger context windows are impressive technically. They're useful for certain tasks. But they don't address the fundamental problem: agents need to know what to pay attention to, not just what they have access to.

This requires a different primitive: attention.

Brief solves this through structured context management. Teams can define decisions, document conventions, establish boundaries around what the agent should consider for different types of tasks. The agent has access to the full codebase and all relevant documentation, but its attention is directed toward what actually matters.

When you tell Brief "we use Jest," that's not just metadata. It shapes how the agent approaches testing tasks. When you define your ideal customer profile (ICP), that's not background information. It directs architectural choices.

The agent still has access to knowledge about Vitest, Mocha, and every other testing framework. But its attention is constrained to the choice you've already made.
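One way to picture the difference, purely as a sketch: the model's general knowledge stays untouched, but the instructions it receives for a testing task are narrowed by the decision that already exists. The helper below is invented for illustration and is not Brief's implementation.

```typescript
// Illustrative only: narrowing an agent's instructions with a prior decision.
// buildInstructions is an invented helper, not part of any real agent API.

type TaskKind = "testing" | "styling" | "components";

const decided: Record<TaskKind, string> = {
  testing: "Use Jest. Do not introduce another test framework.",
  styling: "Use Tailwind utility classes.",
  components: "Build React function components.",
};

function buildInstructions(kind: TaskKind, request: string): string {
  // The model still knows about Vitest, Mocha, CSS-in-JS, and the rest;
  // the prompt simply directs its attention to the choice already made.
  return `${request}\n\nTeam decision: ${decided[kind]}`;
}

console.log(buildInstructions("testing", "Add tests for the login form."));
```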

Without this kind of directed attention, the alternative is keeping a Notion doc and spoon-feeding the agent prompts one by one, manually curating what goes into each request. That treats the agent like a very sophisticated find-and-replace tool instead of a collaborator.

That works, but it doesn't scale. It defeats the purpose of having an agent in the first place.

What Changes

The difference between an agent with infinite context and an agent with directed attention is the difference between a PM who reads every email and a PM who knows which emails matter.

Both have access to the same information. Only one ships coherent products.

Context windows will keep growing. Models will keep improving. But until we solve for attention, agents will continue to make technically correct decisions that miss the point entirely.
