I Almost Built the Wrong Thing
The first version of Brief in my head was a product review bot. Before shipping, you would run work through simulated versions of your product leader, design lead, and engineering lead. An AI council would dunk on your feature so you could fix it before facing the real humans. It felt clever. It was wrong.
It was wrong because it copied the surface of the meeting instead of the substance. Product review was never about the performance. It was about transmitting context and judgment. I almost automated the meeting when I should have been automating the context.
What I learned from real product reviews
The best product leaders stay deeply involved in decisions long after most CEOs or execs would have delegated. Gates was infamous for "think weeks" and brutal sessions that left teams sweating but clear on what mattered. Bezos forced clarity through six-page memos and PR/FAQ drafts. At Yammer, David Sacks treated product review like a dojo. My first review with him was about a single button. He zoomed out to the history of computing and the iPad's place in it, then zoomed back in to that button. The point was not the button. It was the stack of context behind it.
Those sessions were not about catching mistakes. They were about aligning on how to think. When you left, you had a mental model that guided hundreds of micro-decisions without Sacks or Bezos in the room.
Why "meeting bots" miss the point
A bot that says "this is off-brand" is a parlor trick unless it carries the why: the audience, the positioning, the constraints, the quality bar. A simulation of an exec's tone without their context is cosplay. The meeting is a proxy for context transmission. Automating the proxy and ignoring the payload recreates the bottleneck, just faster and shallower.
The new bottleneck
Agents collapsed build time. A solo engineer can ship something meaningful in hours. The product review cycle did not shrink with it. You cannot schedule a two-hour review for every three-hour build. Hiring more PMs or writing longer specs does not fix it. The bottleneck is getting the right context into the hands of the builder and the agent at the moment of decision.
If the context stays in heads and slides, you will get 10x output of the wrong thing. The fix is not more critique; it is turning context into infrastructure.
What context actually is
- Strategy: who we serve, what we promise, what we will not do.
- Positioning and tone: how we speak to different segments.
- Non-negotiables: security, privacy, compliance, performance budgets.
- Technical choices: stack, patterns, dependencies, logging, testing.
- Design principles: component usage, spacing, motion, accessibility.
- History: what we tried, what we killed, and why.
- Current priorities: activation vs expansion, speed vs polish, revenue vs retention.
In human-led teams, this context spreads through reviews, docs, hallway chats, and scars from past incidents. With agents, those channels do not reach the code window. That is why the agent "hallucinates" choices. It fills gaps with patterns, not intent.
Context as infrastructure
Turning context into infrastructure means:
- Capturing decisions as first-class objects, not buried in meeting notes.
- Linking those decisions to work so they travel with tasks.
- Making them machine-readable so agents consume them automatically.
- Keeping them current with lightweight rituals.
- Adding drift detection so violations surface fast.
- Attaching rationale so future you knows why a constraint exists.
The goal is to let builders move without waiting for a meeting while staying inside the rails of product intent.
A simple stack for context delivery
1) Decision register
- ICP and tone rules per segment.
- Technical standards: testing, data fetching, logging, error handling.
- Performance budgets and SLOs.
- Security and privacy guardrails.
- Rollout rules and feature flag expectations.
- Design system usage rules.
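A register entry can be as small as a typed record. Here is a minimal sketch in Python; the field names and the example decision are illustrative, not a fixed schema:

```python
from dataclasses import dataclass, field

# A minimal, machine-readable decision record (illustrative fields only).
@dataclass
class Decision:
    id: str                 # stable key, e.g. "eng-deps-001"
    category: str           # "engineering", "design", "tone", ...
    rule: str               # the constraint itself, stated plainly
    rationale: str          # the "why", so it is not re-litigated later
    status: str = "active"  # "active" or "retired"
    tags: list = field(default_factory=list)

no_new_deps = Decision(
    id="eng-deps-001",
    category="engineering",
    rule="No new runtime dependencies without explicit approval.",
    rationale="Each dependency adds supply-chain and upgrade risk.",
    tags=["dependencies", "security"],
)
```

Because each record carries its own rationale and status, retiring a rule is a one-field change rather than an archaeology project.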
2) Briefs that bind
- One page per initiative/task: user, goal, success, constraints, linked decisions, rationale.
- Short enough to read in a minute. Structured so agents can parse.
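A brief that binds is just a structured page. One hypothetical shape, with assumed field names and made-up constraint values, plus a renderer an agent prompt could include:

```python
# Hypothetical one-page brief, structured so both humans and agents can parse it.
brief = {
    "task": "Enterprise onboarding path",
    "user": "Enterprise buyer admin",
    "goal": "Complete workspace setup in one sitting",
    "success_metric": "Activation rate for enterprise signups",
    "constraints": ["reuse existing components", "no new dependencies"],
    "linked_decisions": ["eng-deps-001", "tone-ent-002"],  # ids from the register
}

def readable_summary(b: dict) -> str:
    """Render the brief as a short block for a human or an agent prompt."""
    lines = [f"Task: {b['task']}", f"User: {b['user']}", f"Goal: {b['goal']}"]
    lines += [f"Constraint: {c}" for c in b["constraints"]]
    return "\n".join(lines)
```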
3) Ingestion and distribution
- Pull from calls, docs, code, support to propose new decisions.
- Push current decisions and briefs into agent runs automatically.
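Distribution can start as a few lines: filter the register down to the decisions a task touches and prepend them to the agent's prompt. A sketch, with assumed field names and example data:

```python
# Select only active decisions whose tags overlap the task's tags,
# then prepend them to the agent prompt. Names are illustrative.
def scoped_decisions(decisions: list, task_tags: list) -> list:
    return [d for d in decisions
            if d["status"] == "active" and set(d["tags"]) & set(task_tags)]

def build_prompt(task: str, decisions: list) -> str:
    rules = "\n".join(f"- {d['rule']} (why: {d['rationale']})" for d in decisions)
    return f"Decisions in force:\n{rules}\n\nTask:\n{task}"

decisions = [
    {"id": "eng-deps-001", "status": "active", "tags": ["dependencies"],
     "rule": "No new runtime dependencies.", "rationale": "Supply-chain risk."},
    {"id": "tone-ent-002", "status": "retired", "tags": ["tone"],
     "rule": "Formal tone for enterprise.", "rationale": "Audience expectation."},
]
prompt = build_prompt("Add an onboarding email step.",
                      scoped_decisions(decisions, ["dependencies"]))
```

Scoping matters: the retired tone rule never reaches the prompt, so the agent is not steered by a constraint the team already abandoned.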
4) Feedback and drift
- When work deviates, flag it. Decide to update the decision or enforce it.
- When a fix repeats, promote it to a decision.
- Keep a log of why decisions changed.
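Drift detection can begin as a single check in CI. A toy example that flags an added dependency against the "no new dependencies" decision (the decision id and hook point are illustrative; a real system would read lockfiles or diffs):

```python
# Flag changes that add a dependency while a "no new dependencies"
# decision is in force. Illustrative only; real checks would run in CI.
def detect_drift(before_deps: set, after_deps: set, allow_new_deps: bool) -> list:
    added = after_deps - before_deps
    if added and not allow_new_deps:
        return [f"drift: new dependency '{d}' violates eng-deps-001"
                for d in sorted(added)]
    return []

alerts = detect_drift({"react", "zod"}, {"react", "zod", "left-pad"},
                      allow_new_deps=False)
```

Each alert then forces the fork in the road described above: update the decision, or enforce it.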
5) Lightweight rituals
- Daily decision hygiene: add, retire, highlight changes.
- Weekly direction checks: 3–5 calls for the week, recorded as decisions.
- Monthly strategy pulse: adjust durable direction, prune decisions.
This stack replaces the need for a person to be in every review. The thinking travels without the meeting.
Levels of context to encode
- Company: mission, ICPs, pricing philosophy, market positioning.
- Product: jobs to be done, activation priorities, retention levers, quality bars.
- Domain: compliance rules, privacy stance, data residency, risk tolerance.
- Design: tone by segment, component usage, motion guidelines, accessibility rules.
- Engineering: dependency policy, testing standards, performance budgets, observability defaults.
- History: past experiments, what was killed and why, known traps.
You do not need to encode everything at once. Start with the decisions that affect current work. Expand as you see repeat misses.
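One way to hold these levels is a nested register you grow on demand, encoding only the levels current work touches. A sketch with invented values:

```python
# Illustrative nested register, keyed by level. Values are examples, not policy.
context_levels = {
    "company": {"icp": "mid-market B2B", "positioning": "speed with guardrails"},
    "product": {"quality_bar": "no P0 bugs at launch"},
    "engineering": {"testing": "unit tests required", "deps": "no new runtime deps"},
}

def start_small(levels: dict, needed: list) -> dict:
    """Encode only the levels current work touches; expand as misses repeat."""
    return {k: v for k, v in levels.items() if k in needed}
```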
What a modern product review can look like
- Before work starts: brief and decisions get pulled in. Agent drafts within rails.
- As it runs: drift alerts if it tries to add a dependency or tone outside the rules.
- Review: humans check edge cases, copy nuance, and outcomes. They do not debate frameworks because those were set.
- After ship: outcomes feed back. If users get confused, tone rules update. If performance sags, budgets tighten. Decisions evolve.
The "review" becomes a fast loop of confirm and adjust, not a gate where context finally shows up.
A story of building without the meeting
A team needed a new onboarding path for enterprise buyers. The old way: write a spec, schedule reviews, wait for sign-off. Agents without context would have generated a generic flow and generic copy. Instead they:
- Recorded decisions: enterprise tone, mandatory audit logging, P95 latency target, reuse of existing components, no new dependencies, email confirmations with legal language.
- Wrote a brief with the user, goal, and success metric.
- Fed decisions and brief to the agent. First draft respected tone, logging, and performance patterns. Review focused on minor copy and edge cases.
- Post-ship, support noted confusion on one step. They updated the copy rule and added a decision about ordering fields for clarity. Future work inherited it.
No meeting theater. The context was live, structured, and available at generation and review.
Why this beats more documentation
Documentation is necessary. It is also brittle and slow. Context infrastructure differs because:
- It is structured: Decisions have fields and values, not paragraphs.
- It is live: Changes propagate to agents and humans immediately.
- It is scoped: Only the relevant decisions attach to a task.
- It is enforced: Drift alerts when work violates it.
- It is compact: You do not need to read a wiki to start.
Docs explain. Context infrastructure guides.
Anti-patterns to avoid
- Meeting replicas: AI that mimics critique without providing the underlying constraints.
- Giant prompt templates: walls of text pasted into every request that nobody maintains.
- Stale decisions: rules that do not update, so people ignore them.
- Hidden rationale: decisions with no "why," forcing re-litigation later.
- Process bloat: adding ceremonies instead of making context easy to access.
Metrics that show context is working
- Time from insight to updated decision.
- Decision adoption rate: tasks that link to current decisions.
- Drift events per week and time to resolution.
- Rework hours attributed to missing or stale context.
- Cycle time with and without agent context injection.
- Support tickets tied to tone, compliance, or reliability misses.
If these move in the right direction, you are replacing meetings with context that sticks.
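Two of these metrics reduce to simple ratios, which makes them cheap to track from day one. A sketch with invented inputs:

```python
# Two of the metrics above as plain arithmetic. Inputs are illustrative.
def adoption_rate(tasks_with_links: int, total_tasks: int) -> float:
    """Share of tasks linked to at least one current decision."""
    return tasks_with_links / total_tasks if total_tasks else 0.0

def mean_time_to_resolve(drift_hours: list) -> float:
    """Average hours from a drift alert to its resolution."""
    return sum(drift_hours) / len(drift_hours) if drift_hours else 0.0
```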
Signals you are getting it right
- Agents stop introducing random dependencies and tone mismatches.
- Rework drops because constraints are known upfront.
- PMs and leads spend less time repeating the same rationale.
- Drift alerts catch violations early.
- New hires ramp faster because the decision set gives them the mental model.
- Reviews focus on outcomes and edge cases, not relitigating standards.
Signals you are stuck in meeting land
- Every meaningful change waits for review because "that is where the context lives."
- Agents ship features that look polished but miss compliance, tone, or performance.
- The same debates repeat because the rationale is trapped in someone's head.
- Prompts grow longer because you are compensating for missing structure.
- Team calendars are full of reviews that mostly restate standards instead of improving the product.
Stakeholders without the theater
Execs still need confidence that quality, risk, and strategy are being upheld. Context infrastructure gives them:
- A live decision register they can inspect.
- Drift reports that show where standards were challenged and resolved.
- Briefs that summarize intent and constraints without decks.
- Metrics on rework and adherence.
They get visibility without becoming a bottleneck.
Risks to watch
- Overfitting decisions: locking rules too tightly and blocking healthy experimentation. Keep decisions small and revisable.
- Context sprawl: too many decisions with no pruning. Run hygiene weekly.
- False sense of security: assuming decisions are being applied when they are not wired into agent runs. Automate the injection.
- Ignoring humans: context does not replace taste. Use it to augment, not to abdicate judgment.
A rollout you can start this month
Week 1: Create a decision register. Ten items: tone, ICP, dependencies, performance, security, logging, testing, rollout rules, design system usage, analytics.
Week 2: Add briefs for current initiatives. Keep them to one page. Link decisions.
Week 3: Pipe decisions and briefs into agent runs automatically. Stop manual pasting.
Week 4: Add drift alerts. When an agent or PR violates a decision, flag it.
Week 5: Hold a short decision hygiene meeting daily. Add new decisions, retire stale ones.
Week 6: Review metrics: rework, drift events, prompt size, cycle time. Tune decisions and briefs.
None of this requires a new offsite. It requires treating context as a product surface.
What almost building the wrong thing taught me
The product review bot idea was appealing because it mirrored something familiar. It also would have kept the bottleneck in place. The value of those legendary reviews was not the sparring. It was the transfer of judgment. Judgment comes from context. Context can be captured, structured, and distributed. When you do that, you do not need a simulated panel. You need a reliable way to put the CEO's thinking, the design lead's taste, and the engineering guardrails into the system your agents and engineers use every day.
That is why Brief is context infrastructure, not a meeting bot.