
The Build-vs-Validate Trap

Validation isn't a virtue. It's a tool, and you're likely using it at exactly the wrong moment.

25 May 2026 · 18 min read

1. The trap

When I started building my own ventures, two pieces of advice kept ricocheting around inside my head, both stated as gospel by people I respect.

Camp one: “Building is cheaper and faster than ever. The only signal that matters is a real sale to a real customer, and you only get that with a real offer in the market that you can actually deliver. No trap doors, no waitlists. Anything else is a self-righteous way to procrastinate.”

Camp two: “Don’t write a line of code without doing validation first. Every dollar and every minute you spend before validating your offer might as well be lit on fire. Stop whatever you’re building and go get in front of customers.”

The first camp assumes the cost of building is negligible. The second camp assumes your target customers are lining up to be interviewed. Neither is universally true, and the gap between them is where most of us actually live.

So as I’m building, I keep wrestling with the same three questions:

When I’m building, should I be validating instead?
When I’m validating, should I be building instead?
And honestly, is whatever I’m doing right now just stalling?

This is the trap. We treat validation as a virtue, where more equals better and less equals lazy. But validation isn’t a virtue. It’s a tool, with a cost. And like any tool, it has a right job and a lot of wrong jobs. Getting unstuck means knowing which job is in front of you, and whether the tool is sized for it.

2. Validation is behavior prediction

Most of the build-vs-validate noise dies down the moment you write down what validation actually is. Yet for all the hard-line positions people take on it, a precise, operational definition is surprisingly hard to come by. So I’ll propose one.

There are different things you might be validating, and they typically fall into three categories:

Market: do people want this?
Feasibility: can we actually build and deliver this?
Viability: can we build and deliver this profitably?

I’ll touch on the other two for reference, but this essay focuses on Market validation, because that’s the one I find myself pondering the most and it’s the one that usually drives the build-vs-validate paralysis.

Here’s the definition:

Validation is the measure of certainty with which a specific trigger will produce a specific action by a specific actor in a specific environment.

Or more compactly: actor + environment + trigger -> action.

That’s it. Each component has to be specific enough that you could observe and verify whether it happened or not. The vaguer any one of them is, the less you’ve actually validated.

Perfect validation looks something like this: “If John Smith of Miami, Florida, social security number 123-XX-XXXX, receives an email on Friday, June 5, 2026, at 8am offering coffee for five dollars, there is an 85% chance he will buy it.”

actor: John Smith
environment: Friday, June 5, 2026, 8am
trigger: email with the offer for a coffee
action: pay five dollars

That looks absurdly specific because it is, and that’s the point. The further any real-world validation gets from that level of specificity, the more interpretation you’re injecting between the data and the conclusion. “Customers want a better workflow tool” is not validation. It’s a hypothesis dressed up in confident clothing.
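One way to keep yourself honest about the four components is to write every claim down as a record and see which slots stay empty. The sketch below is illustrative only: the `ValidationClaim` class and its field names are hypothetical, and the check catches blank components, not vague ones.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ValidationClaim:
    """One falsifiable prediction: actor + environment + trigger -> action."""
    actor: Optional[str] = None
    environment: Optional[str] = None
    trigger: Optional[str] = None
    action: Optional[str] = None
    probability: Optional[float] = None  # estimated odds that the action happens

    def missing_components(self) -> list[str]:
        """Return the names of the components that were never pinned down."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# The absurdly specific coffee example from above.
coffee = ValidationClaim(
    actor="John Smith, Miami, Florida",
    environment="June 5, 2026, 8am",
    trigger="email offering a coffee for five dollars",
    action="pay five dollars",
    probability=0.85,
)

# "Customers want a better workflow tool" names an actor and a want, nothing else.
vague = ValidationClaim(actor="customers", action="want a better workflow tool")

print(coffee.missing_components())  # nothing missing
print(vague.missing_components())   # environment, trigger, and probability are blank
```

Even the filled slots in the second claim are too loose to observe, which is the part a blank-check cannot see. The record only makes the gaps harder to ignore.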

Once you accept this definition, a few things shake out.

First, the strongest predictor of any future action is past action under the same or improved incentives in a reset environment. “Reset” means the conditions that prompted the original action have come back: the hunger returned, the post-purchase glow faded, the supply that got used up needs replacing. Selling someone a car the day after they bought one is a low-probability event no matter how good your pitch is. Selling them an oil change six months later is a much higher-probability event because the environment has reset.

Second, stated preferences are not revealed preferences. People will tell you they want things they will not actually pay for. They are not lying. They are just bad at predicting their own future behavior, which is what validation is asking them to do.

Charlie Munger: “Show me the incentive and I’ll show you the behavior.”

The implication is that almost everything we call “customer feedback” (interviews, surveys, conversations, smile-and-nod meetings) is information about what someone is willing to say, not information about what they will do. Useful, but not the same thing.

3. Validation is a bet, and every bet is a math problem

If validation is behavior prediction, then the reason you’d ever bother validating is to make a bet. And every bet is a math problem.

The bet itself is what you’ll spend to find out whether the future action happens. The cost side looks something like this:

Full design and development of the thing you’re offering
Starting inventory or operational capacity
Your time, which is non-refundable
Risk of failure, including reputational risk if the offer flops publicly

The payout is harder to write down cleanly because it’s a distribution, not a number. There’s a range of upsides and downsides, each with some likelihood attached. If you wanted to be rigorous about it, you’d run a rough Monte Carlo in your head: imagine 100 versions of the future, ask how many of them produce each outcome, weight accordingly. Most of us don’t actually do that math, but we do it implicitly every time we say “if this works, it could be huge” or “the downside is just my time.”
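If you would rather not run the Monte Carlo in your head, a few lines of Python will do it. This is a minimal sketch under loud assumptions: the outcome table is entirely hypothetical (it belongs to neither of my bets), and it draws 100,000 imagined futures instead of 100 so the average settles down.

```python
import random

# Hypothetical outcome table: (probability, net payout in dollars).
# Illustrative numbers only; replace them with your own estimates.
OUTCOMES = [
    (0.70, -20_000),   # flop: you lose what you put in
    (0.25, 50_000),    # modest success
    (0.05, 500_000),   # "if this works, it could be huge"
]


def exact_expected_payout(outcomes):
    """Probability-weighted average payout."""
    return sum(p * payout for p, payout in outcomes)


def simulate_futures(outcomes, n_futures, seed=0):
    """Draw n_futures imagined futures and return the payout of each."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    probs = [p for p, _ in outcomes]
    payouts = [payout for _, payout in outcomes]
    return rng.choices(payouts, weights=probs, k=n_futures)


futures = simulate_futures(OUTCOMES, n_futures=100_000)
simulated_mean = sum(futures) / len(futures)
share_losing = sum(1 for x in futures if x < 0) / len(futures)

print(f"exact expected payout: {exact_expected_payout(OUTCOMES):,.0f}")
print(f"simulated mean payout: {simulated_mean:,.0f}")
print(f"share of futures that lose money: {share_losing:.0%}")
```

With this made-up table the expected payout is positive even though roughly seven futures in ten lose money. That is what “a distribution, not a number” means in practice: the average and the typical outcome can point in opposite directions.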

Validation has a cost too:

Partial design and development (a landing page, a prototype, manual fulfillment, a paid pilot)
Test inventory or simulated capacity
Your time, in smaller amounts but equally non-refundable
Outreach and the social capital it takes to get in front of prospects, or the cost of the incentive you offer in lieu of it (a lead magnet, a free pilot, a meeting fee)

The core formula:

The value of any validation activity equals the change in the expected payout of the bet, minus the cost of validating.

It’s not “did you validate.” It’s “did this validation activity move your expected payout enough to justify what it cost you to do it?”

A worked example helps. Two, actually. Both from what I’m building right now.

The first is a possible service for container ships: locally-hosted AI models that lift operating revenue, sold to a fleet I’d need to learn from the inside. The question is whether to build the offering or to first spend two weeks on a container ship learning the industry.

Cost to build the offering blind: $20K–$25K, and 4–6 months of focused work.
Cost to validate via the voyage: ~$2,500 all-in (passage, flights, hotel).
Potential payoff if real: $200K to $4M per year in service revenue.
Current certainty the demand is real: under 10%.

Apply the formula. The validation costs roughly 10% of the build’s cash outlay, before counting the 4–6 months of work, and there’s no substitute for actually seeing how these ships operate. The voyage would move my certainty by a lot. Even on the conservative end of the payoff range, the expected-payoff swing from a real certainty move dwarfs the $2,500 ticket. The validation buys far more than it costs. Building first would be reckless.
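Here is that arithmetic as a rough sketch. The simplifying assumptions are mine and they are generous to the voyage: it is treated as a perfect signal (afterwards I know whether demand is real and only build if it is), the payoff is one year of revenue at the low end of the range, the build cost is the midpoint of the cash estimate, and my time is not priced in. The function names are hypothetical.

```python
def expected_payout(p_real, payoff, build_cost):
    """Best available expected payout without validating:
    build blind only if that beats walking away (payout 0)."""
    build_ev = p_real * payoff - build_cost
    return max(build_ev, 0.0)


def value_of_validation(p_real, payoff, build_cost, validation_cost):
    """Change in expected payout from validating first, minus the cost of validating.

    Assumes a perfect signal, so treat the result as an upper bound;
    a noisier signal is worth less.
    """
    without = expected_payout(p_real, payoff, build_cost)
    # With a perfect signal, the build cost is only paid in worlds where demand is real.
    with_signal = p_real * max(payoff - build_cost, 0.0)
    return (with_signal - without) - validation_cost


blind_build_ev = 0.10 * 200_000 - 22_500  # expected payout of building without the voyage

ship = value_of_validation(
    p_real=0.10,           # "under 10%", taken at its ceiling
    payoff=200_000,        # low end of the range, one year only
    build_cost=22_500,     # midpoint of $20K-$25K, time not priced in
    validation_cost=2_500,
)

print(f"expected payout of building blind: ${blind_build_ev:,.0f}")
print(f"value of the voyage (upper bound): ${ship:,.0f}")
```

Under these assumptions, building blind has a negative expected payout of about $2,500, while the voyage is worth roughly $15,250 after paying for itself. The exact figure matters less than the sign and the size of the gap, and both survive fairly rough changes to the inputs.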

The second is an AI-assisted book-writing service for business owners expanding their personal brand.

Cost to build: 2–3 months of part-time work, ~$100/month in API costs.
Cost to validate by interview: 1–1.5 months minimum, and that assumes I know who the buyer is and can reach them, plus a landing page and offer description.
Cost to validate by reading reviews of similar offerings: a few hours. (This is what I actually did. I’d also bought a similar product myself, was disappointed, and am now designing for exactly the things it got wrong.)
Potential payoff: $50K–$100K/year.
Current certainty: ~60%.

Completely different shape. Building takes 2–3 months; interview-based validation takes 1–1.5 months and produces stated-preference signal at best, not behavioral. The cost gap between validating and building isn’t 1:10 like the voyage. It’s closer to 1:2. And building has compounding upside the voyage doesn’t. A working product makes every subsequent sales conversation easier than a landing page can. There’s not much certainty left to buy that wouldn’t require most of the build anyway.

If the validation activity didn’t meaningfully move your expected payout, you didn’t validate. You busied yourself. That’s worth being honest about, because busy-ness around validation feels a lot like progress.

This is also why “always be validating” is wrong, the same way “never validate” is wrong. Sometimes the bet is small enough and recoverable enough that the cheapest validation is just running the bet. Sometimes the bet is large enough that almost any imperfect validation pays for itself. The math tells you which one you’re in. The dogma doesn’t.

One thing I’m going to flag now and come back to shortly: the “change in expected payout” calculation has more going on than it looks. The new point estimate of the odds is only half of it. How sure you are of the new estimate matters just as much.

4. The validation hierarchy

If the value of validation is the change in your expected payout, then the kinds of validation that move that number matter most.

They aren’t equal. There’s a hierarchy, and it’s roughly ordered by how cleanly each kind of evidence predicts the specific future action you’re trying to forecast.

Tier one: they bought it from you, or from someone they view as interchangeable with you. A real purchase under conditions close to the conditions you’re betting on is the highest-fidelity prediction you can get. They saw the offer, they reached for their wallet, they completed the transaction. Actor, environment, trigger, and action are all observable. If they bought from a competitor with effectively the same offer, that’s almost as strong. They’ve already revealed the demand, just not to you.

Tier two: they tried to buy and were blocked. They effectively took the action you’re predicting, but something outside their control got in the way: the checkout broke, the product was out of stock, the offer was pulled too early, or you ran a “painted door” test with a fake purchase link and a post-click explanation that the product is not yet deliverable. This is high-signal because they revealed intent, but it comes with a caveat. They now know you made a promise you didn’t keep. If they believe it was unintentional or unlikely to repeat, the relationship survives the same way a friendship survives a missed dinner. If it looks like a pattern, you’ve spent down trust.

Tier three: they bought something adjacent under comparable conditions. Reviews of similar products, sales data for nearby offers, behavior in adjacent markets. Useful for setting priors, but you’re inferring across a gap, and the gap can be larger than it looks.

Tier four: they told you they would. Interviews, surveys, “I’d totally pay for that” in a hallway conversation. Heavily discounted relative to the higher tiers. This is information about what someone is willing to say in a low-stakes moment, not about what they will do in a real one.

One important carve-out at the top of the stack. There’s a class of bet where no proxy exists. The action you’re predicting is novel enough that there’s no past behavior, no equivalent product, no adjacent market to study. This is the situation The Innovator’s Dilemma pokes at: when you’re proposing something genuinely new, there is literally nothing to validate against. The validation ceiling drops to whatever your best guess is, and the only real test is shipping. People will defend a lot of bad bets by claiming they’re in this category. Some of them actually are. Most aren’t.

5. Two dimensions to watch

There are two distinct ways a validation effort can mislead you, and they map to two different pairs of words worth keeping straight.

First pair: precision and accuracy. Borrowed from marksmanship. Precision means tight grouping. Accuracy means hitting the target. They are not the same thing.

You can build a polished, well-engineered product that nobody wants. Precise, and inaccurate. People may even have asked for it.
You can build a janky, duct-taped MVP that proves real demand. Imprecise, and accurate.

Most of the build-vs-validate confusion happens because people validate one and assume they’ve also validated the other. Quibi raised $1.75 billion to build Hollywood-quality short-form video for paying subscribers. The build was validated. They never tested whether people would pay for what TikTok and YouTube were giving away for free.

Second pair: the odds, and your certainty about the odds.

This one is easy to gloss over and it costs you a lot when you do. Every validation result has two pieces of information in it. There’s the odds (the probability the future action happens), and there’s your confidence in that probability, which is a separate quantity entirely.

Consider:

“One customer bought it, but I have no idea how much of a fluke that was.” You have a positive data point, but your confidence interval is enormous. The odds might be 80%, or they might be 2%. You haven’t validated; you’ve gathered a data point.
“I ran a careful test with thirty prospects in my target segment and zero of them bought.” You have a negative result, but it’s a high-confidence negative. You’ve validated that the odds are low. That’s still validation, and it’s incredibly valuable, because it lets you kill the bet cleanly instead of continuing to spend on it.

The first scenario is what people usually celebrate as “early validation,” and it isn’t validation, it’s encouragement. The second scenario is what people usually mourn as “the validation failed,” and it is validation, in the strongest sense of the word, because it collapsed your uncertainty.

The math from the bet section actually depends on this. The change in expected payout isn’t just about your new point estimate of the odds. It’s about how much your uncertainty shrank. A shaky 70% estimate is not the same bet as a 70% estimate you hold with conviction. If your validation activity didn’t shrink the uncertainty interval, it didn’t really change the bet, no matter what the new point estimate is.
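To see how different those two scenarios are, you can put an interval around each result. The sketch below uses the exact binomial (Clopper-Pearson) interval, which is my choice of method and not the only reasonable one; it also assumes each prospect is an independent draw from the same segment. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing):
    """Find p in [0, 1] where the monotone function f crosses target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # the crossing lies above mid
        else:
            hi = mid  # the crossing lies at or below mid
    return (lo + hi) / 2


def conversion_interval(buyers, prospects, confidence=0.95):
    """Exact (Clopper-Pearson) interval for the true purchase probability."""
    tail = (1 - confidence) / 2
    if buyers == 0:
        lower = 0.0
    else:
        # P(X >= buyers | p) rises with p; the lower bound is where it equals tail.
        lower = _bisect(lambda p: 1 - binom_cdf(buyers - 1, prospects, p), tail, True)
    if buyers == prospects:
        upper = 1.0
    else:
        # P(X <= buyers | p) falls with p; the upper bound is where it equals tail.
        upper = _bisect(lambda p: binom_cdf(buyers, prospects, p), tail, False)
    return lower, upper


one_sale = conversion_interval(buyers=1, prospects=1)
thirty_none = conversion_interval(buyers=0, prospects=30)

print(f"1 of 1 bought:  {one_sale[0]:.1%} to {one_sale[1]:.1%}")
print(f"0 of 30 bought: {thirty_none[0]:.1%} to {thirty_none[1]:.1%}")
```

One sale from one prospect leaves the odds anywhere from about 2.5% to 100%, which is barely narrower than knowing nothing. Zero sales from thirty prospects caps the odds at roughly 12%. The negative result carries far more information than the positive one.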

So: precision vs accuracy is about what you’re aiming at. Odds vs certainty is about how well you can read the target. Both have to land for the bet to actually move.

6. Reversibility is a spectrum, not a binary

Once you’ve got the math, the next question is when the cost of the bet justifies skipping validation altogether and just running the bet itself. Jeff Bezos popularized a frame for this that’s useful as a starting point and incomplete as a finishing point.

Bezos’s distinction: Type 1 decisions are one-way doors. You walk through, you can’t come back. Type 2 decisions are two-way doors. You walk through, look around, walk back out if you don’t like it. Type 1 decisions deserve heavy deliberation, abundant safeguards, and the involvement of senior people. Type 2 decisions should be made fast, by whoever’s closest, with minimal ceremony.

The frame is useful. It’s also too coarse, because reversibility isn’t binary. It’s a spectrum, and the spectrum is relative to the resources of the person making the decision.

A $50,000 spend on the wrong cloud architecture is a two-way door for Amazon. They can rip it out, eat the loss, move on, and barely notice it on the quarterly report. The same $50,000 spend is a one-way door for a bootstrapped solopreneur with three months of runway. It can end the business. The action is identical. The reversibility is not.

This cuts the other way too. A solo lawyer who shifts her practice focus, then reverses two years later, takes a revenue hit and a small reputation dent. Recoverable, even if uncomfortable. The same pivot at an established firm with decades of brand equity attached to its specialty can echo for ten years. Small players have more reversibility room than the prevailing advice assumes, and over-validating a recoverable bet is its own form of stalling.

Here’s what I realized: I prefer the word “recoverable” over “reversible.” Recoverable forces you to ask the harder question: by whom, with what resources, on what timeline. A decision is recoverable when, if it goes wrong, you can absorb the consequence and keep playing. A decision is unrecoverable when you can’t.

The implication for validation is direct: your validation budget is set by your specific position, not by some universal rule about what kinds of decisions are “big” or “small.” A move that big companies treat as a small experiment may be a bet-the-company move for you, and deserves the validation budget of one.

This is where both camps break. The “always validate” crowd applies a uniform validation standard regardless of how much money and time the bet costs. The “just build” crowd applies a uniform build standard regardless of whether the consequence is recoverable.

7. A risk register you can actually use

Now the practical part. Once you’ve named the bet, sized it relative to your resources, and accepted that validation has its own cost, you need a way to decide which risks to attack first.

The version I keep coming back to is a simple risk register, scored on two axes.

List out the assumptions your bet depends on, grouped into the three categories from earlier:

Market (desirability): do people actually want this?
Feasibility: can we actually build and deliver it?
Viability: can we deliver it at a price that works for both sides?

For each assumption, score it on two things: how important is it (does the bet collapse without it?), and how certain are you that it holds (today, with what you actually know, not with what you hope)? Then drop them into a 2x2 matrix.

High importance, low certainty. Rigorous experiments live here. These are the load-bearing unknowns; if anything is worth validating, these are.
High importance, high certainty. These belong on the product roadmap. Continuing to validate here is the most common form of stalling.
Low importance, low certainty. Light touch: informal conversations, light reading, cheap experiments. Manage the uncertainty without spending much. Learn slowly while protecting capital.
Low importance, high certainty. Backlog. Possibly cut from v1 entirely. If it’s not load-bearing and you already know how it lands, it doesn’t need much from you.
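The four quadrants above can be sketched as a small scoring function. Everything specific here is an assumption for illustration: the 0-to-1 scales, the 0.5 cut-off between “low” and “high”, the `Assumption` class, and the three sample entries with their scores.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    text: str
    category: str      # "market", "feasibility", or "viability"
    importance: float  # 0..1: does the bet collapse without it?
    certainty: float   # 0..1: how sure are you today that it holds?


def quadrant(a, threshold=0.5):
    """Map an assumption to the treatment its 2x2 quadrant calls for."""
    important = a.importance >= threshold
    certain = a.certainty >= threshold
    if important and not certain:
        return "rigorous experiment"
    if important and certain:
        return "product roadmap"
    if not certain:
        return "light touch"
    return "backlog"


# Hypothetical register entries, scored by gut feel.
register = [
    Assumption("Buyers will pay at the planned price", "market", 0.9, 0.3),
    Assumption("We can deliver within the promised time", "feasibility", 0.8, 0.8),
    Assumption("Buyers care about a custom cover design", "market", 0.2, 0.4),
]

# Load-bearing unknowns first: high importance, then low certainty.
for a in sorted(register, key=lambda a: (-a.importance, a.certainty)):
    print(f"{quadrant(a):<20} {a.category:<12} {a.text}")
```

The scores are guesses, and that is fine. The point of the register is the ordering it forces, not the decimals.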

The register’s job is to direct your validation budget to the assumptions where validation actually moves the bet. Spending it on the low-importance quadrants while the high-importance/low-certainty one is still in shadow is one of the most expensive forms of busy-work available to founders.

8. So when do you build, and when do you validate?

At this point the framework hopefully writes the answer for itself.

You build when:

The cost of running the full bet is low relative to your resources, and
The consequence of the bet failing is recoverable for you specifically, and
The validation activities available to you wouldn’t meaningfully shrink your uncertainty (often because no good proxy exists, or because the only way to get real signal is to actually run the bet).

In this regime, the cheapest validation is the bet itself, and pretending otherwise is stalling.

You validate when:

The cost of running the full bet is high relative to your resources, or
The consequence of failure is unrecoverable for you, or
There exists a validation activity that would meaningfully shrink your uncertainty interval at a much lower cost than the full bet.
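Written as code, the two lists turn out to be exact mirror images: “build” needs all three conditions, “validate” needs any one of the opposites. A minimal sketch follows, with hypothetical field names, and with the two example bets classified according to my own reading of them.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Bet:
    cost_is_low_for_you: bool         # the full bet is cheap relative to your resources
    failure_is_recoverable: bool      # you can absorb a miss and keep playing
    cheap_strong_signal_exists: bool  # some validation shrinks uncertainty at far lower cost


def should_build(bet):
    """All three build conditions must hold."""
    return (
        bet.cost_is_low_for_you
        and bet.failure_is_recoverable
        and not bet.cheap_strong_signal_exists
    )


def should_validate(bet):
    """Any one validate condition is enough."""
    return (
        not bet.cost_is_low_for_you
        or not bet.failure_is_recoverable
        or bet.cheap_strong_signal_exists
    )


# The two rules never agree and never both abstain, across all eight combinations.
for flags in product([True, False], repeat=3):
    candidate = Bet(*flags)
    assert should_build(candidate) != should_validate(candidate)

# My reading of the two bets from section 3.
container_ship = Bet(cost_is_low_for_you=False, failure_is_recoverable=True,
                     cheap_strong_signal_exists=True)
book_service = Bet(cost_is_low_for_you=True, failure_is_recoverable=True,
                   cheap_strong_signal_exists=False)

print("container ship:", "validate" if should_validate(container_ship) else "build")
print("book service:  ", "validate" if should_validate(book_service) else "build")
```

The hard part is not the boolean logic. It is answering the three questions honestly for your own position, which is what the earlier sections are for.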

When you do validate, you validate against behavior, not opinion. You weight the result by how much it actually collapsed your uncertainty, not by whether it gave you a number you liked.

Whatever the result, validation resolves into one of three calls: pivot, persevere, or terminate.

Pivot when the validation told you the target was wrong but the underlying need is real.
Persevere when the validation moved the odds in your favor and shrank the uncertainty enough to keep playing.
Terminate when the validation produced a high-confidence negative. This is the hardest call and the most underrated. A clean kill is one of the most valuable outputs validation can produce, and most founders treat it as failure rather than success.

The same logic extends to feasibility and viability risks, even though I’ve focused on market risk throughout. The math is the same. Only the experiments change.

9. Back to the trap

So back to where this started. The three questions I kept circling (when to build instead of validating, when to validate instead of building, when whatever I’m doing is just stalling) are unanswerable in the abstract. They become answerable the moment you can name the bet, write down the actor-environment-trigger-action chain, estimate the cost of the bet relative to your own runway, and ask honestly whether the validation activity in front of you would shrink the uncertainty more than it would shrink the bank account.

The trap isn’t choosing wrong between building and validating. The trap is treating either one as a default virtue instead of a tool to use when the math says so. Both camps that opened this essay are right inside their own assumptions and wrong outside them. The work is figuring out which assumptions apply to your bet, today.

Two real bets, one week, opposite calls.

Container ship voyage: validate. The cost-to-validate is roughly 10% of the cost-to-build, the certainty gap is enormous, and the only way to close it is to actually see the industry from inside. Booking the passage.

Book business: build. The cost-to-validate runs about half the cost-to-build, the marginal certainty I’d gain isn’t worth the time, and a working product is itself the best validation tool I’ll have for the next round of conversations. Already in.

Same person, same week, opposite calls. That’s the framework doing its job.

I’d love to hear how you think about this, especially if you’ve felt the same trap. The framework is something I’m still pressure-testing, and I read every comment.