The build-vs-buy decision nobody wants to make
Build your own AI system, or buy a vendor solution? Most teams agonize over this for months while doing neither. Here is the framework that cuts through it.
Build or buy. The question comes up in every AI engagement we do. And every time, the team has been discussing it for weeks — sometimes months — without making a decision. They have a spreadsheet with pros and cons. They have had three meetings about it. They have a Slack channel called #ai-vendor-eval with 400 messages and no conclusion.
Meanwhile, they have built nothing and bought nothing. The opportunity cost of indecision is the cost nobody puts on the spreadsheet.
The one-sentence framework
Build when the AI is your product. Buy when the AI is a feature in your product.
That is the framework. Everything else is detail. But the detail matters, so let’s walk through it.
If your company’s competitive advantage comes from the AI itself — if the model’s performance is what makes customers choose you over the alternative — you should build. You need to control the training data, the model architecture, the evaluation criteria, the deployment pipeline. Outsourcing your core differentiator to a vendor is outsourcing your moat.
If your company’s competitive advantage comes from something else — your distribution, your brand, your data, your relationships — and the AI is a capability that makes your product better but is not the product itself, you should buy. You do not need to be world-class at AI infrastructure to add a summarization feature to your app. You need to be world-class at the thing that actually makes you money.
Most teams are in the second category and think they are in the first. This is the main source of bad build decisions.
The hidden costs of building
Building looks cheap on the whiteboard. You have engineers. You have data. The models are open-source. How hard can it be?
Here is what the whiteboard does not show.
Maintenance. A model in production is not a feature you ship and forget. It is a system that degrades. Data distributions shift. User behavior changes. The model that worked in January gives subtly worse results by June. You need monitoring, alerting, and a retraining pipeline. This is not a one-time cost — it is a permanent line item.
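The monitoring half of that permanent line item can be surprisingly small to start. Here is a minimal sketch of a drift check: compare live feature statistics against a training-time reference window and flag when the shift is large. All names and the threshold are illustrative assumptions, not anything prescribed above.

```python
# Minimal drift check: compare live feature statistics against a
# training-time reference window. Names and threshold are illustrative.
import statistics

def drift_score(reference: list[float], live: list[float]) -> float:
    """Absolute shift in the mean, in units of the reference std dev."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference) or 1.0  # guard against zero spread
    return abs(statistics.mean(live) - ref_mean) / ref_std

def needs_retraining(reference: list[float], live: list[float],
                     threshold: float = 0.5) -> bool:
    # In production this would run on a schedule and page someone;
    # here it just returns a boolean.
    return drift_score(reference, live) > threshold
```

A real pipeline would track many features and use a proper distribution test, but even this crude version catches the "worked in January, worse by June" failure mode before your users do.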
On-call. When the model starts producing bad output at 2am — and it will — someone has to debug it. AI failures are not like software failures. There is no stack trace. The model is not “broken” — it is confidently wrong. Debugging requires someone who understands the model, the data, the evaluation criteria, and the production environment. That person is expensive, hard to hire, and miserable if they are on call alone.
Model upgrades. The foundation model you built on today will be obsolete in 18 months. When the next generation ships — faster, cheaper, more capable — you need to evaluate it, migrate to it, re-run your evals, update your prompts, and regression-test everything. This is a project every time it happens, and it happens constantly.
Eval infrastructure. You need to know if your system is working. That means building an evaluation framework — test sets, metrics, automated runs, dashboards. The eval infrastructure is often as much work as the model itself. Teams that skip it do not know when their system breaks. Teams that build it spend significant engineering time maintaining it.
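To make "test sets, metrics, automated runs" concrete, here is the smallest possible version of that framework: score a system against fixed expected outputs and gate deploys on regression. The names (`run_eval`, `gate`) and the exact-match metric are assumptions for illustration; real evals use richer metrics.

```python
# A deliberately tiny eval harness: run a system over a fixed test set
# and score it against expected answers. `system` stands in for whatever
# model pipeline you are evaluating; all names are illustrative.
from typing import Callable

def run_eval(system: Callable[[str], str],
             test_set: list[tuple[str, str]]) -> float:
    """Return exact-match accuracy over (input, expected) pairs."""
    hits = sum(1 for q, expected in test_set if system(q) == expected)
    return hits / len(test_set)

def gate(accuracy: float, baseline: float) -> bool:
    """Block a deploy if the new system regresses below the baseline."""
    return accuracy >= baseline
```

The point is not the twelve lines of code; it is the test set behind them. Curating inputs and expected outputs with your domain experts, and keeping them current, is where the real maintenance cost lives.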
Opportunity cost. Every engineer working on AI infrastructure is not working on your product. If AI is not your product, this trade-off is probably wrong.
The hidden costs of buying
Buying looks expensive on the contract. But the hidden costs are not in the contract — they are in the constraints.
Vendor lock-in. Once you integrate a vendor’s API, switching costs are real. Your prompts are tuned to their model. Your data pipeline feeds their format. Your team’s expertise is in their platform. Switching means rebuilding, re-evaluating, and re-deploying. Most teams never switch, even when a better option appears, because the switching cost is too high.
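One common mitigation, not a cure, is a thin internal interface so your call sites depend on your own types rather than a specific vendor SDK. A sketch, with all names hypothetical:

```python
# A thin internal interface over chat-model vendors. Call sites depend
# on this protocol, not on any one SDK, so switching vendors means
# writing one new adapter rather than touching every feature.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class FakeVendorA:
    """Stand-in for a real client; a real adapter would wrap the vendor
    SDK and normalize its request/response shapes."""
    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"

def summarize(model: ChatModel, text: str) -> str:
    # Business logic sees only the internal interface.
    return model.complete(f"Summarize: {text}")
```

This does not solve the harder half of lock-in, which is that your prompts and evals are tuned to one model's behavior. But it keeps the switching cost to "re-tune and re-evaluate" instead of "rewrite every integration."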
Data residency. Your data goes to the vendor. Where does it go? What jurisdiction? Who can access it? Is it used for training? These questions matter — especially in regulated industries. The answers are in the terms of service, which change. You are making a data governance decision every time you send a request to a vendor API.
Customization limits. The vendor’s model works for the general case. Your use case is not the general case. You need it to handle your domain’s terminology, your customers’ phrasing, your company’s specific edge cases. The vendor gives you a prompt and a temperature slider. That might not be enough. And if it is not enough, your options are limited — you cannot fine-tune their model, you cannot modify their retrieval pipeline, you cannot change their output format beyond what the API exposes.
Pricing changes. The vendor’s pricing today is not their pricing next year. API costs drop — good for you. But platform fees, enterprise tiers, and per-seat pricing tend to move in the other direction. You are betting on the vendor’s incentives aligning with yours over a multi-year horizon. Sometimes they do. Sometimes they do not.
The hybrid approach that usually works
The answer for most teams is neither pure build nor pure buy. It is: buy the foundation, build the last mile.
Use a vendor for the base model — the language model, the embedding model, the reranking model. These are commodities. They are getting cheaper and better every quarter. Building your own foundation model is almost certainly not a good use of your resources unless you are a very large company with very specific requirements.
Build the parts that are specific to your business — the data pipeline that feeds your domain knowledge into the system, the evaluation framework that measures performance on your use cases, the integration layer that connects the model to your systems, the prompt engineering that encodes your business logic.
This is where the value is. The base model is the same for everyone. The last mile — the data, the evals, the integration, the prompts — is what makes your system work for your business. You own that. The vendor owns the commodity underneath.
In practice, this means you might use OpenAI or Anthropic for the model, build your own retrieval pipeline with your data, write your own evaluation suite with your domain experts, and maintain your own prompt library that encodes your business rules. The vendor provides the intelligence. You provide the judgment.
The honest self-assessment
Before you decide, answer one question honestly: do you have the team to build and maintain this for 3 years?
Not build it once. Build it, maintain it, improve it, debug it, upgrade it, and keep it running — for 3 years. Because that is the minimum commitment. AI systems are not projects. They are products. They need ongoing investment. They need people who understand them. They need a roadmap.
If you have a team of 2 ML engineers and they are also doing data science for the marketing team, you do not have the team to build. If you have a team of 6 with dedicated ML engineering and MLOps capability, you might.
The question is not “can we build it?” Teams can build almost anything given enough time. The question is “can we build it and maintain it better than a vendor can, while also doing everything else we need to do?” For most teams, the honest answer is no. And that is fine. That is what vendors are for.
The decision matrix
Ask these four questions. If you answer “yes” to 3 or more, build. Otherwise, buy the foundation and build the last mile.
- Is the AI your core product differentiation — the reason customers choose you?
- Do you have a dedicated team (3+ engineers) who will own this for 3+ years?
- Do you have data or domain constraints that make vendor solutions unworkable?
- Is the total cost of building (including maintenance, on-call, upgrades) less than 2x the vendor cost?
Most teams answer “yes” to 1 or 2 of these. That is a buy signal, not a build signal. The sooner you accept that, the sooner you ship.
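The matrix above is mechanical enough to write down as code, purely for illustration:

```python
# The four-question decision matrix as a function: 3+ "yes" answers is
# a build signal, anything less is a buy-the-foundation signal.
def build_or_buy(is_core_differentiator: bool,
                 has_dedicated_team: bool,
                 vendors_unworkable: bool,
                 build_cost_under_2x_vendor: bool) -> str:
    yes_count = sum([is_core_differentiator, has_dedicated_team,
                     vendors_unworkable, build_cost_under_2x_vendor])
    return "build" if yes_count >= 3 else "buy the foundation, build the last mile"
```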
tl;dr
The pattern. Teams spend months debating build-vs-buy while doing neither — burning runway on indecision instead of shipping.
The fix. Build when the AI is your product, buy when it is a feature — and for most teams, the right move is to buy the foundation model and build the last mile of data, evals, and integration.
The outcome. You ship in weeks instead of quarters, your engineers work on your actual product, and you preserve the ability to switch vendors when the market moves.