The integration is harder than the model
Getting the model to produce the right output takes a week. Integrating that output into your existing systems, workflows, and user experience takes a quarter.
Here is how every AI project goes. Week 1: the team gets the model working in a notebook. The demo is impressive. The output is good. Everyone is excited. The PM asks when it can ship.
Then integration starts. And the project takes 3 more months.
Getting the model to produce the right output is the easy part. Taking that output and fitting it into your existing systems — your UI, your database, your permissions model, your audit trail, your error handling, your user workflows — that is the actual project. The model is the ingredient. The integration is the meal.
Where the time goes
The model returns a JSON blob. Now what? Here is a partial list of things that need to happen before a user sees anything:
UI integration. Where does the output appear? Does it replace an existing element or augment it? What does it look like when the model is loading? What does it look like when the model fails? What does it look like when the model returns low-confidence results? Does the UI need to support streaming? Does it need to show sources or citations? Does it need an edit/regenerate flow? Each of these is a design decision that requires mockups, review, and implementation.
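The states listed above can be made concrete by enumerating them as types, which turns "show the model output" into a reviewable design artifact. A minimal sketch — every name here is illustrative, not from any particular framework:

```python
from dataclasses import dataclass
from typing import Union

# One class per UI state the AI feature can be in. Each state is a
# distinct design decision: mockup, review, implementation.

@dataclass
class Loading:
    streaming_partial: str = ""  # text received so far, if the UI streams

@dataclass
class Success:
    text: str
    sources: list[str]           # citations, if the UI shows them
    editable: bool = True        # whether an edit/regenerate flow applies

@dataclass
class LowConfidence:
    text: str
    confidence: float            # below some product-defined threshold

@dataclass
class Failed:
    user_message: str            # what the user actually sees

AIOutputState = Union[Loading, Success, LowConfidence, Failed]

def render_label(state: AIOutputState) -> str:
    """Map each state to a distinct UI treatment (stubbed as labels)."""
    if isinstance(state, Loading):
        return "streaming" if state.streaming_partial else "spinner"
    if isinstance(state, Success):
        return "answer-card"
    if isinstance(state, LowConfidence):
        return f"answer-card-with-warning ({state.confidence:.0%})"
    return "error-banner"
```

The point of the exercise is exhaustiveness: if a state has no branch in `render_label`, nobody has designed it yet.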
Workflow integration. How does the AI output interact with the user’s existing workflow? If the model suggests an action, does the user approve it first or does it happen automatically? If the user edits the output, is the edit tracked? Does the AI output feed into a downstream process — an approval chain, a notification, a report? Does the workflow change depending on the confidence level?
Data integration. Where is the output stored? Is it a first-class entity in your data model or a side effect? Does it need to be queryable? Searchable? Reportable? Does it need to join with existing tables? What is the schema? What happens to the output when the user deletes the input that triggered it?
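What "first-class entity" means in practice: the output gets its own table, a real schema, and an explicit answer to the deletion question. A sketch using in-memory SQLite — table and column names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for cascades
conn.execute("""
    CREATE TABLE document (
        id   INTEGER PRIMARY KEY,
        body TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE ai_summary (
        id            INTEGER PRIMARY KEY,
        document_id   INTEGER NOT NULL
            REFERENCES document(id) ON DELETE CASCADE,  -- delete input -> delete output
        model_version TEXT NOT NULL,                    -- needed for audit joins later
        summary       TEXT NOT NULL,
        created_at    TEXT DEFAULT CURRENT_TIMESTAMP
    )""")

conn.execute("INSERT INTO document (id, body) VALUES (1, 'quarterly report text')")
conn.execute(
    "INSERT INTO ai_summary (document_id, model_version, summary) "
    "VALUES (1, 'summarizer-v3', 'Revenue grew.')"
)

# Queryable and joinable like any other entity:
row = conn.execute(
    "SELECT d.id, s.summary FROM ai_summary s JOIN document d ON d.id = s.document_id"
).fetchone()

# Deleting the input answers the lifecycle question explicitly:
conn.execute("DELETE FROM document WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM ai_summary").fetchone()[0]
```

The `ON DELETE CASCADE` is one possible answer, not the only one — some products keep orphaned outputs for audit purposes. The point is that the schema forces you to choose.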
Permissions. Who can see the AI output? Is it the same permission model as the input data? What if the model references documents the user doesn’t have access to? What if the model’s training data includes information that should be restricted?
Audit trail. In regulated industries — and increasingly in non-regulated ones — you need to know why the system produced a given output. What model version was used? What prompt? What context was retrieved? What was the input? All of this needs to be logged, stored, and retrievable. That’s a data pipeline.
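The minimum record needed to answer "why did the system produce this output?" can be sketched as a serializable structure — field names here are illustrative, and a real pipeline would add timestamps and retention policy:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AuditRecord:
    request_id: str
    model_version: str            # the exact pinned model, not "latest"
    prompt_template: str          # which prompt version produced this
    retrieved_context: list[str]  # document ids fed into the context window
    user_input: str
    output: str

    def to_log_line(self) -> str:
        """Serialize to one JSON line for the logging pipeline."""
        return json.dumps(asdict(self), sort_keys=True)

record = AuditRecord(
    request_id="req-123",
    model_version="summarizer-v3.2",
    prompt_template="summarize_v5",
    retrieved_context=["doc-7", "doc-9"],
    user_input="Summarize this contract.",
    output="The contract renews annually.",
)
line = record.to_log_line()
restored = AuditRecord(**json.loads(line))  # retrievable, not just stored
```

"Retrievable" is the part teams skip: writing the log line is easy, but being able to reconstruct the full request months later is what the compliance team actually asks for.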
Error handling. What happens when the model fails? Not just API errors — what about when the model returns a valid response that’s wrong? Who decides it’s wrong? What does the fallback look like? Does the user see an error state, or do they just not see the AI feature? What about partial failures — the model succeeded but the guardrail blocked the output?
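These failure modes are product decisions, not exceptions to catch. A sketch of making each one an explicit policy — all of the choices below (degrade silently on API error, show a message on guardrail block) are illustrative, not recommendations:

```python
from enum import Enum

class Outcome(Enum):
    OK = "ok"
    API_ERROR = "api_error"            # the call itself failed
    GUARDRAIL_BLOCKED = "blocked"      # model succeeded, guardrail vetoed the output
    LOW_CONFIDENCE = "low_confidence"  # valid response, but possibly wrong

def user_facing(outcome: Outcome, text: str = "") -> dict:
    """Decide what the user sees for each failure mode.

    One design decision worth making deliberately: an API error here
    degrades to hiding the feature, while a guardrail block gets an
    explicit message, so the two partial-failure modes are
    distinguishable in support tickets.
    """
    if outcome is Outcome.OK:
        return {"show": True, "text": text, "warning": None}
    if outcome is Outcome.LOW_CONFIDENCE:
        return {"show": True, "text": text, "warning": "AI-generated; please verify."}
    if outcome is Outcome.GUARDRAIL_BLOCKED:
        return {"show": True, "text": "", "warning": "Suggestion withheld by policy."}
    # API_ERROR: silently fall back to the non-AI experience
    return {"show": False, "text": "", "warning": None}
```

Writing this table down — even as a doc, not code — is usually the fastest way to surface the "who decides it's wrong?" question with the product team.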
Each of these is a workstream. Each one has edge cases. Each one requires coordination with teams that don’t report to you — design, infrastructure, security, compliance. The model was built by your team in a week. The integration requires half the engineering org.
Why teams underestimate this
Three reasons.
The demo was fast. The team showed a working model in week 1. Stakeholders anchored on that speed. “The hard part is done” — except it wasn’t. The hard part hadn’t started yet. The demo proved the model works. It did not prove the product works.
The model is the novel part. Integration is boring. It’s plumbing. It’s the same kind of work the team does for any feature — API endpoints, database migrations, UI components, permission checks. Nobody gets excited about a database migration. So nobody talks about it in planning. So nobody accounts for it in the timeline.
The edge cases are invisible. When you’re building the model, the happy path is all you see. The model takes input, produces output, the output is good. But in production, inputs are messy, outputs are sometimes wrong, users do unexpected things, and the system needs to handle all of it gracefully. The edge cases don’t surface until integration, and each one takes longer than you think.
Scope the integration first
The fix is counterintuitive: scope the integration before you scope the AI.
Before you pick a model, before you write a prompt, before you build an eval — map the integration surface. Ask:
- Where does the output appear in the UI? Draw it.
- What is the data model? Schema it.
- What permissions apply? Write them down.
- What is the error handling strategy? Define the states.
- What audit requirements exist? List them.
- What downstream systems consume the output? Enumerate them.
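One way to run the exercise is to capture the checklist as a structured artifact with a completeness check, so the project visibly does not get a model until every question has an answer. The class below is hypothetical — the fields simply mirror the list above:

```python
from dataclasses import dataclass, field, fields

@dataclass
class IntegrationSpec:
    ui_mockup: str = ""            # link to the drawing
    data_schema: str = ""          # link to the schema or DDL
    permissions: str = ""          # who can see the output
    error_states: str = ""         # defined states, incl. guardrail blocks
    audit_requirements: str = ""   # what must be logged and retrievable
    downstream_consumers: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Names of checklist items nobody has answered yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

spec = IntegrationSpec(
    ui_mockup="(link to mockup)",
    permissions="same ACL as the source document",
)
gaps = spec.missing()  # the remaining scoping work, made visible
```

A shared doc works just as well; the mechanism matters less than the rule that an empty field blocks the kickoff.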
This exercise takes a day. It will save you a month. It reveals the actual scope of the project — not the model scope, the product scope. And it often reveals that the integration constraints should influence the model design.
If the permissions model requires that the AI output never references documents the user can’t access, that’s a retrieval constraint — you need to filter at retrieval time, not at output time. If the audit trail requires reproducibility, that constrains your model choice — you need deterministic outputs, which means temperature zero and version-pinned models. If the UI needs to support edit/regenerate, that constrains your output format — you need structured output that the user can modify, not a wall of text.
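The retrieval-time filtering point is worth seeing concretely: the ranking step only ever sees documents the requesting user can read, so the model cannot reference something it never saw. A toy sketch — the ACL shape and the term-overlap scoring are illustrative stand-ins for a real retriever:

```python
def retrieve(query: str, user_id: str, index: list[dict],
             acl: dict, k: int = 3) -> list[dict]:
    # 1. Permission filter FIRST, before any relevance scoring.
    #    Filtering after ranking (or worse, after generation) leaks
    #    restricted content into the context window.
    readable = [doc for doc in index if user_id in acl.get(doc["id"], set())]
    # 2. Rank only the readable subset (toy scoring: term overlap).
    terms = set(query.lower().split())
    scored = sorted(
        readable,
        key=lambda d: len(terms & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

index = [
    {"id": "d1", "text": "salary bands for engineering"},
    {"id": "d2", "text": "engineering onboarding guide"},
]
acl = {"d1": {"hr_admin"}, "d2": {"hr_admin", "new_hire"}}

visible = retrieve("engineering guide", "new_hire", index, acl)
```

The restricted document never enters the candidate set, so no output-time check is needed — which is exactly why this is a retrieval constraint, not a generation constraint.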
The integration shapes the AI, not the other way around.
The timeline heuristic
Here is the rough breakdown we see across projects:
- Model selection and prompt engineering: 1-2 weeks.
- Eval suite development: 1-2 weeks.
- Integration — UI, data, permissions, errors: 6-10 weeks.
- Testing, edge cases, polish: 2-4 weeks.
The model is 10-20% of the project. The integration is 50-70%. The rest is testing and polish.
If your project plan allocates equal time to “build the AI” and “integrate the AI,” you are going to miss your deadline. Double the integration estimate and you’ll be closer to reality.
The organizational pattern
The other thing that slows integration down is organizational. The AI team builds the model. A different team owns the UI. A third team owns the backend. A fourth team owns the data platform. Getting these teams to coordinate on a feature that cuts across all of them — that’s a project management problem, not a technical one.
The best pattern we’ve seen: embed the AI engineer on the product team for the duration of the integration. Not as a consultant who reviews PRs — as a team member who writes code, attends standups, and pairs with the frontend engineer on the streaming UI and the backend engineer on the error handling.
The worst pattern we’ve seen: the AI team builds a “model API” and throws it over the wall. The product team integrates it without understanding its failure modes. The result is a brittle integration that breaks in ways nobody anticipated because nobody on the product team understood the model’s behavior, and nobody on the AI team understood the product’s constraints.
The heuristic
The model is the demo. The integration is the product. Scope the integration first, staff the integration with your best engineers, and plan for the integration to take 3-5x longer than the model work. If your timeline doesn’t reflect this, it reflects wishful thinking.
tl;dr
The pattern. A working model in a notebook takes one week, stakeholders anchor on that speed as "the hard part is done," and nobody accounts for the UI states, data model, permissions, audit trail, error handling, and cross-team coordination that make up the actual product.

The fix. Scope the integration before you scope the AI — draw the UI, schema the data model, list the permissions and audit requirements — so the integration constraints shape the model design rather than fighting it after the fact.

The outcome. Projects that budget 50-70% of their time for integration ship on schedule instead of discovering a quarter of unplanned work the moment the model hands off to the product team.