The AI team that reported to product shipped. The one that reported to research didn't.
Reporting structure determines what gets built. AI teams that report to product build products. AI teams that report to research build papers. Choose the one you need.
We have worked with about a dozen AI teams over the past few years. The single strongest predictor of whether a team ships production AI is not its talent, its budget, or its model choice. It is who the team reports to.
Teams that report to a product org ship products. Teams that report to a research org ship papers, prototypes, and demos that never quite make it to production. This is not a judgment on research — it is an observation about incentive alignment. And most companies get it wrong.
The reporting structure shapes everything
Your reporting structure determines four things that matter more than any technical decision your AI team will make.
What gets prioritized. A product-reporting team prioritizes features that users need. A research-reporting team prioritizes problems that are technically interesting. These overlap sometimes. They diverge often. When they diverge, the reporting structure breaks the tie.
How success is measured. Product teams are measured on shipped features, user adoption, and business metrics. Research teams are measured on publications, novelty, and technical depth. An AI team that reports to research will naturally optimize for work that is novel and publishable. Production reliability is neither novel nor publishable.
How long work takes. Product orgs have release cycles. Sprints. Deadlines tied to revenue. Research orgs have horizons. Quarters. Goals measured in papers submitted and benchmarks beaten. The cadence is different. The urgency is different. A product team that needs an AI feature in six weeks will get it from a product-reporting AI team. A research-reporting AI team will say six weeks is not enough time to do it properly.
Who the team hires. Product-reporting AI teams hire ML engineers who can write production code, operate services, and debug at 3am. Research-reporting AI teams hire researchers who can publish, present at conferences, and push the state of the art. Both are valuable. They are not interchangeable.
The pattern we keep seeing
Here is how it typically plays out.
A company decides to invest in AI. They hire a senior researcher — someone with a strong publication record, maybe from a major lab. This person is given a team and a mandate: “build AI capabilities for the company.”
The researcher does what researchers do. They hire other researchers. They set up a research agenda. They pick interesting problems. They build prototypes. The prototypes are impressive. The demos go well. Leadership is excited.
Then someone asks: “When does this ship?”
The answer is always some version of “it’s not quite ready.” The model needs more fine-tuning. The accuracy isn’t high enough. The edge cases are tricky. These are legitimate technical concerns. They are also the concerns of a team that is optimizing for correctness over shipping.
Meanwhile, a product team down the hall needs an AI feature. They cannot wait for the research team’s timeline. They hire an ML engineer, use an off-the-shelf model, build a quick eval, and ship something in three weeks. It is not as sophisticated as what the research team is building. It works. Users like it. It generates revenue.
Six months later, the research team’s prototype still hasn’t shipped. The product team’s scrappy feature is in production, handling real traffic, getting better with every iteration. Leadership starts asking hard questions about the research team’s ROI.
This is not a failure of talent. It is a failure of org design.
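To make the product team’s “quick eval” concrete, here is a minimal sketch, assuming the feature is something like support-ticket triage and that `classify_ticket` is whatever thin wrapper the team writes around its off-the-shelf model. The names and cases are hypothetical; the point is that a useful eval can be a few dozen hand-labeled examples and a pass-rate gate, not a research-grade benchmark.

```python
# Minimal eval harness: a handful of labeled cases, a pass-rate gate, and
# nothing else. `classify` stands in for whatever wrapper the team writes
# around an off-the-shelf model (all names here are hypothetical).
from typing import Callable

# Hand-labeled examples pulled from real traffic; a few dozen is plenty to start.
EVAL_CASES = [
    {"input": "I was charged twice this month", "expected": "billing"},
    {"input": "The app crashes when I upload a file", "expected": "bug"},
    {"input": "How do I add a teammate to my plan?", "expected": "how_to"},
]

def run_eval(classify: Callable[[str], str], threshold: float = 0.85) -> bool:
    """Return True if accuracy on the labeled cases meets the shipping bar."""
    correct = sum(
        1 for case in EVAL_CASES if classify(case["input"]) == case["expected"]
    )
    accuracy = correct / len(EVAL_CASES)
    print(f"accuracy: {accuracy:.0%} ({correct}/{len(EVAL_CASES)})")
    return accuracy >= threshold

# In CI: fail the build if a prompt or model change drops below the bar.
# assert run_eval(classify_ticket)
```

Run it on every prompt or model change. The threshold is a product decision, not a paper result, which is exactly why a product-reporting team can ship in three weeks.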
The asymmetry
Here is the nuance that matters: you can embed a research function inside a product-reporting team, but you cannot embed a shipping function inside a research-reporting team.
A product-reporting AI team can allocate 20% of its time to exploratory research. Some sprints, an engineer investigates a new technique. They prototype it. If it works, it goes into the next sprint’s production backlog. If it doesn’t, the team learned something and moves on. This works because the default mode is shipping. Research is a controlled departure from the default.
A research-reporting AI team cannot allocate 20% of its time to “just ship something.” The culture, the incentives, the hiring profile — all of it resists production work. Shipping is not a controlled departure from their default. It is a fundamentally different mode of operating that the team is not staffed or incentivized for.
This asymmetry means the product-reporting structure strictly dominates for companies that need production AI. You get shipping by default and research as an option. The reverse gives you research by default and shipping as an aspiration.
The exceptions
Two situations where a research-reporting structure makes sense.
You are building foundation models. If your core product is the model itself — if you are OpenAI, Anthropic, or a similar lab — then research is the product. The reporting structure aligns because the research output is what ships. This does not apply to 95% of companies.
You have a genuine long-horizon research need. Some companies need to solve problems where no off-the-shelf solution exists. Drug discovery. Materials science. Autonomous systems. These require multi-year research programs. If this is your situation, a research-reporting structure is appropriate. But be honest about whether your AI needs are truly in this category. Most are not. Most companies need to apply existing models to their data, not invent new ones.
How to restructure
If you have a research-reporting AI team and you need production AI, here is the migration path we have seen work.
Step 1. Move the team’s reporting line to a product leader. VP of Engineering or VP of Product — someone who owns a P&L or a product roadmap.
Step 2. Change the team’s success metrics. Replace publications and benchmarks with shipped features, user adoption, and production reliability. Do this explicitly and in writing.
Step 3. Expect turnover. Some researchers will leave. This is not a failure. They joined a research team, and the team is becoming a product team. The ones who stay are the ones who want to ship. These are the people you want.
Step 4. Backfill with ML engineers. People who have run models in production. People who know what an SLA is. People who have been on-call for an ML system.
Step 5. Keep a research allocation. 10–20% of team time for exploratory work. This retains the researchers who stayed and keeps the team’s technical edge. But it is time-boxed and it reports up through product.
This transition takes about a quarter. It is uncomfortable. It works.
The hybrid that doesn’t work
Some companies try to solve this with a matrix structure — the AI team reports to both research and product. Dotted lines. Dual metrics. Shared goals.
We have never seen this work. Matrix structures create ambiguity about priorities. When the research lead wants the team to spend a sprint investigating a new embedding architecture and the product lead wants them to ship a retrieval feature, who wins? In a matrix, the answer is “whoever argues longer.” In a clear reporting structure, the answer is obvious.
Pick one. Make it product. You will ship more and regret less.
The heuristic
If your AI team has been operating for more than six months and nothing is in production, check the reporting structure. If the team reports to research, move it to product. If it already reports to product and still hasn’t shipped, you have a different problem — but at least you can see it clearly.
Reporting structure is not a detail. It is the decision that determines all the other decisions. Get it right first.
tl;dr
The pattern. AI teams that report to research optimize for novelty and correctness and never quite ship, while teams that report to product are forced to build things that work well enough to go live.
The fix. Move your AI team’s reporting line to a product leader, change the success metrics from publications to shipped features, and protect a 10–20% research allocation for exploratory work.
The outcome. You get production AI by default, and the research allocation you kept maintains the technical edge that stops your product from going stale.