
Enterprise AI Ends at the Demo

The Zaro Team · July 1, 2026 · 7 min read

The demo always lands. Three months later, nobody uses it.

The pattern repeats enough times that it stops being surprising. What is surprising is how consistently it gets diagnosed wrong.

The conversation after a failed rollout almost always goes the same way. The model was not good enough. The use case was too niche. People are resistant to change.

All of that might be true at the margin. None of it is the actual problem.

The actual problem is that there is nothing beneath the surface.

Broken Loops

Picture a sales team. They've just had an impressive demo of an AI assistant. It can summarise calls, draft follow-ups, pull deal data. In the room, everyone can see it working. Somebody says "imagine if this knew our playbook." Somebody else says "or our customer history." The room gets excited.

Then the rollout happens.

The agent shows up with no memory of the last conversation. It doesn't know which accounts are sensitive. It can't see the notes from last quarter's QBR. It doesn't know that a particular enterprise prospect had a bad experience in 2024 and needs careful handling.

Every session starts from zero.

The sales rep has to re-explain context every time, which means the agent is not saving time. In fact, it's creating a new kind of manual work.

The rep goes back to the spreadsheet. The tool gathers dust.

This is not a model failure. The model was fine. The feedback loop broke at the organisational boundary, because there was no shared place for organisational knowledge to live and persist. The agent was operating in a vacuum.

It's The Same Story

Not every company is at this point.

If you've never tried AI at scale, this reads as an abstract warning. If you are building everything yourself, you already know it.

The companies that feel it most acutely are in the middle. They deployed a tool: an enterprise AI assistant across the workforce, a pilot with a particular team, a custom solution someone stood up in a few months. They got early enthusiasm. A handful of heavy users. Then adoption plateaued. And you can see why.

Each department, each team, becomes an island. The AI knows the finance team's processes, but can't see what the ops team learned. A salesperson prepares for a call with context from her own account history, but the AI can't access the institutional knowledge that someone in another region built up. The tool is context-blind to everything outside its silo.

[Figure: Sales, Engineering, Marketing, and Operations sit in separate boxes with the connections between them crossed out. Caption: "You just get islands."]

The people who are not using it say the same things: it doesn't know our business. It gives generic answers. I still have to do the actual thinking.

These companies have something the AI-curious do not: a comparison point. They have tried, and they know something is not working. The problem is rarely the model.

An Infrastructure Not Built

The model half of AI got radically better, roughly every six months, for years. Capabilities improved faster than most organisations could keep track of, much less stay on top of.

The infrastructure half barely moved.

Shared context, orchestration, governance: these are the things that allow AI to compound across an organisation rather than reset every session. And almost nobody built them.

At Convergence, and then at Salesforce on Agentforce, our co-founders Michael Bajwa and Qian Zheng watched this up close.

The demos were always impressive. Genuinely. The models could do remarkable things in controlled conditions. The deployment story was a different conversation. Every enterprise had the same questions: where does our data live? How do we control what the agent can see? What happens when two agents work on the same problem and contradict each other? Who audits the outputs?

There were no clean answers, because the infrastructure did not exist in a coherent form. Every team was bolting something together. The model wasn't the bottleneck. It hadn't been for quite some time. The bottleneck was everything else.

Layers Matter

AI infrastructure needs three things to work organisation-wide.

The first, and most important, is shared context. Think of it as a community pot. Every agent reads from it. Every agent writes to it.

It's versioned, so you can see how knowledge changed over time. It's permissioned, so the right people and agents can access the right things. It's model-agnostic, so organisations are not locked into one provider.

[Figure: Slack, Salesforce, Notion, HubSpot, and Drive feed into one versioned, permissioned context layer that Apps, Shared Memory, and Agents all read and write. Caption: "One living context layer."]

It holds everything a company actually knows: processes, customer history, institutional memory, the things that currently live in someone's head or in a document nobody can find. When an agent does something useful, that goes into the pot. When a human corrects an output, that goes in too. The system gets smarter because it is accumulating, not resetting.
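To make the "versioned" and "permissioned" properties concrete, here is a minimal sketch of such a pot in Python. It is an illustration under stated assumptions, not a description of how any real product implements its context layer: the `ContextStore` class, its method names, the role strings, and the `accounts/acme/handling` key are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """One immutable revision of a context entry."""
    number: int
    value: str
    author: str


@dataclass
class ContextStore:
    """Toy shared context layer: versioned and permissioned (hypothetical design)."""
    # key -> full revision history, oldest first
    _history: dict[str, list[Version]] = field(default_factory=dict)
    # key -> roles allowed to read or write that key
    _acl: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, key: str, role: str) -> None:
        """Allow a role to read and write one key."""
        self._acl.setdefault(key, set()).add(role)

    def _check(self, key: str, role: str) -> None:
        if role not in self._acl.get(key, set()):
            raise PermissionError(f"{role!r} may not access {key!r}")

    def write(self, key: str, value: str, author: str, role: str) -> int:
        """Append a new revision; earlier revisions are never overwritten."""
        self._check(key, role)
        revisions = self._history.setdefault(key, [])
        revisions.append(Version(len(revisions) + 1, value, author))
        return revisions[-1].number

    def read(self, key: str, role: str, version: int | None = None) -> str:
        """Return the latest revision, or a specific earlier one (1-based)."""
        self._check(key, role)
        revisions = self._history[key]
        chosen = revisions[-1] if version is None else revisions[version - 1]
        return chosen.value


store = ContextStore()
store.grant("accounts/acme/handling", "sales")

# An agent writes first, then a human correction lands as a new revision.
store.write("accounts/acme/handling", "Standard renewal playbook.", "agent:briefing", "sales")
store.write("accounts/acme/handling", "Bad experience in 2024; handle carefully.", "human:rep", "sales")

latest = store.read("accounts/acme/handling", "sales")
first = store.read("accounts/acme/handling", "sales", version=1)
```

Nothing in the sketch depends on which model reads the entries, which is the sense in which a store like this stays model-agnostic. Appending revisions rather than overwriting them is what lets you see how knowledge changed over time.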

This is what Qian Zheng, our CTO, means when he talks about context. It is the sum of connected tools, data, and organisational memory that an agent operates within. Prompt engineering sits at the surface. Retrieval is part of the mechanism. Context is the layer underneath both: designed, maintained, built by the whole organisation one interaction at a time.

The second layer is orchestration. Different tasks need different models. Running everything through a frontier model means paying frontier prices for work a lighter model handles perfectly well. With proprietary model routing, organisations can achieve roughly a 10x cost reduction without meaningful quality loss. For enterprise AI to scale, the economics have to work.
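The economics are easier to reason about with the arithmetic written out. The sketch below is a deliberately naive rule-based router, not the proprietary routing mentioned above. The model names, per-task prices, task kinds, and workload mix are all hypothetical numbers chosen only to make the calculation visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_task: float  # hypothetical unit price, not a real quote


# Hypothetical tiers and prices, for illustration only.
LIGHT = Model("light-model", 0.01)
FRONTIER = Model("frontier-model", 0.20)

# Hypothetical task kinds assumed simple enough for the lighter tier.
LIGHT_TASKS = {"summarise_call", "draft_follow_up", "pull_deal_data"}


def route(task_kind: str) -> Model:
    """Send routine work to the light tier, everything else to the frontier tier."""
    return LIGHT if task_kind in LIGHT_TASKS else FRONTIER


def total_cost(task_kinds: list[str], always_frontier: bool = False) -> float:
    """Sum per-task cost with routing, or with every task on the frontier tier."""
    return sum(
        (FRONTIER if always_frontier else route(kind)).cost_per_task
        for kind in task_kinds
    )


# Hypothetical workload: mostly routine tasks, a few that need the frontier tier.
workload = ["summarise_call"] * 60 + ["draft_follow_up"] * 35 + ["pricing_strategy"] * 5

routed = total_cost(workload)
baseline = total_cost(workload, always_frontier=True)
print(f"routed={routed:.2f} baseline={baseline:.2f} ratio={baseline / routed:.1f}x")
```

The ratio that falls out depends entirely on the assumed price gap and on how much of the workload is genuinely routine. Change either input and the saving changes with it.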

The third layer is application building. Existing files and workflows become the inputs. The output is something that fits how the team actually works: a CRM tracker that knows the sales methodology, a briefing tool that knows which clients need which framing, a dashboard that understands the escalation logic. Applications built on specific context, not generic features that require adaptation.

The point that matters most: agents and applications operate on the same context layer simultaneously. Every interaction enriches it, so the loop closes. Instead of isolated islands, you've created one continent, and intelligence compounds across every interaction rather than resetting.

The Compound Benefit

Take a traditional organisation: a thousand people, somewhere in professional services or pharma or financial services. They have deployed enterprise AI tools across the workforce. And they are running into a familiar wall.

Adoption has plateaued well below projections. The heavy users love it. The rest say the same things: the context is siloed by user, there's no shared knowledge, governance is a spreadsheet someone in IT maintains manually. The per-seat cost is real and growing. The compound value is not materialising.

In traditional industries, there is an additional pressure: being publicly seen as behind on AI carries its own cost, independent of ROI. Companies deploy tools partly to signal capability, not just to use them. That pressure is real. But signal without infrastructure closes nothing.

Now flip the scenario. Same organisation. The infrastructure is right.

A junior analyst runs a competitive briefing. The agent pulls from market data, from proprietary client notes, from the pricing strategy documented last quarter. The output is specific enough to be useful. The analyst corrects one section. That correction goes back into the context layer. Next time someone runs a similar briefing, the starting point is better. The analyst who left six months ago? Their institutional knowledge did not leave with them. It persisted.
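The loop in that scenario can be sketched in a few lines of Python. This is a toy under stated assumptions: the in-memory `context` dictionary stands in for the shared context layer, and the function names, the `competitor-pricing` topic, and the note text are all hypothetical.

```python
from collections import defaultdict

# topic -> accumulated notes and corrections (stand-in for the shared context layer)
context = defaultdict(list)


def run_briefing(topic: str) -> str:
    """Build a draft from whatever the context layer currently holds for the topic."""
    notes = context[topic]
    return "\n".join(notes) if notes else "(no context yet)"


def record_correction(topic: str, author: str, correction: str) -> None:
    """Persist a human correction so that later runs start from it."""
    context[topic].append(f"{correction} (corrected by {author})")


# Existing organisational knowledge for the topic.
context["competitor-pricing"].append("Competitor raised list prices last quarter.")
first_draft = run_briefing("competitor-pricing")

# The analyst corrects one section; the correction is written back, not discarded.
record_correction("competitor-pricing", "analyst", "Increase applied to new contracts only.")

# The next person to run a similar briefing starts from the corrected baseline.
second_draft = run_briefing("competitor-pricing")
```

The second draft contains everything the first did plus the correction, whoever runs it. That is the whole difference between accumulating and resetting.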

A salesperson prepares for a renewal meeting. The agent knows the account history, the previous objections, the internal notes from customer success, the deals that went sideways with similar customers. The prep takes ten minutes instead of two hours. The next salesperson to work that account inherits everything.

[Figure: A chart of effort, time, and error risk against complexity of the work. Zaro's line stays flat while everyone else's climbs steeply. Caption: "Our edge widens as work gets more complex."]

The organisations that feel this most acutely are not always the largest. The sharper signal is how many disconnected data sources they are running on. A company holding together a CRM, an ERP, project management, email, and documentation platforms feels this pain regardless of headcount. The number of seams matters more than the size.

The value scales not with headcount but with the number of connected systems and workflows you operate on. The more systems, outputs, and people feed the layer, the wider Zaro's value wedge grows, which is why organisations with complex infrastructure benefit most.

Every interaction makes the next one smarter. That is compounding. It doesn't happen automatically. It happens when the layer underneath is built to make it possible.

The Shared Secret

There is a discipline here that the industry has been practising without publicising it.

Context engineering. The practice of designing, managing, and compounding the information AI systems act on.

Prompt engineering feeds into it. Data science feeds into it. But context engineering is the ongoing work of deciding what AI systems should know, how that knowledge should be structured, who can access and modify it, and how it accumulates over time. It's the difference between an AI that performs in a demo and an AI that compounds across an organisation.

What makes context engineering the decisive variable in enterprise AI is that model quality has stopped being the bottleneck. A good model with poor context produces poor outputs. A mediocre model with excellent context can produce excellent ones. The model is the thing everyone can see and compare and benchmark. The context layer is invisible until it is not. And once it is not, you will wonder how you ever operated without it.

Every team is currently running on tools that are almost just right, held together with manual work and the quiet agreement that this is just how it is. That agreement is getting old.

The teams that define what comes next will not be the ones that bought the most expensive software. They will be the ones that built exactly what they needed, and built the infrastructure to make it compound.

That work starts with context. It always did.