What It Actually Takes to Build a Conversational AI Agent That Works

Most conversational AI projects fail because of architecture, not the model. A working teardown of what it takes to build an agent that resolves real enquiries, from someone who runs them in production.

Silviu Major, Founder, Fiveleaf

Most conversational AI projects do not fail because the language model was not good enough. They fail because nobody got the architecture right around it.

The model is the easy part now. A capable LLM is an API call away, and it has been for a while. The hard part is everything else: knowing who the customer is, reading the right knowledge at the right moment, taking an action in a system that was not built for AI, and handing a conversation to a human without the customer having to start again. That is where the work lives, and it is where most builds quietly come apart.

This is a teardown of what a working conversational AI agent is actually made of. Not the marketing version, the engineering version, written from the perspective of having built and run these things in production at a high-volume operation. If you are evaluating whether to build one, buy one, or have one built for you, this is the map of the territory.

What a conversational AI agent is, and is not

A conversational AI agent is software that holds a real conversation with a customer on a channel they already use, understands what they want, and does something about it. It listens, it understands, it acts.

It is not a decision-tree chatbot. That distinction matters more than any other in this article, so it is worth being precise.

A decision-tree bot follows a script you drew in advance. Press 1 for billing, press 2 for technical support. Every path is hand-built. The moment a customer says something the tree did not anticipate, the bot breaks, apologises, and offers to connect them to an agent. These bots typically resolve under 20% of the enquiries they receive. The rest is press-zero frustration and a worse experience than no bot at all, which is why so many businesses switched theirs off within a quarter of turning them on.

A conversational AI agent works the other way around. It understands intent from natural language, reasons about what the customer actually needs, pulls the relevant information, and responds. There is no tree to fall off. The customer types or says what they want in their own words, and the agent works with it.

This is not a matter of degree. It is a different category of thing. If you take one idea from this piece, take that one.

The four layers every working agent needs

A production agent is not a single program. It is four distinct layers doing four distinct jobs. Get all four right and the agent works. Get three right and it fails in ways that are maddening to debug, because the failure usually shows up as "the AI said something wrong" when the real problem was three layers down.

Layer one: the conversational layer

This is the part people think of as the AI. It is responsible for understanding what the customer means and deciding how to respond. It holds the conversation, tracks context across multiple turns, and manages the tone and personality of the agent.

The build decision here is not which model to use. It is how you structure the conversation. A common and costly mistake is to treat the whole thing as one giant prompt and hope the model holds it together. It will not, reliably, at volume. Better builds break the agent into defined flows: a support flow, a sales flow, a verification flow, each with its own job, sharing a common spine. The customer never sees the seams. Internally, it is the difference between a system you can reason about and one you cannot.
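To make the idea of defined flows on a common spine concrete, here is a minimal sketch in Python. Everything in it is illustrative: the names Conversation, FLOWS, classify_intent and handle_turn are hypothetical, and the keyword check stands in for whatever model call would decide intent in a real build.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Conversation:
    """Shared spine: state that every flow can read and update."""
    customer_id: str
    verified: bool = False
    history: List[str] = field(default_factory=list)


# Every flow has one job and the same signature.
Flow = Callable[[Conversation, str], str]


def verification_flow(conv: Conversation, message: str) -> str:
    conv.verified = True  # placeholder: a real flow would check identity
    return "Thanks, you are verified."


def support_flow(conv: Conversation, message: str) -> str:
    return "Let me look into that issue for you."


def sales_flow(conv: Conversation, message: str) -> str:
    return "Happy to walk you through our plans."


FLOWS: Dict[str, Flow] = {
    "verification": verification_flow,
    "support": support_flow,
    "sales": sales_flow,
}


def classify_intent(message: str) -> str:
    """Stand-in for the model call that picks a flow (assumption)."""
    text = message.lower()
    if "price" in text or "plan" in text:
        return "sales"
    return "support"


def handle_turn(conv: Conversation, message: str) -> str:
    """Route one customer message to exactly one flow."""
    conv.history.append(message)
    intent = classify_intent(message)
    # In this sketch, support requires a verified customer first.
    if intent == "support" and not conv.verified:
        intent = "verification"
    reply = FLOWS[intent](conv, message)
    conv.history.append(reply)
    return reply
```

The point of the shape is that each flow can be tested and changed on its own, while the routing decision lives in one place instead of being buried in a single large prompt.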

Layer two: the knowledge layer

An agent that does not know your business is a generic chatbot wearing your logo. The knowledge layer is what makes the agent yours: your product information, your policies, your FAQs, your troubleshooting steps, your tone.

The naive version is to paste your help docs into a prompt. The working version ingests your knowledge base, your policy documents and your support history, keeps them current, and retrieves the right piece at the right moment rather than dumping everything into context every time. Think of it less as memorising a manual and more as a new hire who has read everything and knows where to look.
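As a rough illustration of retrieving the right piece rather than dumping everything into context, the sketch below scores each article by word overlap with the question and passes only the best match to the prompt. The word-overlap scoring is a deliberately simple stand-in (production systems more often use embedding search), and the sample articles, retrieve and build_prompt are all made up for the example.

```python
import re
from typing import List, Set, Tuple

# Made-up sample articles as (title, body) pairs.
KNOWLEDGE_BASE: List[Tuple[str, str]] = [
    ("returns", "Items can be returned within 30 days of delivery."),
    ("billing", "Invoices are issued on the first day of each month."),
    ("shipping", "Standard shipping takes three to five working days."),
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: List[Tuple[str, str]],
             top_k: int = 1) -> List[Tuple[str, str]]:
    """Return the top_k articles sharing the most words with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(body)), title, body)
              for title, body in docs]
    # Drop articles with no overlap at all.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(title, body) for _, title, body in scored[:top_k]]


def build_prompt(question: str, docs: List[Tuple[str, str]]) -> str:
    """Put only the retrieved snippets into the model's context."""
    snippets = retrieve(question, docs)
    context = "\n".join(body for _, body in snippets)
    if not context:
        context = "No matching article found."
    return f"Context:\n{context}\n\nCustomer question: {question}"
```

Calling build_prompt("When are invoices issued?", KNOWLEDGE_BASE) produces a prompt that contains the billing article and nothing about returns or shipping.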

The unglamorous truth about this layer is that it is mostly maintenance. Knowledge goes stale. Policies change. Prices move. An agent built on a knowledge base that was accurate at launch and never touched again will be confidently wrong within months. The build is a moment. The knowledge layer is forever.
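One small way to turn that maintenance into a routine is to record when each article was last reviewed and flag anything past a review window. The sketch below assumes a 90-day window and a hypothetical Article record; both are illustrative choices, not recommendations.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Article:
    title: str
    last_reviewed: date


def stale_articles(articles: List[Article], today: date,
                   max_age_days: int = 90) -> List[Article]:
    """Return the articles last reviewed before the cutoff date."""
    cutoff = today - timedelta(days=max_age_days)
    return [a for a in articles if a.last_reviewed < cutoff]


# Example run with made-up review dates.
articles = [
    Article("Returns policy", date(2024, 1, 10)),
    Article("Pricing", date(2024, 5, 20)),
]
overdue = stale_articles(articles, today=date(2024, 6, 1))
print([a.title for a in overdue])
```

A report like this does not keep the knowledge current by itself, but it gives someone a list to work through before the agent starts being confidently wrong.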

Layer three: the integration layer

This is where conversational AI stops being a clever chat window and starts being useful. The integration layer connects the agent to the systems where your business actually runs: your CRM, your billing platform, your helpdesk, your telephony.

Without it, the agent can talk but cannot do. It can tell a customer how to check their balance, but it cannot check it for them. With it, the agent can look up an account, verify identity, raise a ticket, book an appointment, apply a credit, or create a qualified lead in your CRM. The conversation becomes an outcome instead of a holding pattern.
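The difference between talking and doing can be sketched as a small registry of actions the agent is allowed to call. This is an assumption-laden illustration: FAKE_BILLING and TICKETS are in-memory stand-ins for a billing platform and a helpdesk, and get_balance, raise_ticket and run_action are hypothetical names, not any vendor's API.

```python
from typing import Callable, Dict, List

# In-memory stand-ins for real systems (assumptions for the sketch).
FAKE_BILLING: Dict[str, float] = {"ACC-1": 42.50}
TICKETS: List[dict] = []


def get_balance(account_id: str) -> str:
    """Look the balance up instead of telling the customer how to."""
    balance = FAKE_BILLING.get(account_id)
    if balance is None:
        return "I could not find that account."
    return f"Your balance is {balance:.2f}."


def raise_ticket(account_id: str, summary: str) -> str:
    """Record a ticket and report its number."""
    TICKETS.append({"account_id": account_id, "summary": summary})
    return f"Ticket {len(TICKETS)} raised."


# Only actions listed here can be executed by the agent.
ACTIONS: Dict[str, Callable[..., str]] = {
    "get_balance": get_balance,
    "raise_ticket": raise_ticket,
}


def run_action(name: str, **kwargs: str) -> str:
    """Execute a registered action, refusing anything unknown."""
    action = ACTIONS.get(name)
    if action is None:
        return "Sorry, I cannot do that yet."
    return action(**kwargs)
```

In a real build each function would wrap a call to the client's system, which is exactly where access and API quality start to determine the timeline.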

This layer is also where most build time goes, and where the realistic timelines come from. Connecting to a modern system with a clean API is fast. Connecting to a legacy platform, or one where a single internal developer controls every endpoint, is where projects slow down. If you are scoping a build, the honest question is not how good the AI is, but how accessible the systems it needs to touch are. That answer determines your timeline far more than the model does.

In our own deployments, the slowest part of a build has never been the AI. It has been waiting on access to a client-side system that one person owns and is too busy to prioritise. The conversational logic was usually ready in days. The integration to pull a customer's real account data was what set the actual go-live date. We have learned to scope that dependency first and to build everything we can in parallel around it, because the system access, not the model, is the critical path almost every time.

Layer four: the handover layer

No agent should handle every conversation, and any vendor who promises that is selling you a future apology. The mark of a good agent is not that it never escalates. It is that it escalates well.

A good handover layer does three things. It knows when to escalate: there is an explicit confidence threshold, and when the agent's confidence drops below it, the conversation goes to a human rather than the agent guessing. It routes to the right place, sending billing queries to billing and technical queries to technical, not all into one undifferentiated queue. And it hands over with full context, so the human picks up a conversation that already has the customer verified, the issue captured, and the transcript attached. The customer should never have to repeat themselves. Done right, they often do not even realise the conversation changed hands.
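Those three behaviours can be sketched in a few lines. The threshold of 0.7, the queue names, and the Handover and maybe_escalate names are all assumptions made for illustration; how confidence is actually measured is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CONFIDENCE_THRESHOLD = 0.7  # assumed value; tune per deployment

# Hypothetical topic-to-queue mapping.
QUEUES: Dict[str, str] = {
    "billing": "billing-team",
    "technical": "tech-support",
}
DEFAULT_QUEUE = "general"


@dataclass
class Handover:
    """Everything the human needs so the customer never repeats it."""
    queue: str
    customer_verified: bool
    issue_summary: str
    transcript: List[str]


def maybe_escalate(confidence: float, topic: str, verified: bool,
                   summary: str, transcript: List[str]) -> Optional[Handover]:
    """Return a Handover when confidence is below the threshold."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return None  # the agent keeps the conversation
    return Handover(
        queue=QUEUES.get(topic, DEFAULT_QUEUE),  # route to the right team
        customer_verified=verified,
        issue_summary=summary,
        transcript=list(transcript),  # copy so later turns do not alter it
    )
```

For example, maybe_escalate(0.4, "billing", True, "Disputed charge", ["Hi", "I was charged twice"]) returns a Handover bound for the billing queue with the verification status, summary and transcript already attached.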

A bad handover layer is worse than no agent. "I'll connect you to someone who can help," followed by a five-minute wait and a human who asks for the account number again, is precisely the experience that makes customers hate chatbots. The handover is not an afterthought. It is half the product.

Why most builds fail

Lay the four layers side by side and the common failure modes become obvious.

There is the build that is all model and no integration. It demos beautifully and does nothing. It can discuss your return policy with great fluency but cannot process a return. Customers figure this out fast, and the trust never comes back.

There is the build that is all script and no understanding. This is the decision-tree bot in a nicer coat. It works for the three questions the builder anticipated and falls apart on the fourth. Resolution rate stays low, and customers learn to type "agent" immediately.

There is the build that launched and was abandoned. Internal projects are especially prone to this. The specialist who built it leaves, or gets pulled onto something else, and the knowledge layer rots while the integration layer breaks against a system update nobody reconciled. A large share of internal AI builds reach roughly 60% complete and freeze there, becoming a liability nobody wants to touch.

And there is the build with no handover discipline. It escalates everything, so it saved nobody any work, or it escalates nothing, so it strands customers in loops. Either way the numbers do not justify the spend, and it gets switched off.

The pattern underneath all four is the same. Someone treated conversational AI as a feature to ship rather than a system to operate. It is the second thing, not the first.

Build, buy, or have it built

If you are weighing this up, there are three honest routes, and they suit different businesses.

You can build it yourself. This is viable if you have, and can keep, in-house AI engineering talent. The risk is not the initial build, it is the operating. The four layers all need ongoing attention, and the moment your specialist leaves, the system starts to decay. Build internally only if conversational AI is going to be a permanent internal competency, not a one-off project.

You can buy a platform. The major DIY conversational platforms give you a toolkit. The catch is that a toolkit needs someone to wield it. You are not buying a working agent, you are buying the means to build one, plus the obligation to run it. For a business with the team to do that, fine. For a great many operators, the platform sits half-configured.

Or you can have it built and run for you. Someone designs, builds, integrates, hosts and continuously tunes the agent as a service. You get the outcome without the internal team or the platform learning curve. The trade is that you are dependent on a partner, so choose one who stays embedded rather than one who hands over a build and disappears. This is the model we run at Fiveleaf, so treat that as disclosure rather than neutral advice, but the logic holds regardless of who does it.

There is no universally right answer. There is a right answer for your business, and it turns mostly on one question: is running a conversational AI agent going to be a core internal capability for you, or is it something you want handled? Answer that honestly and the route picks itself.

What "working" actually means

A working conversational AI agent is not the one with the most impressive demo. It is the one that, six months after launch, is still resolving a meaningful share of real enquiries, still escalating cleanly when it should, still accurate because someone is maintaining its knowledge, and still connected because someone is watching the integrations.

That is an operational standard, not a technical one. The technology to build a good agent has been available for a while. The discipline to run one well is still rare. That gap, between what is possible and what is actually running in production, is the whole game.
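An operational standard needs numbers someone looks at regularly. As a minimal sketch, assuming each conversation is logged with one of three hypothetical outcome labels ("resolved", "escalated" or "abandoned"), the two headline rates can be computed like this:

```python
from typing import Dict, List


def operating_metrics(outcomes: List[str]) -> Dict[str, float]:
    """Share of conversations resolved by the agent and share escalated."""
    total = len(outcomes)
    if total == 0:
        return {"resolution_rate": 0.0, "escalation_rate": 0.0}
    return {
        "resolution_rate": outcomes.count("resolved") / total,
        "escalation_rate": outcomes.count("escalated") / total,
    }


# Made-up sample: ten logged conversations.
sample = ["resolved"] * 6 + ["escalated"] * 3 + ["abandoned"]
print(operating_metrics(sample))
```

An escalation rate near one means the agent is saving nobody any work, and a rate near zero alongside many abandoned conversations suggests customers are being stranded. What counts as a healthy range depends on the business, so the labels and any targets here are assumptions.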

If you build with the four layers in mind, stay honest about your integration timelines, and treat the agent as a system you operate rather than a feature you shipped, you will end up in the small group of businesses whose conversational AI is still on, and still earning its keep, a year later.


Fiveleaf designs, builds and operates AI agents that run inside mid-market and enterprise businesses, fully integrated, branded, and continuously tuned in production. If you want a working agent without building an internal AI team, book a call.

Frequently asked

How long does it take to build a conversational AI agent?
For a first agent integrated into existing systems, a realistic timeline is four to eight weeks, depending almost entirely on how accessible the systems it needs to connect to are. Subsequent agents on the same foundation are much faster, typically one to three weeks.

What is the difference between a chatbot and a conversational AI agent?
A chatbot follows a pre-built decision tree and breaks when a customer says something unexpected, typically resolving under 20% of enquiries. A conversational AI agent understands natural language, reasons about intent, pulls real information, and takes actions in connected systems.

Why do most conversational AI projects fail?
Usually because of architecture, not the model. The common failures are agents that can talk but not act because they lack an integration layer, agents that only handle anticipated questions because they lack real understanding, agents that were launched and never maintained, and agents with poor escalation. The model is rarely the problem.

Can a conversational AI agent integrate with my existing CRM and systems?
Yes, provided those systems expose APIs. Integration is what turns the agent from a chat window into something that can look up accounts, raise tickets, book appointments and create leads. The accessibility of your systems is the single biggest factor in build timelines.

Should I build a conversational AI agent in-house or have it built?
Build in-house only if conversational AI will be a permanent internal capability, because the ongoing operation matters more than the initial build. If you want the outcome without maintaining an internal AI team, having it built and run for you is usually the better fit.

If you want help building this

Building AI agents into a mid-market business is what Fiveleaf does.

Bespoke build, fully integrated, continuously optimised. A 30-minute discovery call is enough to tell you honestly whether AI agents fit your team right now, or whether you’re better off waiting six months. No pitch.

About the author

Silviu Major, Founder, Fiveleaf

10+ years building automation systems inside enterprise SaaS, now applying that same operational rigour to AI implementation for mid-market businesses. Writes about what works (and what doesn’t) from inside live deployments, not from the outside looking in.
