How to get your agents into production

February 13, 2026

Executive Summary

We've been taking agents to production for about two years now. Some of our agent experiments fail, while others generate incredible ROI. You've probably heard that many AI agent experiments fail elsewhere as well. In this guide, I'll cover what we've learned in the field over those two years, the misguided approaches we see from CTOs, CEOs and CIOs, and how you can successfully position agents within your workflows. I'll illustrate these concepts with examples from agentic transformations currently running in production within enterprise operational processes, explained in a simple yet comparative and detailed manner and supported by visuals. What I'm about to share applies to agentic transformations in enterprise operations teams. The agent world is truly vast, and some of what I say may not fully reflect the agentic situation at a startup.

First: How Should AI Agents Be Positioned in Enterprises and What is the Human Role?

AI Agents will eventually be integrated into every company's processes. However, we will see them most prominently within the internal processes of large enterprises. While a startup will generate savings from agents, it will never match the scale of savings an enterprise can achieve. The reason lies in the current performance levels of agents.

Agent capabilities regarding context management and memory are not yet sufficiently strong. Their current limitations in context management and memory prevent them from handling multiple distinct tasks simultaneously or managing dynamic responsibilities effectively.
Agents have already begun transforming repetitive back-office functions in enterprises. These are typically repetitive, highly defined tasks where every step is determined by clear guidelines. When assigned such tasks, agents can complete them successfully. However, when considering an SME or a startup, rarely does anyone have a single, fixed role.

Everyone is often required to handle multiple responsibilities at once. This demands extensive context and information management, such as:

  • What is the latest status in our company?

  • Has there been a change in our product?

  • Have we been able to collect payment from the client?


Roughly 99% of the operations present in a large company do not yet exist in a small one. Because agents cannot yet manage context and information as effectively as humans, they cannot perform independent, constantly changing, dynamic tasks at the desired level. Even if they attempt them, the quality of the output does not satisfy human standards.
If you assign agents repetitive work with clearly defined steps, you can achieve outputs close to the desired quality. Is it as good as a human? No, but it is sufficient for the completion of the task.

So, will agents perform these stable tasks fully autonomously? No.
In enterprise workflows, human error rates are very low. Agents, however, do not yet possess the same success rate; they cannot generate outputs as consistently and reliably as humans. For this reason, humans will serve as the operators of agents. Agents will be positioned not as employees within the company, but as technologies that enhance the end-to-end efficiency of teams.

This means that humans, not the agents, will be responsible for what the agents produce. For example, when a sales team member uses an agent to prepare a price quotation file for a client, they must verify that the price is written correctly and that critical sections are adjusted properly. Just as finance teams double-check their work, outputs in all our agentic workflows will be subject to human verification.

There is a significant amount of "noise" surrounding AI Agents. From our work deploying agents to production in enterprises over the last two years, we see that expectations for AI are incredibly high. Due to this noise, expectations are often exaggerated and the technology is not positioned correctly.

We believe agents are a major shift, providing an efficiency increase unseen since the Industrial Revolution. For enterprises, there is significant efficiency and competitive advantage to be gained. However, current expectations are like expecting the first cars produced after the Industrial Revolution to be instantly perfect: people expect agents to work flawlessly, as if by magic.

The idea that agents will completely replace teams in enterprises, that work will be completed without humans, or that fully automated structures will emerge does not seem feasible in the near future. We do not consider approaches such as "agents evaluating other agents' outputs" or "agents forming their own teams" to be correct or reliable for full autonomy yet.
We know that trials of fully autonomous agents - especially those using small and medium-sized local models - will not be successful. Currently, while working with fintechs at Upsonic, we have to use local models due to data privacy. However, local models fall short in complex tasks that require large model capabilities, such as agent teaming.

Expectations of fully replacing teams and achieving total automation are illusions driven by hype. The modern marketing world loves "overpromised claims." The reality is that agents will not eliminate teams entirely, but they will enable repetitive tasks in your enterprise workflows to be performed by 3 people instead of 10.

This may sound like a minor detail, but it represents an incredible increase in efficiency. Humanity last saw such a leap during the Industrial Revolution. Agents offer this opportunity for a second time but without eliminating the role of the human.
Human intelligence, approach, creativity and the ability to synthesize independent concepts remain unique. In short: in enterprises, humans will operate agents to multiply their efficiency.


What Is AI Agent Orientation?

In the world of Human Resources, orientation programs are considered critical processes for ensuring new employees adapt to the organization. The goal of this process is to enable teams to deliver successful outputs within the company's workflows in the shortest possible time.
We must design similar processes for AI agents, involving technical teams and domain experts. This is essential because every company's workflow is unique.

Even in the SaaS landscape, enterprises have always demanded custom solutions for their specific workflows. They will expect this same standard from vertical agents designed for enterprise operations. Due to the subtle nuances and localization requirements inherent in their processes, enterprises will require even greater customization in agentic solutions.
To better understand Agent Orientation, I consider this analogy to be highly accurate:
Think of an employee transferring from a competitor or arriving from a different geography. The speed at which they can begin their assigned tasks and the quality of the output they produce is directly comparable to how successfully an external agent solution will deliver results within your company.


What is the AI agent orientation and integration process?

AI Agent orientation and integration is the process of transferring specific knowledge to the agents we develop, ensuring they operate successfully within our workflows. This involves teaching them how tasks are executed, what requires specific attention, and the criteria for successful completion.


Currently, when a new member joins our teams, they undergo a standard orientation and onboarding program. Through this program:
  1. The details of how departmental work is performed are conveyed.

  2. The methods for completing tasks and the specific points requiring attention at each step are demonstrated.

  3. Prohibited actions ("what not to do") and the correct response protocols for negative scenarios or exceptions are explained.

Without this program, a human employee’s output would lack sufficient quality, and their communication style would fail to reflect the company’s tone and objectives. AI Agents require this exact same orientation process.

During integration, technical teams must collaborate closely with the Team Lead of the department where the agent is being deployed. They need to explicitly transfer knowledge on how work is done, specific attention points at every step, and protocols for handling exceptions. Without this detailed transfer, agents cannot be successfully taken to production.
Furthermore, even after deployment, minor optimizations are often required. Ultimately, the process of taking AI agents to production is more complex than typically assumed.

What to teach agents during orientation and integration?

Workflows and Standards

Detailed Operating Procedures: Just as documents are processed in human workflows, agents must be explicitly taught the step-by-step processing actions and critical attention points for every document. Without these details, agents will generate inconsistent results.

  • Quality Control Criteria: Success metrics and quality standards must be clearly defined for each document type.

  • Exceptions and Edge Cases: Instructions on how the agent should output data when encountering unusual or non-standard scenarios.

  • Error Scenarios: Agents must be versed in common error states and the standard responses required for each.
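To make this concrete, the orientation knowledge above can be written down as a machine-readable task specification rather than left in people's heads. The sketch below is a minimal, hypothetical Python structure; all field names and the court-document example are illustrative, not part of any specific product:

```python
from dataclasses import dataclass, field

@dataclass
class DocumentTaskSpec:
    """Orientation knowledge for one document type, codified explicitly.

    Hypothetical structure: the point is that procedures, quality criteria,
    edge cases, and error responses are written down, not left implicit.
    """
    document_type: str
    procedure: list            # step-by-step operating procedure
    quality_criteria: list     # success metrics for the output
    edge_cases: dict = field(default_factory=dict)       # scenario -> instruction
    error_responses: dict = field(default_factory=dict)  # error state -> response

# Illustrative example for a court lien request
lien_spec = DocumentTaskSpec(
    document_type="court_lien_request",
    procedure=[
        "Extract debtor identity and case number",
        "Verify the request is addressed to our institution",
        "Prepare the lien action payload for the core banking API",
    ],
    quality_criteria=[
        "Case number matches the source document exactly",
        "No action prepared when the document is information-only",
    ],
    edge_cases={"multiple_debtors": "Split into one task per debtor"},
    error_responses={"unreadable_attachment": "Route to human review"},
)
```

The value of this shape is that the procedure, the quality bar, and the exception handling for each document type can be reviewed by the domain expert and versioned alongside the code, exactly like an onboarding document.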


Just like humans, AI agents require detailed guidance, clear instructions, and comprehensive context to succeed. An orientation process designed through the collaboration of technical teams and domain experts ensures that AI agents operate reliably, efficiently, and consistently in production.
Without this systematic approach, agents will exhibit variable performance. When developing software, it is essential to carefully consider the user experience, usability and strategic positioning in detail.


Now, let’s take a look at agents developed using different approaches.

Let’s examine both the successful and failed designs of a billion-dollar agentic workflow

First, let’s analyze this specific use case, which is common in most financial institutions.

Case: Incoming Court Requests and Notifications

Banks and fintech companies frequently receive official requests and notifications from courts. These documents typically concern legal processes such as asset attachment, divorce proceedings, or requests for financial information. In these instances, courts send requests or petitions to enterprise institutions regarding their clients' data.

There are 3 common scenarios encountered in these notifications:

  1. Corporate Debt and Attachment: Due to Firm X’s debt to its creditors, the court requests an attachment (lien) be placed on their accounts, a report on the current account balance, or a temporary freeze on the funds.

  2. Divorce and Asset Division: During the divorce proceedings of Persons X and Y, a precautionary measure (legal restriction) is placed on the parties' bank accounts to ensure the proper division of assets.

  3. Information Requests: A request to provide information regarding Person Z’s bank accounts, e-money accounts, or wallets due to a lawsuit filed against them.

In this use case, incoming emails arrive with .zip attachments. Therefore, the files must be downloaded, extracted, and processed to execute the necessary actions.

We will now highlight the differences by examining this workflow through two lenses: a successful implementation and a failed agentic scenario.

How can we achieve near-perfect accuracy in this workflow using AI agents?

  1. Categorization: It is necessary to categorize and separate every type of writ and lawsuit received from the court.

  2. Action vs. Information: In some cases, courts request information only, while in others they require specific actions (e.g., placing a lien). These existing scenarios and document types must be broken down into smaller components.

  3. Internal Process Mapping: The specific internal actions taken by the Bank for each distinct case type must be identified and mapped.

  4. API Access: To enable agents to respond to court writs, they must be granted API access to the Core Banking applications or third-party tools currently used by human staff.

  5. Response Structure Analysis & Codification: The structure of responses sent to the court was analyzed. We explicitly coded detailed instructions for the agents regarding which email template to use and how to edit it for each specific scenario. A distinct agent task was executed for each situation, assigned sequentially in stages. Separate structures were established for each individual workflow.

  6. Email Integration: An integration must be developed to receive emails sent by the court into your system. We built a dedicated email integration for this purpose.

  7. Prioritizing User Experience (UX): Attention was paid to UX at every step. Incoming .zip files were automatically downloaded and text was extracted. The user expended zero effort retrieving files from the court. This specific task was solved via backend integration, not by the agent!

  8. Execution via API: The actions to be taken -whether placing a lien, providing information, or preparing specific data sets- were completed by making requests to the banking application. If an API did not exist, the bank's internal team developed one.

  9. Human-in-the-Loop Experience: The results of the agents' operations were presented within a user interface designed for human control: allowing for Approval, Rejection, or Retry. Alternatively, the system allowed the human to execute the action personally. Essentially, humans can take full manual control whenever necessary.

  10. Software First: We did not use agents in any scenario where standard software would suffice. MCP (Model Context Protocol) was not used for integrations.

Now, let’s apply the "AI Hype" playbook to deliberately compromise this workflow. We will break the user experience and demonstrate how to make agents completely unusable.

The success of this case will be compromised if the following critical errors are committed

  • Limited Scope: If incoming court writs are merely classified without executing the subsequent workflow steps such as placing the attachment, then only a limited portion of the use case has been transformed.

  • Unnecessary Use of MCP: Instead of writing a direct integration to the Core Banking Application’s APIs for the agent's output, we choose to use MCP. By introducing an LLM in place of code that would operate with 100% certainty, we have effectively sacrificed certainty.

  • Overloading the Agent: Instead of executing the workflow stages step-by-step, we ask the agent to complete the entire process in a single run. This means assigning classification, case detail extraction, and action execution to the agent all at once. Because the agent cannot maintain focus on so many distinct tasks simultaneously, the probability of hallucination is significantly increased.

Important Note

Horizontal LLM applications are often prioritized over vertical ones. You might use a generic classification tool or a document-parsing solution like LlamaIndex. While these agentic tools complete a specific portion of the work, they lack specialization.
Ultimately, instead of achieving a 100% agentic transformation, you achieve only a limited, partial one. This is because you have designed only a specific segment, rather than an end-to-end process.

To illustrate this within our case: After the classification step, we could have triggered requests to the banking application to retrieve the necessary data or execute actions. Instead, we merely analyzed the content of the downloaded .zip file and classified it. Effectively, we limited the scope to classification only. Furthermore, no actual operations, such as retrieving the requested information or executing the lien attachment, were performed using the Core Banking Application.

As a result of these strategic missteps, this billion-dollar use case is destined to fail. This excellent agentic initiative will be cancelled. Here is why:

  1. Agents were used in places where standard software would have sufficed.

  2. Because no requests were sent to the Core Banking APIs, only 30% of the task was completed. We performed only classification; we did not design an end-to-end process.

  3. MCP was used unnecessarily.

  4. The User Experience (UX) of the human operating the agents was completely ignored.

  5. Since the tools used by humans were not provided to the agents, the agents could not perform the full range of tasks that humans do.

Ultimately, a highly viable agentic use case failed because it was poorly designed. Consequently, Board Members and C-Level executives were led to believe that agents are merely "hype" and cannot be successfully deployed to production.

Who should design Agents

Agentic transformation must be led by a Senior Developer or Senior Data Scientist with a genuine interest in agents. This individual requires a combination of experience and curiosity; if either trait is missing, your success rate will diminish. Furthermore, they must have a proven track record of solving real-world problems through coding.
However, many companies assign agent-related tasks and experiments to interns or junior developers. This stems from a reluctance among senior developers to learn about agents, a belief that it is merely "AI hype," and an underlying fear of failure.
Yet, rest assured: only a fraction of agentic transformation is strictly about LLMs and agents. The vast majority requires software architecture knowledge, development expertise, and the fundamental ability to solve real-world problems through code.

The Experience Delivered by AI Agents

I will not delve into the details of AI Agent UX here. However, those responsible for developing and deploying agents must ensure that the agentic transformation does not add friction to existing workflows.
Yet, in many real-world implementations labeled as "agentic transformation," we observe that assigning the task to an agent actually complicates the process. Companies often introduce steps the user would not normally take, force the adoption of tools unnecessary to the natural flow and ultimately extend the cycle. There are numerous examples of this counter-productive approach.

Our software applications already possess an established User Experience (UX). However, regarding Agent UX, the rule is simple: as tasks are automated, the processes must simultaneously become shorter.

If the current workflow lengthens the process for humans - for instance, requiring them to manually send downloaded .zip files to a central repository and then wait for results to take action - this experience must be automated.
In such transformations, standard software must be utilized alongside agents where appropriate. For example, a dedicated email integration should automatically download .zip files, extract the content, and feed it as text to the LLM.
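As a minimal sketch of that email-integration step, plain standard-library code can unzip the attachment in memory and hand the LLM clean text, with no user involvement. This is illustrative only; real court documents would also require PDF parsing or OCR, which is omitted here:

```python
import io
import zipfile

def extract_attachment_text(zip_bytes: bytes) -> str:
    """Deterministic pre-processing: unzip the court attachment in memory
    and concatenate its text files, so the LLM receives plain text and the
    user never touches the .zip. (PDF/OCR handling is omitted.)"""
    parts = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".txt"):
                parts.append(zf.read(name).decode("utf-8", errors="replace"))
    return "\n".join(parts)
```

Because this step is deterministic code, it either works or raises an error that can be retried; no LLM uncertainty is introduced where none is needed.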

If a human currently performs a task in 5 steps, the agent must reduce it to 3.
Enterprises often mistakenly believe that agent design relies solely on model performance. However, the reality is that alongside model capabilities, Architecture, Agent UX and UI are equally critical components. Vendors and developers often skip the correct implementation process because they want to avoid delaying the launch or slowing down the purchase.

Successful implementation requires:

  1. deeply understanding how a human executes the task,

  2. designing custom workflows tailored to each enterprise,

  3. granting agents access to the APIs humans currently use,

  4. establishing the correct UX and architecture,

  5. and selecting the appropriately sized LLM.

Finally, a common error I observe in the field regarding Agent Experience is the lack of diligence applied when agents are built for teams that the company considers "low priority." Every implementation demands the same level of architectural care.

The Relationship Between Digital Transformation and AI Agents


Enterprises have been striving to achieve digital transformation for years. This is a profound and challenging journey. Digital transformation constitutes the foundational infrastructure for agentic transformation. Consequently, when agentic transformation is attempted in sectors or regions where digital maturity is low, the success rate inevitably declines.

Digital transformation is an essential component of agentic transformation. When transforming a specific workflow into an agentic one, it is mandatory to allocate time to address digital transformation gaps within that workflow.

This is because only 30% of agentic transformation is related to LLMs. The remaining 70% revolves around digital transformation, software architecture, user experience (UX) and identifying the correct agentic use case. In fact, companies often end up executing their core digital transformation under the label of "agentic transformation."

Furthermore, the slow pace of decision-making within enterprises exerts a negative impact. Developing the agents themselves does not actually require significant time compared to the timeline of a standard software project. However, internal decision cycles, document delivery, testing phases, team availability and holiday schedules significantly extend the time-to-production for agents.

If, during this prolonged process, your company or consultancy selects the wrong agentic use case -assigning a task to an agent that it should not be performing- the initiative typically ends in failure, dismissed with the narrative that "agents are just hype."

During Agentic Transformation, Agents Should Not Be Used Where Standard Software is Sufficient

A common error is frequently observed in companies undergoing agentic transformation. Regardless of seniority, practitioners commit the same mistake: attempting to use agents for tasks that should be completed via standard coding and software engineering, tasks that fundamentally belong to the realm of digital transformation and do not require LLMs.

There are two primary reasons for this:

  1. The "Magic Box" Perception: They view LLMs as a magic box. The media has constantly pushed this narrative for the past two years, leading people to expect LLMs to handle absolutely everything.

  2. The Difficulty of Enterprise Transformation: Executing transformation within an enterprise is very difficult. Companies expect agents to complete all necessary steps on their behalf to avoid internal friction. When API access, firewall clearances, or new API development is required within a corporation, obtaining permissions can take days, while actual development can take months. Consequently, companies want agents to handle the entire process autonomously to bypass these internal operational burdens.

What specific problems arise when we replace standard software with Agents or MCP?

Software is deterministic. The action intended for the next step is executed with 100% certainty. We do not encounter surprises or unexpected outcomes.
However, LLMs are not deterministic. They resemble humans more than machines, which is precisely why they are such a significant technology. But this necessitates that we strictly define exactly what we expect to happen next: the specific expected output. We simply do not possess the 100% certainty with LLMs that we enjoy with standard software.
AI agents are inherently more fragile structures compared to traditional software.
When we choose to use an AI agent instead of standard software, we are effectively sacrificing reliability. AI agents inevitably commit errors during operation; by using them where they are not needed, you merely cause these error rates to compound further.

Consider a workflow consisting of 7 distinct steps. Assume that 4 of these steps can be solved using standard software, while the remaining 3 steps require AI agents.
Recall the previous example regarding emails sent from courts to the bank. I require an email integration. While I could use MCP (Model Context Protocol) to fetch these emails, I also have the option to develop a dedicated, hard-coded email integration.

  • The Software Route: Developing a custom email integration is more labor-intensive; however, it provides a guarantee that the workflow will not break. I can establish strong and reliable architecture.

  • The MCP/Agent Route: When utilizing MCP for this task, there is a probability that requests may occasionally fail or time out. Consequently, critical transactions might be overlooked. If the court does not receive the requested response regarding an individual or institution, the bank could face significant legal fines.


The Conclusion: If I employ agents for the 4 steps that should be handled by software, the success probability of the entire case diminishes.


However, by prioritizing standard software for these 4 steps, I ensure the certainty of those outputs. This leaves only the 3 steps that genuinely require reasoning which is the true domain of agents. I can focus my resources on designing the agents for these specific steps with much greater depth and precision.
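The compounding effect can be shown with back-of-the-envelope arithmetic. The 95% per-step success rate below is purely illustrative, not a measured benchmark; the point is how multiplication punishes every unnecessary non-deterministic step:

```python
# Toy reliability model for the 7-step workflow above.
# Assumption: deterministic software steps succeed with probability 1.0,
# each agent step with ~0.95 (an illustrative figure, not a benchmark).

def pipeline_success(step_probs):
    """End-to-end success probability of a sequential pipeline."""
    p = 1.0
    for prob in step_probs:
        p *= prob
    return p

all_agent = pipeline_success([0.95] * 7)           # agents used for every step
hybrid = pipeline_success([1.0] * 4 + [0.95] * 3)  # software for the 4 stable steps

print(f"all-agent pipeline: {all_agent:.1%}")   # ≈ 69.8%
print(f"hybrid pipeline:    {hybrid:.1%}")      # ≈ 85.7%
```

Moving four of the seven steps to deterministic software lifts end-to-end reliability from roughly 70% to roughly 86% in this toy model, before any improvement to the agents themselves.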

To approach this from a different perspective: You do not expect humans to perform robotic tasks that should be handled by machines. We must view the capabilities of Agents through the same lens as we view human capabilities. This is why solving problems with standard software remains so valuable. Agents are not "Great Software + Reasoning" structures; they are strictly "Reasoning" engines.
Local Models and Privacy

There is another crucial factor: Local Model Usage. If data privacy regulations require your company to use local models (on-premise or private cloud), you are effectively forced to solve a larger portion of the problem using standard software. When designing agents for local deployment, you must define the inputs and outputs with extreme precision.
Since we work with financial institutions, we are often mandated to use local models. This constraint forces us to design the workflow steps with much higher specificity to compensate for the lack of "giant model" flexibility.
We observe that integrating agents into existing workflows is not a trivial task. Achieving success within the inertia and sluggishness of an Enterprise is exceptionally difficult. When companies attempt to focus on areas outside their core business, they inevitably struggle to manage these initiatives effectively.

How Much Efficiency Do The Right Agentic Use Cases Deliver?

We have discussed at length the difficulties of deploying agents to production. But once agents are successfully in production, how significant is the ROI?

The magnitude of the ROI varies depending on the size of the company and the team. Every use case presents a different ROI scenario.

However, consider our current Client Onboarding Agents: Because the results can be validated against official APIs, the agents handle data verification, intelligence gathering and preparation for final action (tasks that typically require a 10-person team) with complete autonomy.

  • The Result: A process requiring 10 people is now completed by 2 people.

  • The Key Value: Most importantly, the process becomes fully scalable.


Similarly, in the banking use case mentioned earlier, the workload is dramatically reduced when agents are utilized.


While the specific efficiency gain varies by case, one fact remains absolute: humanity has not encountered a technology that drives this level of efficiency increase since the Industrial Revolution. In the right use cases, agents generate incredible ROI.

Beyond simple efficiency, the true impact emerges in Competitive Advantage. Successful agent adoption allows an enterprise to redirect 30% of its capacity toward critical tasks that demand creativity and strategic thinking.

The Bottom Line: Companies currently using our solution can handle client onboarding with 2-3 people in a fully scalable manner. A company not using agents is forced to perform the same work with 10-20 people.

The work that we accomplish using Agents -with greater speed, lower costs, higher accuracy- is performed by our client’s competitors over longer timelines, at higher expense and with significantly more operational chaos.

The gap that companies using Agents are opening up against their rivals is really massive. You simply cannot survive against this level of competition. This is precisely why Boards of Directors are demanding an immediate acceleration of Agent-related initiatives.

FOR CIOs

Firstly, why do I specifically address CIOs? Because CIOs typically exist within very large enterprises; mid-sized companies rarely hold this specific C-level title. And it is precisely within these Enterprises where Agents generate the highest ROI.

I observe that many CIOs in the ecosystem do not believe in agents. The root cause is the excessive noise generated by the "AI Hype." On the topic of noise, I actually agree with them. Developers are highly skilled and well-equipped individuals; they have dedicated years to technological transformation, solved countless business problems, and built the technological infrastructure of the world we live in today.

My favorite trait of developers is their built-in "Bullshit Detector": they are instinctively cautious when they encounter something that doesn't add up. Due to the extreme claims and inflated expectations surrounding AI, CIOs are led to believe that agents are destined to fail. (Note: This is not based on academic research, but rather on my direct observations of attitudes within the companies I interact with.)

However, the reality lies somewhere in the middle: Agents are not yet successful enough to render humanity unemployed or make work obsolete, as the extreme hype suggests. But neither are they as incompetent or unusable as some CIOs fear.

In enterprises, particularly in operational processes and repetitive branch tasks, they deliver massive ROI. If you are in a company with over 1,000 employees, you can transform at least 20% of your workforce using the current capacity of Artificial Intelligence.

I see a similar issue among Senior Developers in the ecosystem. There are still developers who have never tried Cursor or written code with AI assistance. You may choose not to adopt it permanently, but you owe it to yourself to experiment with such technology for 1-2 weeks.

AI is currently being severely underestimated. The truth is, when you identify the right agentic use cases, they provide incredible ROI and a disruptive competitive advantage. We have personally experienced and proven this in numerous use cases we have taken to production.

Committees Established for Agentic Transformation

As Clausewitz famously observed: "The further the General is removed from the front, the more he is forced to see things through the eyes of others, thereby increasing the risk of error."
Since the field of AI Agents is so new, neither current executives nor existing teams have undergone formal AI training.
We do not yet possess definitive knowledge regarding AI best practices, the correct methodologies for designing agents, or what the standard approaches should be.
This know-how is currently being forged. Within the next 2 years, the collective body of knowledge, the methodologies, and the clear distinction between what is truly "agentic" and what is not will be fully established. Subsequently, this know-how will gradually begin to become standard.

Early-Stage Field and Frequent Mismanagement

Because the field is so new, everyone is acting based on the words of others.
With so much noise in the ecosystem, middle managers are confused. Senior executives simply demand, "I want an agent like this," without understanding the implications. This places the full responsibility on middle management, complicating the execution process significantly.
We often observe in meetings that managers lack even the most fundamental knowledge regarding AI agents. This is not their fault; it is a result of the field's novelty. We are all still learning.

However, the manager's single biggest error is failing to conduct their own research and instead delegating the entire process to a junior or mid-level employee. The institution ultimately pays a heavy price for this mistake.

The number of developers in new AI Committees is often insufficient. Yet, just as developers drove digital transformation in the past, they are essential for executing AI adoption today.

A Senior Developer with practical experience in building agents is an invaluable asset to the company.

You should have Agents Perform Human Work Exactly as Humans Do It

In tasks that humans do not currently perform, or where the boundaries are not clearly defined, the success rate of agents will inevitably be low.

During development, the surrounding noise often pushes companies to assign random or unsuitable tasks to agents. Stakeholders expect agents to handle tasks that exceed human capabilities or tasks that are so complex and unstructured that no human is willing to undertake them.

The correct approach when creating an agentic workflow is to replicate exactly how a human currently performs the task. You must mirror every step and every method a human uses to complete that assignment. All these specific details and process steps must be transferred into the agentic products.

Agents are not superior to or more successful than humans. If you cannot execute a task with humans, you cannot execute it with agents.

However, agents can complete stable, repetitive tasks that require human intelligence with incredibly high efficiency.


















Any question? Talk with us.

Ready to explore how we can transform your operations together? Schedule a 30-minute discovery call today.
