Artificial Intelligence and Machine Learning Blogs
MarioDeFelipe

Because I say so. That was easy.

No, seriously: we just closed June, and I was lucky enough to participate in both Sapphire in Orlando and Barcelona, with brilliant discussions around AI. I will summarize and structure my conversations in this blog.

AI Agents (for Dummies)

As already discussed in this community, AI Agents are applications we build on top of an LLM, to which we give instructions and from which we expect an outcome. That's it.

It’s also valuable to highlight what an AI agent is not:

  1. Scripted: agents, by my definition, do not follow a pre-determined sequence of steps or tool calls, because the agent is responsible for choosing the right tool call to make next
  2. A black box: agents can and should show their work, the same way a human would if you delegated tasks to them

Successful AI Agents are gaining traction in the industry, like search agents (Perplexity). In the SAP world, an Agent is built to complete a specific outcome, like a Maintenance Planning Agent.

Agents are the coolest wrappers on top of LLMs these days because, yes, Agents are limited by their model(s). The underlying model(s) you use is the brain of your agent's body. If the model sucks at making decisions, there is no way the agent ends up anywhere good.

In fact, there is a trade-off between overspending our time on AI Agents versus fine-tuning small models for specific tasks.

Back to my topic: if you look for documentation, there is a lot, but for simplicity, Agents are programmed with Planning, Tools, and Memory (I am leaving out Instructions because they are not coded, and I am oversimplifying).
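To make the three coded blocks concrete, here is a minimal sketch of an agent loop. Everything is illustrative: the tools are toy functions, and `plan()` is a hard-coded rule standing in for the LLM that would normally choose the next tool call.

```python
# Minimal sketch of the three coded building blocks of an agent:
# Planning (choose the next tool call), Tools (callable functions),
# Memory (a record of past steps). All names are illustrative.

def get_stock(material: str) -> str:
    """Toy tool: pretend to look up stock for a material."""
    return f"120 units of {material} on hand"

def create_order(material: str) -> str:
    """Toy tool: pretend to create a maintenance order."""
    return f"order created for {material}"

TOOLS = {"get_stock": get_stock, "create_order": create_order}

def plan(goal: str, memory: list) -> tuple:
    """Planning: pick the next tool based on goal and memory.
    Here a hard-coded rule; in a real agent the LLM decides."""
    if not memory:
        return ("get_stock", goal)
    return ("create_order", goal)

def run_agent(goal: str, max_steps: int = 2) -> list:
    memory = []  # short-term memory: the agent's past steps
    for _ in range(max_steps):
        tool_name, arg = plan(goal, memory)
        result = TOOLS[tool_name](arg)
        memory.append((tool_name, result))  # show the work, not a black box
    return memory

for step in run_agent("PUMP-001"):
    print(step)
```

The loop also shows the "not a black box" point above: the returned memory is the agent's visible work trace.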


For each of the blocks described above, there is a lot of coding involved, and that's why I deliberately left out Instructions: Instructions are not coded.

This means that in order to build our agents, we use a Framework.

An AI Framework (like CrewAI, Amazon Bedrock, Microsoft AutoGPT, or SAP Generative AI Hub) provides us with the mechanisms to develop Agents, but if you look at the differences between the Frameworks, we can essentially summarize them in two blocks:

 


 

# 1. The Memory

Back to the origin of this blog, why we should be building our SAP AI agents on BTP and BTP only: Memory and Tools carry fundamental weight in the success of an SAP Agent scenario.

LangChain, LlamaIndex, and CrewAI provide sophisticated memory systems for AI agents, including short-term and long-term memory as well as shared and contextual memory, but all of them are FIFO queues based on Function Calling. Summarizing memory management, there are two options in discussion:

  • Online Memory: indicates whether a solution can dynamically construct the prompt fed to the model in real time, based on the agent's past memory, external knowledge, and the current user prompt. The memory is then inserted into prompts.
  • Offline Memory (Reflection): indicates whether a solution can reflect on an agent's past memory to learn from experiences, distill knowledge, remove unnecessary sentences, etc. This is more complex, as it requires mechanisms to read and write past memory.
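A minimal sketch of the online option, assuming nothing beyond the Python standard library: short-term memory is a FIFO queue of recent turns that gets folded into the prompt at request time, and the LLM call itself is a stub.

```python
from collections import deque

# "Online memory" sketch: a FIFO queue of past turns that is folded
# into the prompt at request time. maxlen gives the FIFO eviction the
# frameworks above implement; everything here is illustrative.

MAX_TURNS = 3
memory = deque(maxlen=2 * MAX_TURNS)  # user + assistant entry per turn

def build_prompt(user_input: str, knowledge: str) -> str:
    """Dynamically construct the prompt from memory + external knowledge."""
    history = "\n".join(f"{role}: {text}" for role, text in memory)
    return f"Context:\n{knowledge}\n\nHistory:\n{history}\n\nUser: {user_input}"

def chat(user_input: str, knowledge: str = "") -> str:
    prompt = build_prompt(user_input, knowledge)
    reply = f"(model reply to: {user_input})"  # stand-in for the LLM call
    memory.append(("user", user_input))
    memory.append(("assistant", reply))
    return prompt
```

Once the deque is full, the oldest turns fall out automatically, which is exactly the FIFO behaviour described above: nothing is learned, only the most recent context survives.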



So as long as the model can read timestamps, it will be OK. Only AutoGPT's memory uses vector databases like Weaviate, Milvus, Redis, or Pinecone for efficient persistent memory retrieval; it saves agent actions for future recall and learning.
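The retrieval idea can be sketched without a real vector database: `embed()` below is a toy stand-in for an embedding model, and only the save-then-recall-by-similarity logic is the point.

```python
import math

# Sketch of vector-based memory retrieval, the pattern AutoGPT uses
# with Weaviate/Milvus/Redis/Pinecone. embed() is a toy stand-in
# (character frequencies) for a real embedding model.

def embed(text: str) -> list:
    """Hypothetical embedding: character-frequency vector over a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - 97] += 1.0
    return vec

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorMemory:
    def __init__(self):
        self.items = []  # (embedding, text) pairs

    def save(self, text: str):
        """Persist an agent action for future recall."""
        self.items.append((embed(text), text))

    def recall(self, query: str, k: int = 1) -> list:
        """Return the k stored memories most similar to the query."""
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]

mem = VectorMemory()
mem.save("checked stock for pump PUMP-001")
mem.save("scheduled maintenance for turbine T-9")
print(mem.recall("pump stock"))
```

A real setup swaps `embed()` for an embedding model and the list for a vector database, but the save/recall contract stays the same.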

But what all the frameworks I presented have in common, including AutoGPT, is that they are standalone frameworks: they don't belong to our Enterprise Application scenarios, which could be the case with Amazon Bedrock or SAP Generative AI Hub.

Amazon Bedrock can leverage the ultra-fast DynamoDB for this, while SAP Generative AI Hub still does not possess memory of its own; we can leverage the langchain.memory classes combined with the HANA Cloud Vector Engine for memory. If you're familiar with word and text embeddings, this kind of memory stores vector representations of the conversation, enabling efficient retrieval of relevant context using vector similarity calculations.

# 2. The Tools

I have described tools quite extensively in the past, so I will talk about some new conversations on the difference between RPA and AI Agents.


In the architecture of agentic AI, the primary building blocks are the agents and the business environments they interact with. Each agent operates [semi-]autonomously, perceiving its environment, reasoning about its circumstances, making decisions, and taking appropriate actions. That makes it really challenging to imagine building our agents in standalone frameworks; they must belong to an Enterprise Platform like BTP.

These agents interact with multiple digital environments to achieve specific goals. Central to this system, as we discussed previously, is the memory: a repository that allows seamless communication and coordination among all agents. This shared memory serves as the hub where information, plans, and goals are exchanged, ensuring that each agent can contribute to and benefit from the collective knowledge and strategies. Equally central are the Data Stores: our data repositories, holding unstructured data like text and images, Vector Stores, structured data, or Knowledge Graphs.
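A minimal sketch of that shared memory as a blackboard, with hypothetical agent names; real frameworks add persistence, locking, and access control on top of this idea.

```python
# Shared memory as the coordination hub between agents, as described
# above. Agent and key names are illustrative.

class SharedMemory:
    """A blackboard all agents read from and write to."""
    def __init__(self):
        self.board = {}

    def post(self, agent: str, key: str, value):
        """An agent publishes information, a plan, or a goal."""
        self.board[key] = {"by": agent, "value": value}

    def read(self, key):
        """Any agent can read what others have contributed."""
        entry = self.board.get(key)
        return entry["value"] if entry else None

shared = SharedMemory()

# A planning agent posts a goal; an execution agent picks it up.
shared.post("planner_agent", "goal", "inspect pump PUMP-001")
goal = shared.read("goal")
shared.post("executor_agent", "result", f"completed: {goal}")

print(shared.read("result"))
```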

 

Building AI Agents: Lessons Learned (so far)

Our journey of building AI agents over the past year has been a roller coaster ride, and we're undoubtedly still in the early stages of this new wave of technology. Here's a brief overview of what I've learned so far.

  • Libraries are useful, but make sure you fully own each call to a model, including what's going in and out. When you offload this to a 3rd-party library like CrewAI, you may lose control.
  • Automating or augmenting human knowledge work with AI agents is a massive opportunity, but building a great agent is not enough. Bringing an agent to production requires a significant investment in a bunch of non-AI components that allow your agent to actually work… this is where you can create competitive differentiation.
  • security: AI agents should only run with the access and control of the user directing them. In practice, this means a lot of OAuth integration and Single Sign-On work. Think about this as you would an app: it will be the end user calling the app.
  • data connectors: AI agents mostly need live data from systems to work. This means integrating with APIs and other connection protocols, frequently for both internal and 3rd-party systems. These integrations need initial build-out and TLC over time, and that's why choosing your Framework is crucial so you don't waste your time. We have wasted a lot of time this year.
  • long-term memory: AI agents by default will only remember the current workflow, up to a maximum amount of tokens. Still today, long-term memory across workflows requires committing information to memory and retrieving it via tool calls or injecting memories into prompts. 
  • Evaluation: Using or building a framework to evaluate your AI agent is necessary but frustrating. Agents are intentionally nondeterministic, meaning that, based on the direction provided, they will look to come up with the best sequence of tool calls available to accomplish their task, reasoning after each step. The AI Engineer must then assess the agent's responses and reasoning process at both design time and runtime. Specifically, developers need to build up an evaluation pipeline, for instance by defining specific scenario-based requirements, metrics, and expected outputs from agents. Given a particular context, the agent evaluator prepares context-specific test cases (either searched from external resources or generated by itself) and performs evaluation on the agent components respectively. Evaluator frameworks like Inspect AI can help here.
  • Fine-tuning can still be a very practical instrument in your toolset. A successful example of this is employing a fine-tuned model to manage specific tool calls made by one agent. Imagine having a model fine-tuned to write SQL queries based on your specific data, in your database. Your agent, powered by a robust reasoning model without any fine-tuning, can use a tool call to signal its intention to execute a SQL query. You can then forward this to a separate task managed by your model that's been fine-tuned on SQL queries for your specific data.
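The routing pattern in the last point can be sketched as follows; both model functions are stubs, not a real API, and the table name is invented for illustration.

```python
# Routing one tool call to a specialist model: a general reasoning
# model signals the intent to run SQL, and the call is forwarded to a
# (here stubbed) model fine-tuned on your schema.

def reasoning_model(task: str) -> dict:
    """General agent model: decides a SQL query is needed and emits a
    tool call instead of writing the SQL itself."""
    return {"tool": "run_sql", "question": task}

def fine_tuned_sql_model(question: str) -> str:
    """Stand-in for a model fine-tuned to write SQL for your data."""
    return f"SELECT * FROM maintenance_orders WHERE note = '{question}'"

def dispatch(tool_call: dict) -> str:
    """Forward the tool call to the specialist model for that tool."""
    if tool_call["tool"] == "run_sql":
        sql = fine_tuned_sql_model(tool_call["question"])
        return sql  # in a real agent, execute against the database here
    raise ValueError(f"unknown tool: {tool_call['tool']}")

call = reasoning_model("overdue maintenance orders")
print(dispatch(call))
```

The reasoning model never has to be good at SQL; it only has to recognize when SQL is needed, which is exactly the division of labor the bullet describes.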