Generative AI for SAP Part IV. LLMs orchestration ...
I maintain that the future of SAP BTP lies in language models: how they drive code generation, and how we use LLM-based code generation to solve some fundamental challenges. SAP engineers face a long-standing and growing challenge in keeping up with new SAP product, solution, and service releases, how they are renamed, and what makes one service different from another. The cloud adds fuel to the fire (this is not unique to SAP; the world's leading IT shops share it). Back in the day, an engineer could only solve a problem with the tools available to that particular customer. In the cloud, you don't buy but consume, which brings an additional challenge: which solution should I use? LLMs can help here, and how we interact with BTP, AWS, or Azure is on the edge of a complete revolution.
As language models evolve through new versions, they still show limitations; for example, they are unreliable at math calculations. To work around this, we can tell a language model that it should not attempt a calculation itself and should instead call a math program, receive the output, and continue preparing its response.
To appreciate how incredibly fast this is all going: in January 2022, the Chain-of-Thought paper was released. It introduced chain-of-thought prompting, in which the model produces a series of intermediate reasoning steps. This idea helped pave the way for new frameworks like LangChain, released in October 2022.
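A minimal sketch of that delegation, assuming a hypothetical `calculator` tool that the application runs whenever the model asks for it. The function name and the restriction to basic arithmetic operators are illustrative choices, not part of any framework.

```python
import ast
import operator

# Operators the hypothetical calculator tool is allowed to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression and return the result as text."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Anything else (names, calls, attribute access) is rejected.
        raise ValueError("unsupported expression")
    return str(_eval(ast.parse(expression, mode="eval").body))

# The model would emit the expression; the application runs the tool
# and feeds the returned string back into the conversation.
print(calculator("17 * 23 + 4"))
```

Parsing the expression with `ast` instead of calling `eval` keeps the tool limited to arithmetic, which matters when the input comes from a model.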
However, after some experimentation, researchers discovered that if LLMs lack access to the external world and cannot update their knowledge, this can lead to issues like fact hallucination and error propagation.
In October 2022, Yao et al. introduced ReAct, where LLMs generate reasoning traces and task-specific actions in an interleaved manner. With ReAct, the reasoning traces let a model induce, track, and update action plans and handle exceptions, while the actions let it interact with external sources like knowledge bases or environments. This enables LLMs to retrieve additional information, leading to more reliable and factual responses.
To make it more interesting, in May 2023, Tree of Thoughts (ToT), an algorithm that combines Large Language Models (LLMs) with heuristic search, was presented in a paper by Princeton University and Google DeepMind. It appears that this algorithm is the one behind Google's most promising release this fall, Gemini, a multimodal generative AI and the Swiss Army knife of LLMs. Given Google's investment, it is poised to revolutionize the way we understand LLMs, because it conceptually combines this new self-supervised learning with the hierarchical feature processing of deep learning.
A few weeks ago, a research paper was released. The paper was called Consciousness in Artificial Intelligence, and in the abstract itself, the authors included a captivating phrase.
No AI is conscious, but there is no barrier to build it.
How is that possible? Let's talk about ReAct.
ReAct Introduction
Generating reasoning traces allows the model to induce, track, and update action plans and even handle exceptions. The action step allows the model to interface with and gather information from external sources such as knowledge bases or environments.
The ReAct framework can allow LLMs to interact with external tools to retrieve additional information, leading to more reliable and factual responses.
Overall, the authors found that the best approach combines ReAct with chain-of-thought (CoT), which allows the use of both internal knowledge and external information obtained during reasoning.
In 2022, Yao et al. introduced ReAct as a methodology in which LLMs generate reasoning traces and task-specific actions in an interleaved manner.
How ReAct Works
ReAct is inspired by the synergies between "acting" and "reasoning," which allow humans to learn new tasks and to make decisions or reason.
ReAct acts as a bridge between two worlds. On the LLM side of this bridge, there is an enormous ‘black box’: a probabilistic system that understands and generates language. On the code side of the bridge, we have the deterministic functions that software is built upon.
ReAct is a general paradigm that combines reasoning and acting with LLMs. ReAct prompts LLMs to generate verbal reasoning traces and actions for a task. This allows the system to perform dynamic reasoning to create, maintain, and adjust plans for acting while enabling interaction with external environments (e.g., Wikipedia or Google Search) to incorporate additional information into the reasoning.
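The loop described above can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the `Action: tool[input]` text format, the `react_loop` function, and the scripted stand-in for the model are all assumptions made so that the example runs offline.

```python
import re
from typing import Callable, Dict, List

def react_loop(llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]],
               question: str,
               max_steps: int = 5) -> str:
    """Alternate model output (Thought/Action) with tool results (Observation)."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)          # model emits a thought plus an action
        transcript += step + "\n"
        final = re.search(r"Final Answer:\s*(.+)", step)
        if final:
            return final.group(1).strip()
        action = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if action is None:
            continue                    # no parsable action; ask the model again
        name, arg = action.group(1), action.group(2)
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        transcript += f"Observation: {observation}\n"
    return "no answer within step budget"

# Scripted stand-in for a real model (hypothetical replies).
def scripted_llm(replies: List[str]) -> Callable[[str], str]:
    queue = iter(replies)
    return lambda _transcript: next(queue)

llm = scripted_llm([
    "Thought: I need the order status.\nAction: lookup[4711]",
    "Thought: The observation answers the question.\nFinal Answer: Order 4711 is shipped",
])
tools = {"lookup": lambda order_id: f"order {order_id}: shipped"}
answer = react_loop(llm, tools, "What is the status of order 4711?")
print(answer)
```

The transcript that accumulates thoughts, actions, and observations is what makes the result traceable: every tool call and the reasoning that led to it stays visible.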
Based on ReAct, the most modern frameworks like LangChain, LlamaIndex, or Haystack, just to name a few, introduce the concept of Agents.
Agents can be seen as applications powered by LLMs and integrated with tools like search engines, databases, websites, etc.
Within an Agent, the LLM is the reasoning engine that, based on the user input, can plan and execute actions needed to fulfill the request.
Agents rely on one fundamental concept: Tools.
Tools are interfaces that an agent can use to interact with the world. When constructing your own agent, you must provide a list of Tools it can use.
Source: LangChain
In the sequence of events of a ReAct-based Agent, the reasoning traces make the final result of the LLM more interpretable, with references visible at each point of the thought process.
For SAP, we still don't have LangChain, LlamaIndex, or Haystack tools, but we have two options: we can directly call AWS Lambda, which subsequently calls an SAP endpoint (a BTP application or an S/4HANA service exposed via OData), or we can build our own custom tool.
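As a rough sketch of the custom-tool option, the function below builds an OData V2 query URL and wraps it as a tool that takes a single string. The host name is hypothetical, the choice of the sales order service and the `SoldToParty` filter is only an illustrative assumption, and the HTTP call is injected so the example runs without a system behind it.

```python
from typing import Callable
from urllib.parse import quote, urlencode

# Hypothetical host; the path follows the usual S/4HANA OData V2 layout.
BASE_URL = "https://my-s4hana.example.com/sap/opu/odata/sap/API_SALES_ORDER_SRV"

def build_sales_order_url(sold_to_party: str, top: int = 5) -> str:
    """Build an OData V2 query URL for the sales orders of one customer."""
    params = {
        "$filter": f"SoldToParty eq '{sold_to_party}'",
        "$top": str(top),
        "$format": "json",
    }
    # Keep '$' and single quotes readable; encode spaces as %20.
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{BASE_URL}/A_SalesOrder?{query}"

def make_sales_order_tool(fetch: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap the URL builder as a single-string tool; `fetch` performs the call."""
    def tool(sold_to_party: str) -> str:
        return fetch(build_sales_order_url(sold_to_party.strip()))
    return tool

# Offline stand-in for an authenticated HTTP client or an AWS Lambda invocation.
fake_fetch = lambda url: f"GET {url}"
sales_orders = make_sales_order_tool(fake_fetch)
print(sales_orders("1000"))
```

Injecting `fetch` keeps the tool independent of how the endpoint is reached, so the same wrapper could sit in front of a direct call or a Lambda function. A production version would also need authentication and validation of the customer number before it goes into the filter.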
Define our own Tools
Depending on our goal, the prompt template turns the user’s input into a more helpful format. We want the agent to perform a reasoning procedure of the type: process the question, think about what action to take, take that action, reason about the output of the action, and evaluate whether it has the answer or needs to repeat the cycle. For this, we need to inform the agent what each tool is about and when it should use it. If we want to define our own tools, we can simply assume the tool will accept a single query string and return a string output. But if the tool function requires multiple arguments, it is better to use the StructuredTool class or similar. The Agent is the “wrapper” of everything. It is the application with the logic of an LLM and the capabilities of “moving around and doing things” with the tools provided.
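To make the difference between the two kinds of tools concrete, here is a framework-free sketch. `SimpleTool` is a hypothetical stand-in written for this post, not LangChain's `Tool` or `StructuredTool` class, and the two example tools return canned strings.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SimpleTool:
    """Illustrative stand-in for a framework tool."""
    name: str
    description: str            # tells the agent what the tool does and when to use it
    func: Callable[..., str]
    arg_names: List[str]

    def run(self, **kwargs: str) -> str:
        missing = [a for a in self.arg_names if a not in kwargs]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return self.func(**{a: kwargs[a] for a in self.arg_names})

def describe_tools(tools: List[SimpleTool]) -> str:
    """Render the tool list that goes into the agent's prompt."""
    return "\n".join(
        f"{t.name}({', '.join(t.arg_names)}): {t.description}" for t in tools
    )

# Single-string tool: one query in, one string out.
search_notes = SimpleTool(
    name="search_notes",
    description="Look up an SAP Note by keyword. Use for error messages.",
    func=lambda query: f"notes matching '{query}'",
    arg_names=["query"],
)
# Multi-argument tool: the case where a structured tool class fits better.
stock_level = SimpleTool(
    name="stock_level",
    description="Return the stock of a material in a plant.",
    func=lambda material, plant: f"{material}@{plant}: 42 units",
    arg_names=["material", "plant"],
)

print(describe_tools([search_notes, stock_level]))
print(stock_level.run(material="M-01", plant="1010"))
```

The description is the only thing the model sees about a tool, so it carries both the "what" and the "when to use it" mentioned above.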
Tree of Thoughts
Similar to ReAct, ToT addresses a fundamental limitation: the inability of LLMs to deliberate over a response. ToT allows language models to perform deliberate decision-making by considering multiple reasoning paths, self-evaluating choices to decide the next course of action, and looking ahead or backtracking when necessary to make global choices.
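The search idea can be illustrated with a toy problem. This is a simplified breadth-first sketch under stated assumptions: in ToT the proposal and evaluation steps are LLM calls, while here they are deterministic stand-ins, and the function and parameter names are invented for the example.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, List[str]]  # (current value, steps taken so far)

def tree_of_thoughts_bfs(start: int,
                         target: int,
                         propose: Callable[[int], List[Tuple[str, int]]],
                         score: Callable[[int], float],
                         beam_width: int = 2,
                         max_depth: int = 5) -> Optional[List[str]]:
    """Breadth-first search over thoughts, keeping the best `beam_width` per level."""
    frontier: List[State] = [(start, [])]
    for _ in range(max_depth):
        candidates: List[State] = []
        for value, path in frontier:
            for label, new_value in propose(value):   # generate next thoughts
                new_path = path + [label]
                if new_value == target:
                    return new_path
                candidates.append((new_value, new_path))
        # Self-evaluation: rank candidates and prune the weaker branches.
        candidates.sort(key=lambda s: score(s[0]), reverse=True)
        frontier = candidates[:beam_width]
    return None

# Toy task: reach 14 from 1 using the steps "+3" and "*2".
propose = lambda v: [("+3", v + 3), ("*2", v * 2)]
score = lambda v: -abs(14 - v)      # closer to the target is better

path = tree_of_thoughts_bfs(1, 14, propose, score)
print(path)
```

Keeping more than one branch per level is what separates this from a single chain of thought: in the toy task, the branch that scores best after two steps is not the one that reaches the target first.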
'Tree of Thoughts' and 'ReAct' are key concepts used in different capacities under the broader umbrella of Large Language Models (LLMs). 'Tree of Thoughts' structures how the model explores and evaluates alternative reasoning paths, while 'ReAct' determines how the model interleaves reasoning with actions on external tools. They can be easily combined.
With a few Python instructions, the goal is to let the assistant keep track of the structure of its reasoning, with each node in the tree representing an intermediate thought and the nodes connected following the flow of the reasoning.
A PromptTemplate defines the prompt for each Tree of Thoughts step, and an LLMChain runs that step.
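from langchain.chains import LLMChain, SequentialChain
from langchain.llms import Bedrock
from langchain.prompts import PromptTemplate
# Assumption: the model ID is not shown in the original; use any Bedrock text model you can access.
llm = Bedrock(model_id="anthropic.claude-v2")
# One (input variables, output key, prompt) triple per step. Only {input}, {perfect_factors},
# and {think} appear in the original; the other key names are assumed.
steps = [
    (["input", "perfect_factors"], "solutions", """I have a problem related to {input}. Could you brainstorm three distinct solutions? Please consider a variety of factors such as {perfect_factors}
A:
"""),
    (["solutions"], "review", """For each of the three proposed solutions, evaluate their potential. Consider their pros and cons, initial effort needed, implementation difficulty, potential challenges, and the expected outcomes. Assign a probability of success and a confidence level to each option based on these factors
{solutions}
"""),
    (["review"], "think", """For each solution, deepen the thought process. Generate potential scenarios, strategies for implementation, any necessary partnerships or resources, and how potential obstacles might be overcome. Also, consider any potential unexpected outcomes and how they might be handled.
{review}
"""),
    (["think"], "ranked_solutions", """Based on the evaluations and scenarios, rank the solutions in order of promise. Provide a justification for each ranking and offer any final thoughts or considerations for each solution
{think}
"""),
]
# Each chain stores its output under output_key so the next step can read it.
chains = [LLMChain(llm=llm, prompt=PromptTemplate(input_variables=variables, template=template), output_key=key)
          for variables, key, template in steps]
overall_chain = SequentialChain(chains=chains, input_variables=["input", "perfect_factors"], output_variables=["ranked_solutions"], verbose=True)
print(overall_chain({"input":"Improve my user networks", "perfect_factors":"I need to improve the user to machine network based on the most effective technology available to connect to my AWS tenant"}))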
And here are all the ranks, without being boring.
Conclusions
In SAP, there have always existed many ways to solve one problem. The growing number of existing services in cloud frameworks like BTP is adding some complexity that requires specialized Solution Architects to design a solution that could solve a particular problem.
LLMs have introduced a new capability: they can orchestrate which decision to take next, which tool to use, and how to interact with that tool to get the answer the problem needs. Solving tasks becomes significantly easier with ReAct because a human can steer the model by editing only a few thoughts, enabling new forms of human-machine collaboration.
In this blog, I presented ReAct, a simple yet effective method for synergizing reasoning and acting in language models. In the paper's experiments on multi-hop question answering, fact-checking, and interactive decision-making tasks, ReAct leads to superior performance with interpretable decision traces. The ReAct paper was the foundational idea behind Agents: an LLM acting as the decision-maker, which is a revolutionary piece of modern NLP interfaces like LangChain, AWS Bedrock, or Hugging Face.
With Agents, we are entering a new phase of LLMs. Agents are getting handy thanks to their integrability with external tools and sources and their ability to reason step by step to solve complex tasks. Agents provide an “acting” capability to Large Language Models, paving the way to a new wave of use cases and innovative applications.
After ReAct, a team of researchers from Princeton University and Google DeepMind introduced Tree of Thoughts, a new framework designed to enhance the inference capabilities of language models by enabling these models to explore coherent units of text, referred to as “thoughts,” which serve as intermediate steps towards problem-solving.
The introduction of the ToT framework has significantly enhanced the problem-solving abilities of language models, which recent research has shown to fall short in search and planning capabilities. Language models with the ToT framework demonstrated superior performance on three novel tasks requiring non-trivial planning or search. I also introduced how ReAct and ToT can be implemented in Python.
As several Models will need to be combined to find a suitable solution, I believe that introducing Agents, Tools, Chains, and Tasks in frameworks like BTP could dramatically enhance our capabilities as Solution Architects, Developers, and Engineers in code generation and problem-solving.