In the blog post Function Calling LLMs for SAP: Structured Outputs and API Calling, @MarioDeFelipe does a great job of explaining the why and how of LLMs' structured output with Function Calling for SAP API integration.
On 6 Aug 2024, OpenAI introduced Structured Outputs in the API with the new gpt-4o-2024-08-06, offering a strict enforcement option so that the model's JSON output complies with a given JSON schema. Weeks earlier, Meta AI released LLaMa 3.1 with multilingual capabilities, long context (128K tokens) and tool usage (function calling). Back in May 2024, Mistral unveiled Mistral v0.3 with function calling support. In this blog post, we will discuss structured output with JSON mode and Function Calling in LLMs, walk through function calling with LLaMa 3.1 and Mistral v0.3 using Ollama (BYOM) in SAP AI Core, and examine some general use case patterns of LLM function calling in the SAP domain.
An LLM's ability to produce structured output from unstructured input is essential for its integration into business applications like SAP. Simply put, unstructured data in real life, such as customer review text, product images and service call audio, is very difficult to process for business applications with formal and explicit rules, which are designed to handle structured data in certain schemas like customers, products and sales orders. With LLMs'/LMMs' capability of understanding and processing such unstructured data, it is now possible to bridge the gap between unstructured and structured data. More importantly, model outputs that adhere to given JSON schemas ensure reliable integration of business applications with LLMs.
In practice, there are several approaches to extracting structured output from unstructured input with LLMs; take OpenAI's GPT-4 for example:
Prompting Alone: Instruct the LLM to output JSON via in-context learning. For example, we have discussed Prompt Engineering for Advanced Text Processing on Customer Messages. However, this doesn't guarantee a valid JSON response every time: in OpenAI's evals of outputs against a complex JSON schema, gpt-4-0613 scores less than 40% (see the sketch after this list).
Structured Output (strict=false)
(Figure: OpenAI's evals of structured output with complex JSON schema)
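To make these approaches concrete, here is a minimal sketch of my own using the OpenAI Python SDK, contrasting prompting alone, JSON mode and Structured Outputs with strict schema enforcement. The example schema is illustrative, not from any SAP sample:

# A minimal sketch contrasting the approaches with the OpenAI Python SDK;
# the example schema below is illustrative.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "Extract the customer name and sentiment from this review in JSON: 'John is unhappy with his order.'"}]

# 1. Prompting alone: JSON is requested in the prompt, but validity is not guaranteed
resp = client.chat.completions.create(model="gpt-4o", messages=messages)

# 2. JSON mode: guarantees syntactically valid JSON, but not conformity to a schema
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    response_format={"type": "json_object"}
)

# 3. Structured Outputs (strict): guarantees the output conforms to the supplied schema
resp = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=messages,
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "customer_sentiment",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "customer": {"type": "string"},
                    "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]}
                },
                "required": ["customer", "sentiment"],
                "additionalProperties": False
            }
        }
    }
)
print(resp.choices[0].message.content)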
Okay, that is Structured Output with OpenAI. How about the other vendors?
Open-Source community
In general, both function calling and JSON mode can be used to produce structured output from LLMs. With the strict option of Structured Outputs recently introduced in the OpenAI API, both can ensure conformity of the JSON output to a desired schema. However, the validity and schema conformity of JSON output may vary across vendors and models. Here are some differences between JSON mode and Function Calling:
With JSON mode, you know exactly what to do with the unstructured input: you define the desired output JSON schema and hand both over to the LLM, which processes the input and produces the structured output for further integration with your business applications. For instance, we can integrate the GPT-4 Chat API with SAP CAP for Advanced Text Processing in Customer Messages with JSON mode, such as sentiment analysis, message summary, and extraction of the entities involved (customer, product, transaction) for further integration with SAP S/4HANA Cloud.
With Function Calling, you can have one of several functions or tools automatically selected by the LLM, where each function has a different purpose and a specified JSON schema for its arguments. This gives extra flexibility in application integration. For example, function calling can be very useful in chatbot development: an intent of a conversation can be well represented as a function call, and no additional corpus training is required for intent identification, since the LLM automatically picks the best-fitting function for the input text based on its description and extracts the structured arguments for further application integration, producing a more contextual and accurate reply in the conversation. Another example is orchestrating process automation with function calls, which can identify and route downstream tasks to different APIs or tools.
In the rest of the blog post, we'll focus on function calling. Since there is already a heap of blog posts and articles about function calling with OpenAI, I will showcase function calling with open-source LLMs through Ollama in SAP AI Core; the same approach applies to the proprietary models in SAP Generative AI Hub.
Next, let's move on to custom function calling in the open-source LLMs, namely LLaMa 3.1 and Mistral v0.3. We'll focus on custom function calls rather than built-in ones. To make it easy, we'll use Ollama as the open-source LLM inference server, together with the basic and popular function calling sample of getting the current weather from a weather API. Ollama supports function calling with its chat API (recommended), its OpenAI-compatible chat completions API, and its completion API in raw mode.
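For reference, the OpenAI-compatible route looks like the sketch below, a minimal example of my own using the OpenAI SDK against Ollama's /v1 endpoint; the rest of this post sticks to the native chat API:

# A minimal sketch of Ollama's OpenAI-compatible chat completions API;
# the tool definition here is a trimmed-down version of the one used below.
from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')  # api_key is required by the SDK but unused by Ollama
resp = client.chat.completions.create(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'What is the weather today in Melbourne, Australia?'}],
    tools=[{
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a location',
            'parameters': {
                'type': 'object',
                'properties': {'location': {'type': 'string'}},
                'required': ['location']
            }
        }
    }]
)
print(resp.choices[0].message.tool_calls)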
Let's take the example of answering the question "What is the weather today in Melbourne, Australia?" with real data. The diagram illustrates the flow between the LLM, a custom weather chatbot as the orchestrator, and the weather API as the service provider.
0. The user asks the chatbot "What is the weather today in Melbourne, Australia?", or any of the many other ways to ask the same question, like "Is it rainy in Melbourne?", "Is it raining in Melbourne?" etc.
# test llama3.1 and mistral v0.3's function calling with ollama
import requests, json

# for ollama in SAP AI Core, please adjust chat_api_endpoint and headers accordingly
chat_api_endpoint = 'http://localhost:11434/api/chat'
headers = {'Content-Type': 'application/json'}
question = "What is the weather today in Melbourne, Australia?"
model = 'llama3.1'  # or 'mistral'
json_data = {
    "model": model,
    "messages": [
        {
            "role": "user",
            "content": question
        }
    ],
    "stream": False,
    "format": "json",  # enable JSON mode to assure a valid json response for the function call
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The location to get the weather for, e.g. San Francisco, CA"
                        },
                        "format": {
                            "type": "string",
                            "description": "The format to return the weather in, e.g. 'celsius' or 'fahrenheit'",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location", "format"]
                }
            }
        }
    ]
}
response = requests.post(url=chat_api_endpoint, headers=headers, json=json_data)
print('Result:', response.text)
The response looks like:
{
    "model": "llama3.1",
    "created_at": "2024-08-01T06:35:31.535917Z",
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_current_weather",
                    "arguments": {
                        "format": "celsius",
                        "location": "Melbourne, Australia"
                    }
                }
            }
        ]
    },
    "done_reason": "stop",
    "done": true,
    "total_duration": 8040050292,
    "load_duration": 5870310667,
    "prompt_eval_count": 139,
    "prompt_eval_duration": 711313000,
    "eval_count": 50,
    "eval_duration": 1456128000
}
We'll parse the result of the custom function call to identify the function as "get_current_weather" and extract its required arguments "location" and "format", then invoke the 3rd-party weather API with the given location and format.
# parse the json response to retrieve the location and format, to be passed to 3rd-party weather API
resp_json = response.json()
func_dict = resp_json['message']['tool_calls'][0]['function']
func = func_dict['name']
args_dict = func_dict['arguments']
location = args_dict['location']
format = args_dict['format']
print('Function:', func)
print('Location:', location)
print('Format:', format)
# service fulfillment by 3rd-party API with the given location and format...
# for example, let's assume the 3rd-party API returns a json weather condition
# like this; we'll instruct the llm to answer the question with this service response
def get_current_weather(location, format):
    # Your actual API call goes here...
    response = { "condition": "Rainy", "temp_h": 15, "temp_l": 7, "temp_unit": "C" }
    return response

service_resp = get_current_weather(location, format)
service_resp_str = json.dumps(service_resp)
It returns the real data in JSON like:
{ "condition": "Rainy", "temp_h": 15, "temp_l": 7, "temp_unit": "C" }
In this final step, we'll instruct LLM to generate the answer to the original question with the API response as context.
# answering the original question with the service response as context
user_msg = """
context: {}
Answer the question with context(weather API response in json) above including weather condition as emoji and temperatures range: {}?Be concise.
""".format(service_resp_str,question)
json_data = {
"model": model,
"messages": [
{
"role": "user",
"content": user_msg
}
],
"stream": False
}
response = requests.post(url=chat_api_endpoint, headers=headers, json=json_data)
resp_json = response.json()
print('Final Response JSON:', resp_json)
The final answer is generated as: "🌧️ Today in Melbourne, it's rainy with a temperature range of 15°C to 7°C"
{'model': 'llama3.1', 'created_at': '2024-08-01T06:53:35.756Z', 'message': {'role': 'assistant', 'content': "️ Today in Melbourne, it's rainy with a temperature range of 15°C to 7°C."}, 'done_reason': 'stop', 'done': True, 'total_duration': 1216658166, 'load_duration': 16187791, 'prompt_eval_count': 75, 'prompt_eval_duration': 432991000, 'eval_count': 25, 'eval_duration': 765913000}
Of course, more steps are required to handle the exceptions in the conversation, and all these extra exceptions could be handled efficiently with the help of the LLM itself.
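For instance, the model may reply directly without selecting any tool, or required arguments may be missing from the user's input. Here is a minimal guard of my own (an assumption of typical exceptions, not part of the original flow) before parsing the tool call:

# A minimal sketch guarding the parse step: the model may answer directly
# without a tool call, or required arguments may be missing.
def parse_tool_call(resp_json):
    message = resp_json.get('message', {})
    tool_calls = message.get('tool_calls')
    if not tool_calls:
        # no function selected: fall back to the model's direct reply,
        # or ask the user a clarifying question
        return None, message.get('content')
    func = tool_calls[0]['function']
    missing = [arg for arg in ('location', 'format') if arg not in func.get('arguments', {})]
    if missing:
        # required arguments missing: prompt the user for the missing details
        return None, "Could you tell me your {}?".format(', '.join(missing))
    return func, None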
JSON mode and Function Calling are very useful for integrating business applications with the processing of unstructured input such as customer reviews, customer messages, service call audio etc.
You may know that SAP has released Joule as a digital assistant across all product lines. However, Joule may not be available to a business's external users, like customers, contingent workers etc.
Now we have seen that function calling with LLMs can be very helpful in chatbot development. Similarly, we can replace the weather question with business questions, and the weather API with APIs to SAP systems such as SAP S/4HANA Cloud, SAP Sales Cloud, SAP Service Cloud etc. In this way, we can complement or extend Joule with custom chatbot scenarios for external users through LLM function calling.
For instance, in a customer self-service chatbot use case, as a customer, you can help yourself with questions like the following (a sketch of matching tool definitions follows the list):
"what is the delivery status of my order 198?",
"what is my account balance?",
"The descale light is solid on my coffee machine with series no xxxxx, what should I do?"
...
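Each of these questions maps naturally to a tool definition. A sketch of what they could look like is below; the function names and parameters are hypothetical illustrations, not actual SAP API signatures:

# Illustrative tool definitions for the self-service intents above;
# function names and parameters are hypothetical, not actual SAP APIs.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_delivery_status",
            "description": "Get the delivery status of a sales order",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {"type": "string", "description": "The sales order number, e.g. 198"}
                },
                "required": ["order_id"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_account_balance",
            "description": "Get the current account balance of the customer",
            "parameters": {"type": "object", "properties": {}}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_troubleshooting_guide",
            "description": "Look up a troubleshooting guide for a product issue",
            "parameters": {
                "type": "object",
                "properties": {
                    "product": {"type": "string", "description": "The product, e.g. coffee machine"},
                    "serial_no": {"type": "string", "description": "The product serial number"},
                    "issue": {"type": "string", "description": "The reported issue, e.g. descale light is solid"}
                },
                "required": ["product", "issue"]
            }
        }
    }
]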
The diagram is just an illustration of the flow; delivery status may need consideration of items and other factors.
Another use case pattern could be using LLM function calling to classify a user question, email, support ticket etc. by its target tool (process or automation), extract the required information as structured output, and route it to the matching downstream automation process or to human intervention.
Let's have a look at an example: an unattended bot in SAP Build Process Automation monitors the email account of customer service as a customer service digital orchestrator.
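A minimal dispatch sketch of how such an orchestrator might route the identified function calls is below; the handler names are hypothetical placeholders, not actual SAP Build Process Automation APIs:

# A hypothetical dispatch table routing LLM-identified function calls to
# downstream automations or to human intervention.
def create_return_order(args):
    pass  # e.g. trigger a return order process via API

def create_support_ticket(args):
    pass  # e.g. create a ticket in SAP Service Cloud

def escalate_to_human(args):
    pass  # e.g. forward the email to a human agent's queue

handlers = {
    'create_return_order': create_return_order,
    'create_support_ticket': create_support_ticket
}

def route(func_name, args):
    # unknown or unsupported intents fall back to human intervention
    return handlers.get(func_name, escalate_to_human)(args)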
Agent frameworks like AutoGen from Microsoft, MetaGPT or crewAI are gaining popularity; they aim to solve complex tasks autonomously and collaboratively with role-based multi-agents. As illustrated in the last diagram above, it is possible to route a downstream task to multiple agents with their roles and responsibilities clearly defined, and let the agents work together towards the final goal. However, it is still very early days for autonomous agents in real business, due to the limited planning and reasoning capability of current LLMs, safety and trust issues around autonomous decisions, and the complexity of business decisions themselves.
In a complex use case with hundreds of function calls involved, it is impractical to send all of them as the tools list to the LLM, so it makes sense to shortlist or filter the function calls down to a few as a pre-processing step before relaying the function calling request to the LLM. Here are several ideas as food for thought for you to explore:
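For example, one such idea, sketched below under my own assumptions (Ollama's embeddings API with an illustrative embedding model), is to embed the function descriptions and pass only the top-k functions most similar to the user question:

# Shortlist candidate functions by embedding similarity before the function calling request.
# In practice, the description embeddings would be pre-computed and stored in a vector store.
import requests

embed_endpoint = 'http://localhost:11434/api/embeddings'

def embed(text):
    resp = requests.post(embed_endpoint, json={"model": "nomic-embed-text", "prompt": text})
    return resp.json()['embedding']

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)

def shortlist(question, tools, k=5):
    q_vec = embed(question)
    scored = [(cosine(q_vec, embed(t['function']['description'])), t) for t in tools]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]  # pass only these as the tools option to the LLM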
As we have seen from the samples, JSON mode and Function Calling are very helpful in turning unstructured inputs (like customer reviews, service tickets, chats, emails, service call audio etc.) into structured output such as JSON compliant with a supplied schema; integrating API calls to business applications or 3rd-party systems; generating chatbot answers with real data from API calls; or invoking a downstream automation task with the extracted structured output. Structured output with JSON mode or function calling is available in most popular LLMs, like the latest gpt-4o from OpenAI, Claude from Anthropic, and Gemini from Google. With Ollama (BYOM) in SAP AI Core, we can now also leverage function calling with open-source LLMs like LLaMa 3.1 and Mistral v0.3.