In my previous posts, I explored why enterprise AI needs a "reality check" through Secret Agent Shoppers and how the role of SAP Functional Consultants is shifting in a world where system configuration is no longer strictly deterministic.
Today, I want to get a bit more practical. How do we actually validate an SAP AI Agent to ensure it’s doing what it says it’s doing? Specifically, we will look at how to automate this validation using Int4 Suite.
Figure 1 - Int4 Suite validating SAP Joule Agent
Let’s imagine an AI Agent designed to handle Sales Orders. The workflow looks like this:
The Trigger: A customer sends an email asking to place an order.
The Agent: SAP’s AI Agent receives the email, interprets the intent, extracts the data, and posts the Sales Order into SAP S/4HANA.
The Completion: The customer receives a notification that their order has been processed.
On the surface, this looks great. But in an enterprise environment, "looks great" isn't enough. We need to validate two specific areas:
Soft Rules: Was the agent’s response professional? Did it include the order number? Was the language correct?
DB Validations: Did the agent actually post the data correctly in S/4HANA? We need to compare the database entries (customer numbers, material numbers, quantities) against a reference document to ensure the AI didn't hallucinate a value or miss a field.
Here is how you set up this automated "reality check" within Int4 Suite.
First, we create a new automation object. This serves as our "Secret Shopper," simulating the customer by sending the initial email to the SAP AI Agent.
Figure 2 - Int4 Suite Automation Object for sending the email and validating the work of the AI Agent
Next, we update the object with the email content. This is where we define our "Soft Rules." The beauty of Int4 Suite is that these rules can be written in natural language. There is no coding required to tell the system what a "good" AI response should look like.
Figure 3 - Soft Validation rules within Int4 Suite
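To make the idea of a soft rule more concrete, here is a minimal Python sketch of the simplest checks from the list above, such as "did the reply include the order number?". This is not how Int4 Suite evaluates its natural-language rules; it is only a deterministic stand-in for illustration. The function name `check_reply`, the regular expressions, and the assumed order number format (up to 10 digits following the word "order") are all hypothetical.

```python
import re

# Assumption: the reply mentions the order as "order 4711", "order number 4711",
# "order no. 4711" or "order #4711", with at most 10 digits.
ORDER_NUMBER = re.compile(
    r"\border\s*(?:number|no\.?|#)?\s*:?\s*(\d{1,10})\b", re.IGNORECASE
)


def check_reply(reply: str) -> dict[str, bool]:
    """Evaluate a few deterministic stand-ins for soft rules on an agent reply."""
    text = reply.strip()
    return {
        # Rule: the reply must contain the order number.
        "mentions_order_number": ORDER_NUMBER.search(text) is not None,
        # Rule: the reply should open with a greeting (very rough politeness proxy).
        "has_greeting": re.match(r"(dear|hello|hi)\b", text, re.IGNORECASE) is not None,
        # Rule: the reply should close politely somewhere in the text.
        "has_sign_off": re.search(
            r"\b(regards|sincerely|thank you)\b", text, re.IGNORECASE
        ) is not None,
    }


if __name__ == "__main__":
    sample = "Dear customer,\nyour order number 4711 has been processed.\nKind regards"
    for rule, passed in check_reply(sample).items():
        print(f"{rule}: {'PASS' if passed else 'FAIL'}")
```

Rules like "was the response professional?" are hard to express as patterns, which is why being able to state them in natural language matters.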
Finally, we configure the DB validation. This is the most critical step. We don't want an AI agent pretending it finished a task when the database tells a different story. We set up ABAP rules to check the newly created order directly inside the S/4HANA database.
Figure 4 - Database validation rules in SAP S/4HANA
When we trigger the validation from Int4 Suite, we get a comprehensive look at the agent's performance.
In our test run, the Soft Rules passed. The AI Agent replied politely and confirmed the order was placed. However, we aren't done yet.
Figure 5 - Soft validation rules correctness
The Final Validation compares the new Sales Order against a "Golden" reference document created with the same input. Only when every field in the S/4HANA database matches the reference exactly can we mark the test as a success.
Figure 6 - SAP S/4HANA Database validation result - comparison with the reference document
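The comparison against the "Golden" reference can be pictured as a field-by-field diff. The Python sketch below is an illustration only: in the setup described here the check runs as ABAP rules inside S/4HANA, not as Python. The function name `diff_against_reference` and the sample values are hypothetical, and the set of ignored fields is an assumption on my part, since values such as the document number (VBELN) and creation date and time (ERDAT, ERZET) will naturally differ between two separately posted orders.

```python
from typing import Any

# Assumption: run-specific fields that are excluded from the comparison.
IGNORED_FIELDS = {"VBELN", "ERDAT", "ERZET"}


def diff_against_reference(
    actual: dict[str, Any],
    reference: dict[str, Any],
    ignored: set[str] = IGNORED_FIELDS,
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (reference_value, actual_value)} for every mismatch.

    A field present on only one side is reported with None on the other side.
    """
    fields = (set(actual) | set(reference)) - set(ignored)
    return {
        field: (reference.get(field), actual.get(field))
        for field in sorted(fields)
        if reference.get(field) != actual.get(field)
    }


if __name__ == "__main__":
    # Illustrative values only, not taken from a real system.
    reference = {"VBELN": "0000000100", "KUNNR": "0000100001", "MATNR": "TG11", "KWMENG": 10}
    actual = {"VBELN": "0000000205", "KUNNR": "0000100001", "MATNR": "TG11", "KWMENG": 12}
    mismatches = diff_against_reference(actual, reference)
    print("PASS" if not mismatches else f"FAIL: {mismatches}")
```

An empty result means the test passes; any entry points to the exact field where the agent's posting differs from the reference.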
AI models are not static. LLM providers update their models, and because sampling "temperature" introduces randomness, the same input can yield different outputs from one run to the next. By running these tests daily with Int4 Suite, you can catch a change in an underlying LLM before it silently shifts your business processes.
Video showing how Int4 Suite tests the SAP Joule Agent
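The value of daily runs comes from comparing each run with a known-good baseline. As a rough sketch of that idea, the Python snippet below flags checks that passed in the baseline but no longer pass in the latest run. How Int4 Suite stores and reports results is not covered in this post, so the result format (a mapping from check name to pass/fail) and the function name `find_regressions` are purely hypothetical.

```python
def find_regressions(baseline: dict[str, bool], latest: dict[str, bool]) -> list[str]:
    """Return names of checks that passed in the baseline but fail,
    or are absent, in the latest run."""
    return sorted(
        name
        for name, passed in baseline.items()
        if passed and not latest.get(name, False)
    )


if __name__ == "__main__":
    # Illustrative results only.
    baseline = {"soft_rules": True, "db_validation": True}
    latest = {"soft_rules": True, "db_validation": False}
    print(find_regressions(baseline, latest))
```

A check that silently disappears from a run is treated as a regression here, on the assumption that a missing result is as suspicious as a failed one.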
The aim here is not to question the reliability of SAP AI, but to ensure that changes in the evolving LLM landscape never interfere with the stability and consistency of your production environment.