
We are excited to introduce the latest update to our SAP AI capabilities, which adds a new feature to SAP Business AI. While we have been working on SAP Business AI and the Joule copilot service, one question always comes up: can we have more 😊? It is a fair ask, and this is where SAP has been focusing.
The new capability is RAG via the Document Grounding service, which can be activated across the LOB services where Joule is supported. I will touch on each of these topics and discuss how to set it up.
What is Document Grounding?
While Large Language Models (LLMs) excel at understanding and generating human-like text, they often lack the specificity and context of business data. To unleash the potential of LLMs, particularly in conversational use cases, it is essential to ground them in business reality by providing the right context, so that the generated answers are relevant (factually correct), reliable (based on up-to-date information), and responsible (factual errors can be traced to the corresponding data source). We use the Retrieval-Augmented Generation (RAG) technique to optimize the output of a large language model with information that is use-case-specific and was not available to the LLM as training data.
AI-assisted document grounding lets SAP customers use their own specific unstructured or semi-structured documents to provide answers to user questions in Joule. Customers can connect their document repositories via a Microsoft SharePoint site.
Let us look at the details.
Pro Tip: Always refer to the official SAP Help Page for the latest updates on the setup process.
This blog post is part of a series of Joule setup guides. If you are following the series, please refer to my previous blogs:
- SAP BusinessAI – Overview for all !!!
- [SAP BTP Onboarding Series] Joule – Getting Started with Joule and SAP SuccessFactors
- [SAP BTP Onboarding Series] Joule with SFSF – Common Setup Issues
- Joule for SAP S/4HANA Cloud Private Edition - A Comprehensive Setup Guide
Pre-requisites:
A quick check on Document Grounding Support
Commercials & Pricing:
This link provides information on how Document Grounding is charged and the Estimator Tool. For additional terms and conditions, refer to the AI Services and AI Unit supplemental terms and conditions.
Technical Architecture
Image 0
Setup Process (as per LOB):
Let us take a look at the Activation Process for Document Grounding :
6. Prepare SharePoint Integration (optional)
6.1 Create SharePoint Site (you can re-use an existing site if you have one, and grant access to specific folders)
6.2 Create a Group and a Technical User (existing can be reused but should not have “2FA for this user”)
6.3 Register an Application, Generate a Client Secret, & Expose the application using web API
6.4 Validate the SharePoint access with the Technical User
7. Activate Grounding Service in BTP Subaccount & Create a Service Key
7.1 Configure User Authentication
8. Run a cURL command for POST and GET using Bruno API Client
9. Joule Testing
================================================================================
We continue from step 6, assuming that the Joule service is activated and you have followed the previous blogs.
To run this setup, you will need the following Values: Please capture the details as we proceed through the setup process.
================================================================================
6. Prepare SharePoint Integration (optional)
We shall look at the details that are required from Microsoft SharePoint. To work on the following setup, you need to be a SharePoint Administrator and have admin access to the identity service. In my case, I am using the default Microsoft Entra ID, Microsoft’s cloud-based identity and access management platform.
As a first step, log in to your Microsoft Entra, click on Overview, and copy the Tenant ID. We will be using this value later.
Image 1
6.1 Create a SharePoint Site
This is one of the simpler tasks: you can either use an existing site or create a new one. Log in to your MS 365 SharePoint and click on Create Site.
Image 2
Select a template of your choice; I have selected Standard Communication.
Image 3
Click on Use Template.
Image 4
Give a Site Name that can help you recognize the Site for your Business/Scenario. I have entered the following details:
Site Name: Document Grounding
Site Description: Document Grounding for Joule
The site address is auto-generated. Once you have checked the details, click on Next.
Image 5
Select the Language as English (Note only English is supported at this time), and Click on Create Site.
Image 6
Once the site is created, please navigate to Documents and upload the PDFs or Microsoft Word files.
Image 7
6.2 Create a Group and a Technical User
As part of our Document Grounding process, we need a Technical User that serves as a general login to create the pipelines and share the documents with our services. This user should belong to a Group.
We shall create a new Group and then a Technical User. If you have an existing Group that can be reused, you may skip creating a new one.
Login to Microsoft Entra, Expand Groups -> Click on All Groups -> and then click on New Group.
Image 8
Provide a Group Name, select Microsoft Entra Roles that can be assigned to the group, and click on Create.
Group Name: Document Grounding for Joule.
Image 9
Now let us add an admin user to the groups with authorizations. To do that, expand the Users section, click on All Users, and click Create New User.
Image 10
Enter the User principal name, and Display Name, copy the values, and then click on Next.
User ID for Data Pipeline: documentgrounding@abc.onmicrosoft.com
Password: ABCDEFGH
Tip: Make a note of all the details in a notepad, we will need the details later.
Image 11
In the Assignments section, please select the Group “Document Grounding for Joule” that we created and click on Next: Review + Create.
Image 12
Please validate the details and Click on Create.
Image 13
6.3 Register an Application, Generate a Client Secret, & Expose the application using web API
In this step, we are going to register the application, which will help us define access to applications using Client ID and Client Secret.
Expand the section Applications, click on App registration, and click on New Registration.
Image 14
Enter the Name and select the option “Accounts in this organization directory only” (you may select other options based on your setup) and click on Register.
Tip: Make a note of all the details in a notepad, we may need the details later.
Image 15
Once the application is created, copy the Application (client) ID. We will use it later to create the BTP Destination.
Image 16
Now, let us add the API permissions that the application needs. Select the app registered in the previous step, “document-grounding-joule-app,” click on API Permissions, click on Add a Permission, and under Microsoft APIs, click on Microsoft Graph.
Image 17
Select the option Delegated permissions, and in the Search box – look for “sites” and in the search results, select the option Sites.Read.All and Click on Add permissions.
Image 18
If you see a pop-up Grant admin consent confirmation, click on Yes. Next, click on Grant Admin consent for MSFT. You should be able to see the sites added as shown below.
Image 19
Now that the application has been granted permission, we shall generate a Client Secret for it (the Client ID was already created when the application was registered). Within the registered application, click on Certificates & Secrets, click on Client Secrets, enter a Description, and click on Add.
Description: BTP Destination for Document Grounding (on behalf of User)
Expires: select the maximum
Tip: Please ensure you track the Dates. After the expiry, we will need to recreate/renew and update the new value in the BTP Subaccount.
Image 20
Once the Client secrets are created, please copy the Value, as we will use it in BTP Destinations.
Image 21
6.4 Validate the SharePoint access with the Technical User
As a final step, we need to validate that the Technical User can access the SharePoint site that we have created. I will be using the following values:
Site Link: https://xxx.sharepoint.com/sites/DocumentGrounding
Entra Tech User ID for Data Pipeline: documentgrounding@abc.onmicrosoft.com
Password: ABCDEFGH
When you log in to this SharePoint site with the user documentgrounding@abc.onmicrosoft.com, you may be asked to create a new password. Please go ahead and create a new password, and ensure that you remember it. You should then be logged in to the site automatically. In case of any authorization issues, log in with your own credentials, click on Site Access at the top right, and add the user as required. In my case, I have added the user with Full Control.
Image 22
Once the user site access is granted, please validate the login with the new password. If the user has access to the site we are all good with the Site setup.
7. Activate Grounding Service in BTP Subaccount & Create a Service Key
7.1 Configure User Authentication
This is the process of activating the Grounding service in your BTP Subaccount and creating the required service key.
a. Activate Grounding Service in BTP Subaccount & Create a Service Key
This is the first step in activating your Document Grounding service. Please note that this requires Joule to be set up and working in your LOB.
If you have the AI Unit SKU 8018592, the service should be added to your SAP BTP Global Account. The service entitlement “Document Grounding” should be visible in the Entitlement -> Service Assignment section.
Assign this entitlement to the subaccount where you have configured Joule for your SuccessFactors system. Navigate to your subaccount, click on Entitlements -> Edit -> Add Service Plan, search for Document Grounding, select the plan “data-manager”, and save the settings.
Image 23
To activate the Document Grounding service, in your subaccount, click on Service Marketplace -> click on Document Grounding -> click on Create.
Image 24
The Document Grounding service and the data-manager plan are auto-selected. Select the disclaimer “I understand that enabling a service might result in costs, depending on the plan selected” and enter an Instance Name that helps you identify the service you are activating. In my case, I have given “groundingcli”; please make a note of this, as we will be using it later. Enter the values and click on Create.
Image 25
If creating the Document Grounding instance fails with the Cloud Foundry runtime environment selected above, delete the failed instance, select the option "Other" from the drop-down, and create a new instance; both options deliver the same functionality.
You should be able to see the instance once the service is created. Click on the instance, and in the ‘groundingcli’ screen -> in the Service Keys (if you have selected Runtime Environment as "Others" do this with "Service Binding") section -> click on Create.
Image 26
Enter the Service Key Name (Service Binding Name in case of "Others" runtime environment) -> In my case, I have entered the value “groundingkey” and Click on Create.
Image 27
Once the Service Key is created, click on the 3 dots and click on View.
Image 28
Copy the URL value displayed in the “groundingkey” service key. We need this value for our next steps.
Image 29
b. Create a Cloud Identity Service Instance and a Service Key
As part of the next step, we need to create a new instance of SAP Cloud Identity Services. Within the same subaccount, navigate to Service Marketplace -> click on Cloud Identity Services -> click on Create, and select the plan “application”. The Runtime Environment and Space should be selected automatically; if not, please select the correct values. Fill in the Instance Name (“groundingCIS” in my case) and click on Next.
Image 30
If creating the Cloud Identity Services instance fails with the Cloud Foundry runtime environment selected above, delete the failed instance, select the option "Other" from the drop-down, and create a new instance; both options deliver the same functionality.
In the parameters page, please enter the following values, where <doc-grounding-instance-name> value is from Image 25 – “groundingcli” and click on Next.
{
  "consumed-services": [
    {
      "service-instance-name": "<doc-grounding-instance-name>"
    }
  ]
}
Image 31
Review the details and click on Create.
Image 32
Once the service is created, we need to create a Service Key using the Cloud Identity Service Instance that we created now. Click on the right arrow of groundingCIS to create a new Service Key.
Image 33
Enter the Service Key Name (Service Binding Name in case of "Others" runtime environment) as shown below and, please specify the JSON value as mentioned below.
{
  "credential-type": "X509_GENERATED",
  "validity": 365,
  "validity-type": "DAYS"
}
Image 34
Click on Create once the values are entered. Now click on the right arrow of groundingCIS, or click on the 3 dots and choose View, to open the Service Key “cisSK” that we just created.
Image 35
Here, please copy the values of “clientid” and “authorization_endpoint”. I also recommend downloading this file.
Image 36
c. Copy / Edit the Certificate values to support *.crt and *.key values
Once you download the file cisSK.txt, please open it in an editor. In my case, I have used Notepad++. Observe that the file contains two certificates, each starting with -----BEGIN CERTIFICATE-----, and one key, which starts with -----BEGIN RSA PRIVATE KEY-----. You will also see the \n (new line) character sequence in multiple places, which needs a small cleanup. You can use Notepad++ or any other editor for this. If you have the “sed” command available (MacBook), you can use it for the cleanup by following the official help guide. If you are using Notepad++, continue below.
Image 37
Let's work on the -----BEGIN CERTIFICATE----- section. Copy the text up to the final -----END CERTIFICATE-----; if you observe closely, there are two certificates, each with its own BEGIN and END line.
Once you have copied the values to a new text file, press CTRL+F and select Replace. Enter “\n” in Find what and “,” (a comma) in Replace with, ensure that Wrap Around is selected and Search Mode is Normal, and then click on Replace All.
Image 38
Once you click on Replace All, the editor will replace \n with “,”(comma). Now, click on the Swap Find and Replace, under Search Mode select the option Extended, and then click on Replace All.
Image 39
The text is now formatted with real line breaks, without the \n sequences and the “,” (commas). Please review it.
Image 40
Please save this file as .crt. In my case, I have used “doc-grounding.crt”.
The same cleanup process needs to be followed for the private key. Go back to your cisSK.txt file and copy only the text from -----BEGIN RSA PRIVATE KEY----- through -----END RSA PRIVATE KEY-----. Follow the same cleanup steps as before and save this file as .key. In my case, I have used “doc-grounding.key”.
Image 41
Once the certificate and key files are ready, please move them to one folder. Both files, the .crt and the .key, are required to run the cURL commands.
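If you prefer the terminal over the Notepad++ steps above, the same cleanup can be sketched as a small shell function. This is only a sketch: the function name unescape_pem is made up for illustration, and it assumes you paste the copied value (still containing the literal \n sequences) between single quotes.

```shell
# Turn a PEM value copied from the service key (one long line containing
# literal "\n" sequences) into text with real line breaks.
# unescape_pem is a hypothetical helper name, not part of any SAP tooling.
unescape_pem() {
  # printf '%b' expands backslash escapes such as \n in its argument.
  printf '%b\n' "$1"
}

# Example usage (paste your own copied values between the single quotes):
# unescape_pem '-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----' > doc-grounding.crt
# unescape_pem '-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----' > doc-grounding.key
```

Single quotes matter here: they keep the shell from touching the backslashes before printf sees them.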
d. Create Destination in SAP BTP Subaccount
We need to create a destination in the subaccount to create access and enable connectivity to Microsoft SharePoint using the APIs and technical user that we created.
Within your SAP BTP Subaccount, expand on Connections -> click on Destination -> click on Create Destination and enter the following details:
Field | Value |
Name | <NAME_OF_DESTINATION> |
Type | HTTP |
URL | |
Proxy Type | Internet |
Authentication | OAuth2Password |
User | The technical user that you've created in Microsoft Entra ID (Image 11) |
Password | The password that you've created in Microsoft Entra ID (Image 11) |
Client ID | Microsoft Entra ID Application credentials (Image 16) |
Client Secret | Microsoft Entra ID Application credentials (Image 21) |
Token Service URL | https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/token, where <TENANT_ID> is the Tenant ID of Microsoft Entra ID (Image 1) |
In the additional properties, click on Add, enter the following values:
scope : https://graph.microsoft.com/.default
The details should be as shown below; once you enter the details, please Click on Save.
Image 42
You may click on Check Connection for a quick test, and if all the setup is fine we see the result below.
Image 43
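A common slip in the destination is pasting something other than the Tenant ID into the Token Service URL. The sketch below builds the URL from the Tenant ID copied in Image 1 and rejects values that are not in GUID form. The function name token_service_url is hypothetical, and the GUID check is my own assumption about what a Tenant ID looks like in your tenant.

```shell
# Build the Token Service URL for the BTP destination from a Tenant ID.
# token_service_url is a hypothetical helper name used for illustration.
token_service_url() {
  tenant_id=$1
  # Assumption: the Tenant ID is a GUID (8-4-4-4-12 hexadecimal characters).
  if ! printf '%s\n' "$tenant_id" | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
    echo "not a valid tenant ID: $tenant_id" >&2
    return 1
  fi
  printf 'https://login.microsoftonline.com/%s/oauth2/v2.0/token\n' "$tenant_id"
}
```

Calling token_service_url with your Tenant ID prints the value to paste into the Token Service URL field.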
8. Run a cURL command for POST and GET using Bruno API Client
We run these cURL commands to get an access token to our Grounding pipelines and set up an authentication.
Note: If you are using Git Bash or a Mac system, you can run them directly in the terminal. On Windows, you can use either the Postman client or Bruno.
In this demo, I am going to use Bruno, an open-source API client.
8.1 Add Certificates to DocumentGrounding Collection
Open your Bruno client, click on the 3 dots, and click on Collections -> Create Collection. We are going to create a collection to get bearer tokens, display pipelines, create pipelines, and delete pipelines.
Image 44
Give your collection a name and save it in a path of your choice. Once the collection is created, mouse over it, Click on the three dots, and choose Settings -> click on Client Certificates.
Here, you will need two URLs: one to help us generate a Bearer Token and another to generate Document Grounding pipelines.
Enter the Domain value with one of the URLs, add the .crt and .key files, click on add, and repeat the same for the other URL and add it. The process is as shown below:
Image 45
8.2 Get Bearer Token
Mouse over to the DocumentGrounding collection, click on 3 dots, and select New Request. Here we need the URL to generate the Bearer Token, so the URL should be a combination of the following:
Type: HTTP
Name: Get Bearer Token
URL: https://<<CloudIdentityService>>/oauth2/token
The URL example:
https://abcdefgh.accounts.ondemand.com/oauth2/token
Image 46
Once you enter the values, click on Create. Now, in the new entry, enter the following values under Params and Headers, save it, and run the request.
Params:
client_id: ClientIDValue_from_GroundingCIS (Image 36)
grant_type: client_credentials
Headers:
content-type: application/x-www-form-urlencoded
accept: application/json
You should be able to see the Bearer Token as shown below.
Note: The URL will be updated once you add the Params value in Bruno.
Image 47
Make a note of this Access Token value, which will be used for the next step.
8.3 Get All Pipelines
We are going to create a new GET request to check if the Bearer token is working and to check the pipeline values. If you run it for the first time, you should get a null value.
You can mouse over to the DocumentGrounding collection, click on the three dots, select New Request, select the following details, and click on Create.
Type: HTTP
Name: Get all Pipelines
URL Type: Get
URL Value: The MTLS URL (image 28)
Eg: https://mtls.rage.c-6d4c6e4.kyma.ondemand.com/pipeline/api/v1/pipeline
Image 48
Once you enter the details click on Create. Now click on Headers, and add the following details.
Authorization: Bearer <<token value>>
accept: application/json
Save the value, and Run the request. You should be able to see a null value as shown below.
Image 49
This confirms the pipeline is reachable.
8.4 Set Up Content Ingestion – Create a Pipeline
Mouse over to DocumentGrounding, click on the three dots, and select New Request. Enter the following details:
Name: Create a Pipeline
URL Type: POST
URL Value: The MTLS URL (image 28)
Eg: https://mtls.rage.c-6d4c6e4.kyma.ondemand.com/pipeline/api/v1/pipeline
Image 50
Once you enter the details click on Create. In the new entry, please enter the Body and Header details as shown below.
Navigate to Body, create a new entry type JSON, and enter the following details:
{
  "type": "MSSharePoint",
  "configuration": {
    "destination": "<<BTP_DocumentGrounding_Destination_Name>>",
    "sharePoint": {
      "site": {
        "name": "<<sharepoint_name>>"
      }
    }
  }
}
The value should be like this.
Image 51
Note: By default, the data pipeline reads data from the whole MS SharePoint site whose name you provide. If you want to give access only to a specific folder of your SharePoint site, you can specify a folder path. You can refer to the official help page; I have added a sample scenario below.
Note: The folder name must be prefixed with "/". If the format differs, the pipeline may still get created, but it may be empty, and Joule may not answer your questions. Please use Get all Pipelines to validate.
Image 52
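Since a missing "/" produces a pipeline that looks fine but stays empty, it can help to normalize the folder name before pasting it into the request body. A minimal sketch, where normalize_folder_path is a hypothetical helper and the folder name in the usage note is made up:

```shell
# Make sure a SharePoint folder name starts with "/" as the pipeline expects.
# normalize_folder_path is a hypothetical helper name used for illustration.
normalize_folder_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;    # already has the leading slash
    *)  printf '/%s\n' "$1" ;;   # add the missing leading slash
  esac
}
```

For example, normalize_folder_path 'HR Policies' prints /HR Policies, and a value that already starts with "/" is returned unchanged.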
Now click on Headers and enter the following values.
Authorization: Bearer <<Token_Value>>
content-type: application/json
Save the settings and click on run. You should see a New Pipeline generated like this.
Image 53
You can go back to Get all Pipelines and check that the same pipeline ID is listed along with the defined path.
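If you saved the Get all Pipelines response as text, this check can also be done in the terminal. The sketch below only searches the response for the pipeline ID as a plain string, so it makes no assumption about the exact JSON structure returned by the service. The name pipeline_listed is hypothetical.

```shell
# Succeed if the given pipeline ID appears anywhere in the response text
# returned by Get all Pipelines.
# pipeline_listed is a hypothetical helper name used for illustration.
pipeline_listed() {
  pipeline_id=$1
  list_json=$2
  # -F treats the ID as a fixed string, -q suppresses the output.
  printf '%s\n' "$list_json" | grep -Fq "$pipeline_id"
}
```

A typical call would be pipeline_listed "<pipelineId>" "$response" && echo "pipeline found", where $response holds the text you received.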
8.5 Delete a pipeline
Note: Executing the delete option is not required unless you want to delete the full pipeline. This will clear everything that the pipeline covers, that is, the entire SharePoint site and the folders that you have selected.
For deletion, select the request type Delete, enter the Headers as shown below, keep the MTLS URL, and add the ID of the pipeline you want to delete. Once you run this, the pipeline will be deleted.
Image 53a
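For terminal users, the delete call can be sketched in the same style as the other cURL commands in this blog. Please treat the URL layout as an assumption: the sketch appends the pipeline ID to the pipeline endpoint, which matches the description above, but verify it against the official help page before use. Both function names are hypothetical.

```shell
# Build the URL for deleting one pipeline.
# Assumption: the pipeline ID is appended to the pipeline endpoint.
pipeline_delete_url() {
  base_url=${1%/}   # drop one trailing slash, if present
  printf '%s/pipeline/api/v1/pipeline/%s\n' "$base_url" "$2"
}

# Delete a pipeline: $1 = MTLS URL, $2 = pipeline ID, $3 = bearer token.
# Uses the same certificate and key files as the other calls.
delete_pipeline() {
  curl --request DELETE \
    --url "$(pipeline_delete_url "$1" "$2")" \
    --header "Authorization: Bearer $3" \
    --cert doc-grounding.crt \
    --key doc-grounding.key
}
```

Nothing is deleted until you call delete_pipeline yourself with your MTLS URL, pipeline ID, and a valid token.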
9. Joule Testing
Once you complete your setup, please note that only the initial load of uploaded PDFs will be read immediately. Any documents added later will have to wait for the standard “scheduler/scheduled” time in the SAP system to refresh the data pipelines for FAQ support.
Here are some of the initial test results:
Image 54
Image 55
Image 56
Image 57
Happy Learning!!!
For support issues on the setup process, please create a ticket using the component CA-ML-RAGE.
If you need any support during the setup you can reach us at SAP_AI_RIG@sap.com.
Special thanks to the entire Document Grounding Team and AI Team for contributing to this blog.
Regards,
Nagesh
Common Setup Issues:
This could happen for various reasons, and one of the reasons is documented below. In a few cases, we have observed that your Tenant OpenID Connect Configuration is using the legacy setting without the value “https://” in the URL. To find more information, you can visit the link here, or follow the steps below:
Before making the changes, please read the details carefully. The URL change may impact the setup of other systems, so please validate before making the change, as it is not reversible. You can also refer to 3191108 - There was an error when authenticating against the external identity provider: Invalid iss...
Go to your SAP Cloud Identity Services -> click on Applications & Resources -> under Single Sign On, choose the OpenID Connect Configuration list item. In the Issuer field, the URL value should be “https://<Cloud Identity Service URL>”. If the https:// is missing, click on the drop-down and select the same value with https:// (if it is *.ondemand.com, please select the same value with *.ondemand.com) and save the settings.
Once you save this value, please run the Get CURL command and try it.
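To see which issuer value your tenant currently advertises, you can look at its OpenID Connect discovery document from the terminal. This is a sketch under assumptions: it presumes your tenant exposes the standard discovery path /.well-known/openid-configuration, and both function names are hypothetical. The sed expression is a simple text match, not a full JSON parser.

```shell
# Read an OpenID Connect discovery document on stdin and print its issuer.
# extract_issuer is a hypothetical helper name used for illustration.
extract_issuer() {
  sed -n 's/.*"issuer"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed only if the issuer value starts with https://.
issuer_uses_https() {
  case "$1" in
    https://*) return 0 ;;
    *)         return 1 ;;
  esac
}

# Example usage (replace the host with your own tenant):
# iss=$(curl --silent 'https://abcdefgh.accounts.ondemand.com/.well-known/openid-configuration' | extract_issuer)
# issuer_uses_https "$iss" || echo "Issuer is missing https://: $iss"
```

If the check reports a missing https://, the legacy setting described above is the likely cause.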
2. You have a message: "code":500,"message":"Microsoft Graph API error: Request failed with status code 401"
This could happen if you have missed the step “Grant admin consent for MSFT”. Please go back to the application that you registered in Microsoft Entra, open API Permissions, click on Grant admin consent for MSFT, and ensure you see the status as shown below.
Appendix (only for MacBook Terminal)
If you plan to use a MacBook and run using the terminal:
Get Bearer Token:
curl \
--request POST \
--url <adjusted_authorization_endpoint> \
--header 'accept: application/json' \
--header 'content-type: application/x-www-form-urlencoded' \
--data 'client_id=<clientid>' \
--data 'grant_type=client_credentials' \
--cert <file_with_certificate> \
--key <file_with_key>
Where,
Placeholder | Description |
<adjusted_authorization_endpoint> | The "authorization_endpoint" |
<clientid> | The "clientid" value that you obtained from |
<file_with_certificate> | The doc-grounding.crt file, adjusted with the line breaks |
<file_with_key> | The doc-grounding.key file, adjusted with the line breaks |
curl --request POST --url https://aclxxsnax.accounts.ondemand.com/oauth2/token --header 'accept: application/json' --header 'content-type: application/x-www-form-urlencoded' --data 'client_id=452e75c9-ee1a-4964-xxxx-exx92be79690' --data 'grant_type=client_credentials' --cert doc-grounding.crt --key doc-grounding.key
Once you run this command, we receive a Bearer token which expires in 3600 seconds or 1 hr.
The response will be in a similar format as shown below:
{
  "access_token": "eyJq........LI-L8KsOQV593dmtPU1g",
  "token_type": "Bearer",
  "expires_in": 3600
}
Please copy the Bearer token as we will need it for our next steps.
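Instead of copying the token by hand, you can pull it out of the response in the terminal and keep track of when it expires. The sketch below uses plain sed so that it runs without extra tools; a proper JSON tool would be more robust. All three function names are hypothetical, and the sed expressions assume a small, flat response like the one shown above.

```shell
# Print the value of a string field from a small JSON response on stdin.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the value of a numeric field (for example expires_in).
json_number_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeed when a token issued at $1 (epoch seconds) with a lifetime of
# $2 seconds has expired at time $3 (epoch seconds).
token_expired() {
  [ "$3" -ge $(( $1 + $2 )) ]
}

# Example usage, assuming the response was saved to token.json:
# access_token=$(json_string_field access_token < token.json)
# lifetime=$(json_number_field expires_in < token.json)
# issued_at=$(date +%s)
# token_expired "$issued_at" "$lifetime" "$(date +%s)" && echo "request a new token"
```

Checking the expiry before each call avoids the 401 response that an expired token produces.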
Now, we shall continue with the pipeline with the GET command, to call the document grounding endpoints. Use the syntax:
curl \
--request GET \
--url '<url>/pipeline/api/v1/pipeline' \
--header 'accept: application/json' \
--header 'Authorization: Bearer <access_token>' \
--cert <file_with_certificate> \
--key <file_with_key>
Placeholder | Description |
<url> | The service key URL (MTLS) |
<file_with_certificate> | The doc-grounding.crt file adjusted with the line breaks |
<file_with_key> | The doc-grounding.key file adjusted with the line breaks |
In my case, I have the following format:
curl --request GET --url 'https://mtls.rage.a5601b3.kyma.ondemand.com/pipeline/api/v1/pipeline' --header 'accept: application/json' --header 'Authorization: Bearer eyJqa3UiO<<<short version of bearer token>>3heagq4mbLyYBg' --cert doc-grounding.crt --key doc-grounding.key
Once you run the command, you should receive a response with an empty list [] since the pipeline hasn’t been created yet.
Set up Content Ingestion
The final step is to set up content ingestion, which pushes the SharePoint documents to the Document Grounding pipelines. The LLM then uses this content as grounding context to answer end-user queries in Joule.
Use the following cURL command. If you get a 401 (unauthorized), the bearer token may have expired; in that case, run the Get Bearer Token POST command again and then run the following command with the new bearer token.
Syntax:
curl \
--request POST \
--url '<url>/pipeline/api/v1/pipeline' \
--header 'Authorization: Bearer <access_token>' \
--header 'content-type: application/json' \
--data '{"type": "MSSharePoint","configuration": {"destination": "<NAME_OF_DESTINATION>","sharePoint": {"site": {"name": "<NAME_OF_SHAREPOINT_SITE>"}}}}' \
--cert <file_with_certificate> \
--key <file_with_key>
I have used the following command:
curl --request POST --url 'https://mtls.rage.a5601b3.kyma.ondemand.com/pipeline/api/v1/pipeline' --header 'Authorization: Bearer eyJqa3UiOiJod<<<Shortversion>>>xjoI4npWdgq4mbLyYBg' --header 'content-type: application/json' --data '{"type": "MSSharePoint","configuration": {"destination": "joule-sfsf-data-pipeline","sharePoint": {"site": {"name": "DocumentGrounding"}}}}' --cert doc-grounding.crt --key doc-grounding.key
Once you run the command successfully, you should see a response like the one below.
Response: { "pipelineId":"4cfd0478-29ea-45c2-bc40-d3817621744e" }
This confirms that the setup has been completed using the terminal approach, and we are now ready to query the Joule service with the Documents that are uploaded in SharePoint.