Technology Blogs by SAP
nageshcaparthy
Product and Topic Expert

We are excited to introduce the latest addition to SAP's AI capabilities. While we have been working on SAP Business AI and the Joule copilot with SAP SuccessFactors, one question keeps coming up: can Joule answer FAQs from our own documents? This is where SAP has been focusing, and we are happy to introduce a new RAG capability via the Document Grounding service.

What is Document Grounding?

While Large Language Models (LLMs) excel at understanding and generating human-like text, they often lack the specificity and context of business data. To unleash the potential for LLMs, in particular, in conversational use cases, providing the right context to ground LLMs in business reality is essential to make the generated answers relevant (factually correct), reliable (based on up-to-date information), and responsible (trace factual errors to the corresponding data source). We use the Retrieval-Augmented Generation (RAG) technique to optimize the output of a large language model with information that is use-case specific and not available as a training data source for LLMs.

AI-assisted document grounding lets SAP customers use their own specific unstructured or semi-structured documents to provide answers to user questions in Joule. Customers can connect their document repositories via a Microsoft SharePoint site.

Let us take a look at the details.

Pro Tip: Always refer to the official SAP Help Page for the latest updates on the setup process. 

Commercials & Pricing:

You can refer to this link for information on how Document Grounding is charged. Refer to the AI Services and AI Unit supplemental terms and conditions.

Pre-requisites:

Document Grounding supports:

  • Up to 2,000 documents
  • Document type: PDF and MS Word Document
  • Document content: plain text (tables and images are not supported)
  • Language: English
  • Content refresh: SharePoint content is refreshed once a day. Any change to the PDF or Word files in the SharePoint location is picked up by that daily refresh; a manual refresh is not available at this time
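
Before uploading, it can help to check a local folder against these limits. The following is a minimal shell sketch, not an SAP tool: the function name check_grounding_docs is hypothetical, and the accepted file extensions (.pdf, .doc, .docx) are my assumption of what counts as a PDF or Word document. It cannot tell whether a file contains tables or images.

```shell
# Hypothetical pre-upload check for a local folder of documents.
# Assumption: .pdf, .doc and .docx are the supported file extensions.
check_grounding_docs() {
  dir="$1"
  max_docs=2000          # document limit stated in the prerequisites
  total=0
  unsupported=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue            # skip subfolders and empty globs
    total=$((total + 1))
    case "$f" in
      *.pdf|*.PDF|*.doc|*.DOC|*.docx|*.DOCX) ;;
      *) unsupported=$((unsupported + 1)); echo "unsupported type: $f" ;;
    esac
  done
  echo "documents: $total, unsupported: $unsupported"
  # Succeed only if the folder is within the limit and has no other file types
  [ "$total" -le "$max_docs" ] && [ "$unsupported" -eq 0 ]
}
```

For example, check_grounding_docs ./hr-policies lists any file with another extension and returns a non-zero status if something needs attention.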

Setup Process: (follow my previous blog for steps 1 to 5 -  Joule – Getting Started with Joule and SAP SuccessFactors)

  1. Set up your Subaccount with the required Entitlements (the Grounding Service is visible in your BTP Global Account if you have the license for AI Units)
  2. Validate the Cloud Identity setup with your SFSF and use the same Cloud Identity to establish trust with your Subaccount
  3. Set up Joule using the booster and complete the Work Zone setup if you are using the full functionality of the Joule Base version
  4. Use the SFSF Role-Based Permissions to grant Joule access
  5. Complete the Job Sync in Cloud Identity

Let us take a look at the Activation Process for Document Grounding :

6. Prepare SharePoint Integration

  1. Create SharePoint Site (optional, you can re-use an existing site if you have one)
  2. Create a Group and a Technical User (optional, existing can be reused)
  3. Register an Application, Generate a Client Secret, & Expose the application using web API
  4. Validate the SharePoint access with the Technical User

7. Configure User Authentication

  1. Activate Grounding Service in BTP Subaccount & Create a Service Key
  2. Create a Cloud Identity Service Instance and a Service Key
  3. Copy / Edit the Certificate values to support *.crt and *.key values
  4. Run a cURL command for POST and GET

8. Create a Destination in the SAP BTP Subaccount
9. Set up Content Ingestion
10. Joule Testing

===========================================================

We continue from step 6, assuming that the Joule service is activated and you have followed the previous blogs.

 6. Prepare SharePoint Integration

We shall look at the details that are required from Microsoft SharePoint. To work on the following setup, you need to be a SharePoint Administrator and have access to Identity Services Admin. In my case, I am using the default Microsoft Entra ID, Microsoft's cloud-based identity and access management platform.

As a first step, log in to your Microsoft Entra, click on Overview, and copy the Tenant ID. We will be using this value later.

Image 1.png

Image 1

6.1 Create a SharePoint Site

This is one of the simpler tasks: you can either use an existing site or create a new one. Log in to your MS 365 SharePoint and click on Create Site.

 Image 2.jpg

Image 2

Select a template of your choice; I have selected Standard Communication.

Image 3.jpg

Image 3

Click on Use Template.

Image 4.jpg

 Image 4

Give a Site Name that can help you recognize the Site for your SFSF Systems. I have entered the following details:

Site Name: SFSF RAGe + Document Grounding

Site Description: SFSF RAGe + Document Grounding Repo

The site address is auto-generated. Once you have checked the details, click on Next.

Tip: Make a note of all the details in a notepad, we may need the details later.

 Image 5.jpg

 Image 5

Select the Language as English (Note only English is supported at this time), and Click on Create Site.

Image 6.jpg

 Image 6

Once the site is created, please navigate to Documents and upload the PDFs or Microsoft Word files.

Image 7.jpg

 Image 7

6.2 Create a Group and a Technical User

As part of the Document Grounding process, we need a Technical User that serves as a general login to create the pipelines and share the documents with our services. This user should be in a Group.

We shall create a new Group and then a Technical User. Note that if you have an existing Group that can be reused, you may skip creating the Group.

Log in to Microsoft Entra, expand Groups -> click on All Groups -> and then click on New Group.

Image 8.jpg

 Image 8

Provide a Group Name, select the option “Microsoft Entra roles can be assigned to the group”, and click on Create.

Group Name: SFSF RAGe + Document Grounding Group

Tip: Make a note of all the details in a notepad, we may need the details later.

Image 9.jpg

 Image 9

Now let us add an admin user to the groups with authorizations. To do that, expand the Users section, click on All Users, and click Create New User.

Image 10.jpg

Image 10

Enter the User principal name, and Display Name, copy the values, and then click on Next.

User ID for Data Pipeline: sfsf-joule-data-pipeline@05xf1.onmicrosoft.com

Password: ABCDEFGH

Tip: Make a note of all the details in a notepad, we may need the details later.

Image 11.jpg

 Image 11

In the Assignments section, please select the Group that we created “SFSF RAGe + Document Grounding Group” and click on Next: Review + Create.

Image 12.jpg

 Image 12

Please validate the details and Click on Create.

Image 13.jpg

Image 13

6.3 Register an Application, Generate a Client Secret, & Expose the application using web API

In this step, we are going to register the application that will help us to define access to applications with the help of Client ID and Client Secret.

Expand the section Applications, click on App registration, and click on New Registration.

Image 14.jpg

 Image 14

Enter the Name and select the option “Accounts in this organization directory only” (you may select other options based on your setup) and click on Register.

 Tip: Make a note of all the details in a notepad, we may need the details later.

Image 15.jpg

 Image 15

Once the application is created, copy the Application (client) ID; we will use it later when creating the BTP Destination.

Image 16.jpg

 Image 16

Now, let us configure the API permissions. Select the app registered in the previous step, “sfsf-joule-data-pipeline”, click on API Permissions, and under Microsoft APIs, click on Microsoft Graph.

Image 17.jpg

 Image 17

Select the option Delegated permissions, and in the Search box – look for “sites” and in the search results select the option Sites.Read.All and Click on Add permissions.

Image 18.jpg

 

Image 18

If you see a pop-up Grant admin consent confirmation, click on Yes.

Image 19.jpg

 Image 19

Now that the application has been granted permission, we shall generate a Client Secret for it. Within the “sfsf-joule-data-pipeline” application, click on Certificates & Secrets, click on Client Secrets, enter a Description, and click on Add.

Description: BTP Destination for SFSF(on behalf of User)

Expires: select the maximum

Tip: Keep track of the expiry date. Once the secret expires, you need to recreate or renew it and update the new value in the BTP Subaccount.

Image 20.jpg

 Image 20

Once the Client secrets are created, please copy the Value and the Secret ID as we will be using the Value in BTP Destinations.

Image 21.jpg

 Image 21

6.4 Validate the SharePoint access with the Technical User

As a final step, we need to validate the Technical User's access to the SharePoint site that we created. I will be using the values that I have.

Site Link: https://05xxxxxf1.sharepoint.com/sites/SFSFRAGeDocumentGrounding

Entra Tech User ID for Data Pipeline: sfsf-joule-data-pipeline@05xxxxf1.onmicrosoft.com

Password: ABCDEFGH

Once you log in to this SharePoint site with the user sfsf-joule-data-pipeline@05xf1.onmicrosoft.com, you may be asked to create a new password. Go ahead and create one, and make sure you remember it. You should then be logged in to the site automatically. In case of any authorization issues, log in with your own credentials, click on Site Access at the top right, and add the user as required. In my case, I have added the user with Full Control.

Image 22.jpg

Image 22 

Once the user site access is granted, please validate the login with the new password. If the user has access to the site we are all good with the Site setup.

7. Configure User Authentication

This is the process of activating the Grounding service in your BTP Subaccount and creating the required service key.

Tip: Please make a note of all the instance and service key names that we are creating at each step.

7.1 Activate Grounding Service in BTP Subaccount & Create a Service Key

This is the first step of activating your Document Grounding service. Please note that Joule must be set up/working in your SFSF Account.

The service should be added to your SAP BTP Global Account if you have the AI Unit SKU 8016532. The service entitlement “Document Grounding” should be visible in the Entitlement -> Service Assignment section.

Assign this entitlement to your subaccount where you have configured the Joule for SuccessFactors system. Navigate to your subaccount, click on Entitlements -> Click on Edit -> Click on Add Service Plan -> search for Document Grounding, and select the plan “data-manager” and Save the settings.

Image 23a.jpg

 Image 23a

To activate the Document Grounding service, go to your subaccount and click on Service Marketplace -> click on Document Grounding -> click on Create.

Image 23b.jpg

 Image 23b

The service Document Grounding and the plan data-manager are selected automatically. Select the disclaimer “I understand that enabling a service might result in costs, depending on the plan selected” and enter an Instance Name that helps you identify the service you are activating. In my case, I have used “groundingcli”; please make a note of it, as we will be using it later. Enter the values and click on Create.

Image 24.jpg

 Image 24

You should be able to see the instance once the service is created. Click on the instance and in the ‘groundingcli’ screen -> in the Service Keys section -> click on Create.

Image 25.jpg

 Image 25

Enter the Service Key Name -> In my case, I have entered the value “groundingkey” and Click on Create.

 Image 26.jpg

 Image 26

Once the Service Key is created, click on the 3 dots and click on View.

Image 27.jpg

 Image 27

Copy the URL value displayed in the “groundingkey” service key; we need this value for our next steps.

Image 28.jpg

 Image 28

7.2 Create a Cloud Identity Service Instance and a Service Key

As part of the next step, we need to create a new instance of SAP Cloud Identity Services. Within the same Subaccount, navigate to Service Marketplace -> click on Cloud Identity Services -> click on Create and select the Plan application. The Runtime Environment and Space should be selected automatically; if not, please select the correct values. Fill in the “Instance Name” (“groundingCIS” in my case) and click on Next.

Image 29.jpg

 Image 29

On the parameters page, enter the following JSON, replacing <doc-grounding-instance-name> with the name of your Document Grounding instance (“groundingcli” in my case), and click on Next.

{
   "consumed-services":[
      {
         "service-instance-name":"<doc-grounding-instance-name>"
      }
   ]
}

Image 30.jpg

 Image 30

Review the details and click on Create.

Image 31.jpg

 Image 31

Once the service is created, we need to create a Service Key using the Cloud Identity Service Instance that we created now. Click on the right arrow of groundingCIS to create a new Service Key.

Image 32.jpg

 Image 32

Click on Create in the Service Key area, enter a Service Key Name, and copy the JSON parameters as mentioned below.

{
   "credential-type":"X509_GENERATED"
}

Image 33.jpg

 Image 33

Click on Create once the values are entered. Now click on the right arrow of groundingCIS, or click on the 3 dots and choose View, to open the Service Key (cisSK) that we just created.

Image 34.jpg

Image 34

Here, please ensure to copy the values of “clientid” and “authorization_endpoint” and then I recommend downloading this file.  

Image 35.jpg

 Image 35

7.3 Copy / Edit the Certificate values to support *.crt and *.key values

Once you have downloaded the file cisSK.txt, open it in an editor. In my case, I have used Notepad++. Observe that the file contains 3 certificates, each starting with -----BEGIN CERTIFICATE-----, and one key, which starts with -----BEGIN RSA PRIVATE KEY-----. You will also see a \n (new line) character in multiple places, which needs a small cleanup. You can use Notepad++ or any other editor for this cleanup, or, if you have “sed” available, you can clean it up with sed commands by following the official help guide. If you are using Notepad++, continue below.

Image 36.jpg

 Image 36

a. Let's work on the -----BEGIN CERTIFICATE----- section. Copy the text up to the last -----END CERTIFICATE----- into a new text file; if you look closely, you should see 3 certificates, each with a BEGIN and an END.

Once you have copied the values to the new text file, press CTRL+F and select Replace. Enter “\n” in Find what and “,” (a comma) in Replace with, make sure Wrap Around is checked and Search Mode is set to Normal, and then click on Replace All.

Image 37.jpg

 Image 37

Once you click on Replace All, the editor will replace \n with “,”(comma). Now, click on the Swap Find and Replace, under Search Mode select the option Extended, and then click on Replace All.

Image 38.jpg

 Image 38

The text is now split into separate lines, without the \n and “,” (comma) characters; please review it.

Image 39.jpg

 Image 39

Please save this file as .crt. In my case, I have used “doc-grounding.crt”.

b. The same cleanup process needs to be followed for the private key. Go back to your cisSK.txt file and copy only the section from -----BEGIN RSA PRIVATE KEY----- to -----END RSA PRIVATE KEY-----. Follow the same cleanup activity as before and save this file as .key; in my case, I have used “doc-grounding.key”.

Image 40.jpg

 Image 40

Once the certificate and key files are ready, please move them to one folder. These are the .crt and .key files required to run the cURL commands.
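
If you prefer the command line over Notepad++, the same \n cleanup can be sketched in a few lines of shell. This is only an illustration, not the procedure from the official help guide: the function name unescape_newlines and the input file names cert-raw.txt and key-raw.txt are hypothetical, and it assumes you have already copied the certificate or key text (still containing the literal \n sequences) into those files.

```shell
# Replace every literal two-character sequence \n with a real line break.
# Reads from stdin and writes to stdout.
unescape_newlines() {
  awk '{ gsub(/\\n/, "\n"); print }'
}

# Usage (file names are placeholders):
#   unescape_newlines < cert-raw.txt > doc-grounding.crt
#   unescape_newlines < key-raw.txt  > doc-grounding.key
# Optional sanity check of the first certificate, if openssl is installed:
#   openssl x509 -in doc-grounding.crt -noout -subject
```

I use awk here rather than sed because newline handling in sed replacements differs between implementations.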

7.4 Run a cURL command for POST and GET

We run these cURL commands to get an access token to our Grounding pipelines and set up an authentication. Please use the following syntax:

Note: if you are using a Mac, you can run the commands directly in the terminal; on Windows, you may need Git Bash or another command prompt that supports cURL.

curl \
--request POST \
--url <adjusted_authorization_endpoint> \
--header 'accept: application/json' \
--header 'content-type: application/x-www-form-urlencoded' \
--data 'client_id=<clientid>' \
--data 'grant_type=client_credentials' \
--cert <file_with_certificate> \
--key <file_with_key>

Where:

Placeholder | Description
<adjusted_authorization_endpoint> | The "authorization_endpoint"
<clientid> | The "clientid"
<file_with_certificate> | The doc-grounding.crt file, adjusted with the line breaks
<file_with_key> | The doc-grounding.key file, adjusted with the line breaks

I am using a Windows system with Git Bash to run the cURL commands with the following format:

curl --request POST --url https://aclxxsnax.accounts.ondemand.com/oauth2/token --header 'accept: application/json' --header 'content-type: application/x-www-form-urlencoded' --data 'client_id=452e75c9-ee1a-4964-xxxx-exx92be79690' --data 'grant_type=client_credentials' --cert doc-grounding.crt --key doc-grounding.key

Once you run this command, we receive a Bearer token which expires in 3600 seconds or 1 hr.

The response will be in a similar format as shown below:

{
"access_token":"eyJq........LI-L8KsOQV593dmtPU1g",
"token_type":"Bearer",
"expires_in":3600
}

Image 41.jpg

 Image 41

Please copy the Bearer token as we will need it for our next steps.
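
Instead of copying the token by hand, you could capture it in a shell variable. The sketch below wraps the POST call shown above; the function names extract_json_string and fetch_access_token are hypothetical, and the sed expression is a simple pattern match that assumes the value contains no escaped quotes, not a full JSON parser.

```shell
# Print the first string value of a given field from JSON read on stdin.
# $1 = field name. Assumes the value contains no escaped double quotes.
extract_json_string() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical wrapper around the POST call from this step.
# $1 = adjusted authorization endpoint, $2 = clientid
fetch_access_token() {
  curl --silent --request POST --url "$1" \
    --header 'accept: application/json' \
    --header 'content-type: application/x-www-form-urlencoded' \
    --data "client_id=$2" \
    --data 'grant_type=client_credentials' \
    --cert doc-grounding.crt --key doc-grounding.key |
    extract_json_string access_token
}

# Usage (placeholders as in the syntax above):
#   TOKEN=$(fetch_access_token '<adjusted_authorization_endpoint>' '<clientid>')
```

The token is then available as "$TOKEN" for the Authorization header of the following calls.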

Now, we shall continue with the pipeline with the GET command, to call the document grounding endpoints. Use the syntax:

curl \
--request GET \
--url '<url>/pipeline/api/v1/pipeline' \
--header 'accept: application/json' \
--header 'Authorization: Bearer <access_token>' \
--cert <file_with_certificate> \
--key <file_with_key>

 

Placeholder | Description
<url> | The service key URL value for Document Grounding that you obtained in step 7.1
<access_token> | The bearer token received from the POST command
<file_with_certificate> | The doc-grounding.crt file, adjusted with the line breaks
<file_with_key> | The doc-grounding.key file, adjusted with the line breaks

In my case, I have the following format:

curl --request GET --url 'https://mtls.rage.a5601b3.kyma.ondemand.com/pipeline/api/v1/pipeline' --header 'accept: application/json' --header 'Authorization: Bearer eyJqa3UiO<<<short version of bearer token>>3heagq4mbLyYBg' --cert doc-grounding.crt --key doc-grounding.key

Image 42.jpg

Image 42

Once you run the command, you should receive a response with an empty list [] since the pipeline hasn’t been created yet.
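
Because the bearer token expires after 3600 seconds, later calls can fail with an expired token if you pause between steps. One way to guard against this is to note when the token was issued and check its age before each call. This is a minimal sketch: the function name token_is_fresh and the 60-second safety margin are my own assumptions.

```shell
# Succeeds (exit 0) while the token is still usable.
# $1 = epoch seconds when the token was issued
# $2 = expires_in from the token response (3600 here)
# $3 = current epoch seconds (optional, defaults to now)
token_is_fresh() {
  issued_at="$1"
  expires_in="$2"
  now="${3:-$(date +%s)}"
  safety_margin=60   # assumption: renew one minute before the real expiry
  [ $((issued_at + expires_in - safety_margin)) -gt "$now" ]
}

# Usage:
#   issued_at=$(date +%s)    # record this right after requesting the token
#   token_is_fresh "$issued_at" 3600 || echo "request a new token first"
```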

8. Create a Destination in the SAP BTP Subaccount

We need to create a destination in the subaccount to create access and enable connectivity to Microsoft SharePoint using the APIs and technical users that we created.

Within your SAP BTP Subaccount, expand Connectivity -> click on Destinations -> click on Create Destination and enter the following details:

Field | Value
Name | <NAME_OF_DESTINATION>
Type | HTTP
URL | https://graph.microsoft.com
Proxy Type | Internet
Authentication | OAuth2Password
User | The technical user that you've created in Microsoft Entra ID
Password | The password that you've created in Microsoft Entra ID
Client ID | Microsoft Entra ID Application credentials: the Application (client) ID of the registered application
Client Secret | Microsoft Entra ID Application credentials: the Value of the client secret
Token Service URL | https://login.microsoftonline.com/<TENANT_ID>/oauth2/v2.0/token, where <TENANT_ID> is the Tenant ID that you copied from the Microsoft Entra Overview page

The details should look as shown below. Once you have entered them, click on Save.

Image 43.jpg

 Image 43

You may click on Check Connection for a quick test, and if all the setup is fine we see the result below.

Image 44.jpg

 Image 44

9. Set up Content Ingestion

The final step is to set up content ingestion, which pushes the SharePoint documents to the Document Grounding pipelines so that they can be used to ground the answers to end-user queries in Joule.

Use the following cURL command. If you get a 401 (Unauthorized), the bearer token has probably expired; rerun the POST command from step 7.4 to get a new token, and then run the following command with the new bearer token.

Syntax:

curl \
--request POST \
--url '<url>/pipeline/api/v1/pipeline' \
--header 'Authorization: Bearer <access_token>' \
--header 'content-type: application/json' \
--data '{"type": "MSSharePoint","configuration": {"destination": "<NAME_OF_DESTINATION>","sharePoint": {"site": {"name": "<NAME_OF_SHAREPOINT_SITE>"}}}}' \
--cert <file_with_certificate> \
--key <file_with_key>
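
The JSON body is easy to mistype on a single line, so the same call can also be wrapped in two small shell functions. This is a sketch under assumptions: the function names pipeline_payload and create_pipeline are hypothetical, and the destination and site names must not contain double quotes or backslashes, since no JSON escaping is done.

```shell
# Build the JSON body for the pipeline request.
# $1 = destination name, $2 = SharePoint site name (no quotes or backslashes)
pipeline_payload() {
  printf '{"type": "MSSharePoint","configuration": {"destination": "%s","sharePoint": {"site": {"name": "%s"}}}}\n' "$1" "$2"
}

# Hypothetical wrapper around the POST call from the syntax above.
# $1 = service key URL, $2 = access token, $3 = destination, $4 = site name
create_pipeline() {
  curl --request POST --url "$1/pipeline/api/v1/pipeline" \
    --header "Authorization: Bearer $2" \
    --header 'content-type: application/json' \
    --data "$(pipeline_payload "$3" "$4")" \
    --cert doc-grounding.crt --key doc-grounding.key
}

# Usage (placeholders as in the syntax above):
#   create_pipeline '<url>' '<access_token>' '<NAME_OF_DESTINATION>' '<NAME_OF_SHAREPOINT_SITE>'
```

Running pipeline_payload on its own prints the body, which makes it easy to review before sending it.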

I have used the following command:

curl --request POST --url 'https://mtls.rage.a5601b3.kyma.ondemand.com/pipeline/api/v1/pipeline' --header 'Authorization: Bearer eyJqa3UiOiJod<<<Shortversion>>>xjoI4npWdgq4mbLyYBg' --header 'content-type: application/json' --data '{"type": "MSSharePoint","configuration": {"destination": "joule-sfsf-data-pipeline","sharePoint": {"site": {"name": "SFSFRAGeDocumentGrounding"}}}}' --cert doc-grounding.crt --key doc-grounding.key

Once you run the command successfully, you should be able to see the response as shown in the image.

Response: {    "pipelineId":"4cfd0478-29ea-45c2-bc40-d3817621744e" }

Image 45.jpg

 Image 45

This confirms that the setup has been completed, and we are now ready to query the Joule service with the documents that are uploaded in SharePoint.

10. Joule Testing

Once you complete your setup, please note that only the initial load of documents uploaded to SharePoint is read immediately. Documents that are added later have to wait for the standard scheduled refresh of the data pipelines in the SAP system before they are available for FAQ support.

Here are some of the initial test results working with SAP SuccessFactors:

  •  Looking for Co-Pay information on Medical Insurance:

Image 46.jpg

Image 46

  • Checking eligibility for any medical surgery:

Image 47.jpg

 Image 47

  • Checking escalation process

Image 48.jpg

 Image 48

  • Verify the data from the uploaded Word File.

Image 49.jpg

 Image 49

Happy Learning!!! 

For support issues on the setup process, please create a ticket using the component CA-ML-RAGE.

If you need any support during the setup you can reach us at SAP_BTP_Onboarding@sap.com.

Special thanks to the entire Document Grounding Team and AI Team for contributing to this blog. 

Regards,

Nagesh
