Naresh_Abaka

Here’s a concise walkthrough for setting up and running a benchmarking evaluation using the Generative AI Hub service in SAP BTP:

1. Assign Role Collection for Generative AI Hub Access

  • To access the evaluation service under Generative AI Hub, assign the role collection ailaunchpad_genai_manager to the relevant user in your BTP subaccount.

(Screenshot: Role Collection in BTP)

2. Set Up Object Store for Evaluation

  • For the benchmarking exercise, you need an object store.
  • In this example, we use AWS S3 as the object store type.


3. Create Object Store Secrets

  • Define two object store secrets in AICore:
    • One for input files (e.g., test datasets, orchestration config); this secret can be named according to your use case.
    • One for evaluation results; this secret must be named default.
    • The input secret maps to an S3 path such as
      s3://hcp-5e9a9c35-3dd8-493a-bb55-7980c01fd279/td4
      and can be accessed using the alias ai://err-res-genai-data.
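For reference, the same secrets can also be created programmatically through the SAP AI Core administration API (POST /v2/admin/objectStoreSecrets). The Python sketch below is a minimal, non-authoritative example: the service URL, region, endpoint, and the results secret's path prefix are assumptions for illustration, and the token must come from your AI Core service key's OAuth client.

import requests

# Assumptions: replace with the values from your AI Core service key.
AI_CORE_URL = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"
TOKEN = "<oauth-access-token>"
HEADERS = {"Authorization": f"Bearer {TOKEN}", "AI-Resource-Group": "default"}

def create_object_store_secret(name, path_prefix):
    # One call per secret; 'type: S3' matches the object store chosen in Step 2.
    body = {
        "name": name,
        "type": "S3",
        "bucket": "hcp-5e9a9c35-3dd8-493a-bb55-7980c01fd279",
        "endpoint": "s3-eu-central-1.amazonaws.com",  # assumption: your bucket's region endpoint
        "region": "eu-central-1",                     # assumption
        "pathPrefix": path_prefix,
        "data": {
            "AWS_ACCESS_KEY_ID": "<key-id>",
            "AWS_SECRET_ACCESS_KEY": "<secret-key>",
        },
    }
    resp = requests.post(f"{AI_CORE_URL}/v2/admin/objectStoreSecrets",
                         headers=HEADERS, json=body)
    resp.raise_for_status()

create_object_store_secret("err-res-genai-data", "td4")  # input files
create_object_store_secret("default", "eval-results")    # results; prefix here is a hypothetical name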

4. Connect Local System to AWS S3 Object Store

  • Configure your local system with the bucket's credentials (for example, via the AWS CLI or environment variables) so files can be uploaded to it; a sketch follows.
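As a minimal sketch of that connection, assuming boto3 is installed (the bucket name is the one from Step 3; the credentials and region are placeholders):

import boto3

# Placeholder credentials; in practice, take them from the BTP Object Store
# service key rather than hard-coding them.
s3 = boto3.client(
    "s3",
    aws_access_key_id="<AWS_ACCESS_KEY_ID>",
    aws_secret_access_key="<AWS_SECRET_ACCESS_KEY>",
    region_name="eu-central-1",  # assumption: your bucket's region
)

# Connectivity check against the bucket behind the object store secret.
s3.head_bucket(Bucket="hcp-5e9a9c35-3dd8-493a-bb55-7980c01fd279")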

5. Upload Datasets to Object Store

  • Refer to Step 10 of the same tutorial to upload files such as test data, orchestration configs, and custom metrics from your local system to the configured AWS S3 bucket.

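A hedged boto3 sketch of the upload: the file names are the ones discussed in the comments on this post, and mapping them to eval-data/testdata and eval-data/runs (under the secret's td4 prefix) is an assumption based on the folder layout from Step 6.

import boto3

s3 = boto3.client("s3")  # assumes credentials are configured as in Step 4
BUCKET = "hcp-5e9a9c35-3dd8-493a-bb55-7980c01fd279"

# The object store secret's pathPrefix (td4) is prepended manually here,
# since boto3 talks to the bucket directly, not through the ai:// alias.
s3.upload_file("errres_dataset1.csv", BUCKET, "td4/eval-data/testdata/errres_dataset1.csv")
s3.upload_file("errres_run1.json", BUCKET, "td4/eval-data/runs/errres_run1.json")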

6. Register Artifacts in AICore

  • Register all relevant artifacts (test datasets, orchestration config, custom metric files) in AICore.

  • Use the root directory name eval-data to register the artifacts; the sub-directories and their files are registered automatically.

  • Navigate to the Other Artifacts section in the AI Launchpad dashboard and use "Add" to register the path.


Example path:
ai://err-res-genai-data/eval-data

  • ai://err-res-genai-data – refers to your input object store secret (the alias defined in Step 3).

  • eval-data – Root directory where you will create subfolders like /runs and /testdata.

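The same registration can be done via the AI API (POST /v2/lm/artifacts). In this sketch, the service URL, scenario ID, artifact kind, and description are assumptions; the url is exactly the alias-plus-root path above.

import requests

AI_CORE_URL = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"  # assumption
HEADERS = {"Authorization": "Bearer <oauth-access-token>",
           "AI-Resource-Group": "default"}

resp = requests.post(
    f"{AI_CORE_URL}/v2/lm/artifacts",
    headers=HEADERS,
    json={
        "name": "eval-data",
        "kind": "dataset",                   # assumption: choose the kind that fits
        "url": "ai://err-res-genai-data/eval-data",
        "scenarioId": "<your-scenario-id>",  # assumption: look this up in your tenant
        "description": "Benchmarking inputs: test data, run configs, custom metrics",
    },
)
resp.raise_for_status()
print(resp.json()["id"])  # artifact ID, referenced when creating the evaluation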

7. Create & Monitor Evaluation Run

  • After registering the artifacts, create your Evaluation via the AICore interface.

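Under the hood, an evaluation runs as an AI Core execution, so the UI steps above can in principle be mirrored with the generic AI API calls below. This is only a sketch: the executable ID, scenario ID, and input binding key for the evaluation service are assumptions you would need to look up in your tenant.

import requests

AI_CORE_URL = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"  # assumption
HEADERS = {"Authorization": "Bearer <oauth-access-token>",
           "AI-Resource-Group": "default"}

# 1. Create a configuration that binds the artifact registered in Step 6.
config = requests.post(
    f"{AI_CORE_URL}/v2/lm/configurations",
    headers=HEADERS,
    json={
        "name": "errres-benchmark-config",
        "executableId": "<evaluation-executable-id>",  # assumption
        "scenarioId": "<evaluation-scenario-id>",      # assumption
        "inputArtifactBindings": [
            {"key": "<input-key>", "artifactId": "<artifact-id-from-step-6>"},
        ],
    },
)
config.raise_for_status()

# 2. Start an execution from that configuration.
execution = requests.post(
    f"{AI_CORE_URL}/v2/lm/executions",
    headers=HEADERS,
    json={"configurationId": config.json()["id"]},
)
execution.raise_for_status()
print(execution.json()["id"])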

  • Monitor the execution workflow under the Executions tab to track progress and results.

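The same status information is available programmatically via GET /v2/lm/executions/{id}; a small sketch (URL and IDs are placeholders):

import requests

AI_CORE_URL = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"  # assumption
HEADERS = {"Authorization": "Bearer <oauth-access-token>",
           "AI-Resource-Group": "default"}

resp = requests.get(f"{AI_CORE_URL}/v2/lm/executions/<execution-id>", headers=HEADERS)
resp.raise_for_status()
print(resp.json()["status"])  # e.g. PENDING, RUNNING, COMPLETED, DEAD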

  • The benchmarking metrics selected during evaluation creation are computed from the input artifacts and presented in the results view.

 

 

3 Comments
Jannis94

Hi @Naresh_Abaka, thanks for the tutorial.

Can you provide any samples of the contents/structure of errres_run1.json and errres_dataset1.csv? I assume these have to follow certain patterns, but I can't find official docs on that.

Also, what kind of framework is ultimately used to evaluate: is it a proprietary SAP framework or an open-source framework under the hood?

Thanks a lot in advance.

Naresh_Abaka

Jannis94

@Naresh_Abaka thanks for the response, but can I access these repos as a non-SAP employee / are these public repos?