Monday - last edited Monday
Welcome to Week 3! We're halfway through the challenge and the pipeline is taking shape. You can produce events, get them to a broker, and consume them from code or an integration platform. Now we get to the part that ties this challenge to AI: vectorization.
Links:
- May's developer challenge blog post: https://community.sap.com/t5/integration-blog-posts/may-2026-developer-challenge-from-events-to-inte...
- Week 1: Getting familiar with the events
- Week 2: Connecting to the broker and consuming events
- Week 3: Vectorizing the event payload
This week, we take the data field from the Business Partner event payload and convert it into a vector embedding — a numerical representation of the content that captures its semantic meaning. This is the step that will later allow us to do similarity searches and power a RAG application.
An embedding is a list of floating-point numbers — a vector — that represents the meaning of a piece of text in a high-dimensional space. Text with similar meaning ends up close together in that space. This is what makes semantic search possible: instead of matching exact keywords, you match meaning.
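To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three vectors are made up for illustration and have only 3 dimensions; real embeddings come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, NOT produced by any real embedding model.
customer_lisbon = [0.9, 0.1, 0.2]
customer_porto = [0.8, 0.2, 0.3]
invoice_overdue = [0.1, 0.9, 0.4]

# The two "customer" vectors point in a similar direction, so they score higher.
print(cosine_similarity(customer_lisbon, customer_porto))
print(cosine_similarity(customer_lisbon, invoice_overdue))
```

A semantic search is essentially this comparison run between a query vector and every stored vector, keeping the highest scores.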
To generate embeddings, you need an embedding model. You pass in a piece of text, and it returns a vector. For our purposes, we'll be embedding the content of the data field of our Business Partner events — typically after converting the JSON object to a string or extracting the most relevant fields.
Embedding models in SAP AI Core
For example, from this event payload:
{
  "BusinessPartner": "1003783",
  "BusinessPartnerUUID": "456872b9-b9a2-4b93-894d-dff37abd3070",
  "BusinessPartnerFullName": "Daniela-Anita Macedo",
  "BusinessPartnerCategory": "1",
  "BusinessPartnerGrouping": "BP02",
  "FirstName": "Daniela-Anita",
  "LastName": "Macedo",
  "IsNaturalPerson": "X",
  "CreationDate": "/Date(1518393600000)/",
  "CreatedByUser": "CC0000000002",
  "BusinessPartnerAddress": {
    "Country": "PT",
    "Region": "",
    "CityName": "Quarteira",
    "PostalCode": "1385-831",
    "StreetName": "Travessa de Sousa, 6",
    "HouseNumber": "681",
    "AddressTimeZone": "WEST"
  }
}
You might produce a string like:
BusinessPartner: 1003783. Name: Daniela-Anita Macedo. Category: 1. CityName: Quarteira.
And that string is what you send to the embedding model.
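One possible way to build that string in Python is sketched below. The function name business_partner_to_text and the choice of four fields are illustrative assumptions; pick whichever fields carry the most meaning for your use case. The sample dictionary is a trimmed copy of the payload above.

```python
def business_partner_to_text(data):
    """Flatten selected Business Partner fields into one 'Label: value.' string."""
    address = data.get("BusinessPartnerAddress") or {}
    fields = [
        ("BusinessPartner", data.get("BusinessPartner")),
        ("Name", data.get("BusinessPartnerFullName")),
        ("Category", data.get("BusinessPartnerCategory")),
        ("CityName", address.get("CityName")),
    ]
    # Skip missing or empty values so the text contains no "Label: None." noise.
    return " ".join(f"{label}: {value}." for label, value in fields if value)


sample = {
    "BusinessPartner": "1003783",
    "BusinessPartnerFullName": "Daniela-Anita Macedo",
    "BusinessPartnerCategory": "1",
    "BusinessPartnerAddress": {"Country": "PT", "Region": "", "CityName": "Quarteira"},
}

print(business_partner_to_text(sample))
```

Selecting fields by hand keeps identifiers such as the UUID out of the text, since they add little semantic signal.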
👉 Extend your consumer from Week 2 so that, after receiving a Business Partner event, it generates a vector embedding of the event's data field.
Steps:
- Extract the data field and prepare it as a string
Embedding model options:
- text-embedding-3-small_autogenerated or similar are available depending on your setup
- If you want to run everything locally without any API keys, HuggingFace Sentence Transformers is an excellent option. A model like all-MiniLM-L6-v2 is small, fast, and produces 384-dimensional embeddings that are more than sufficient for this challenge.
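Whichever model option you pick, one way to keep the consumer independent of it is to pass the embedding function in as a parameter. The sketch below assumes the event arrives as a dictionary with an "id" and a "data" key (CloudEvents-style); vectorize_event and fake_embed are hypothetical names, and fake_embed is only a placeholder so the example runs without a model or API key.

```python
import json
from typing import Callable, Sequence

# Any function that turns a string into a sequence of numbers can be plugged in.
Embedder = Callable[[str], Sequence[float]]


def vectorize_event(event: dict, embed: Embedder) -> dict:
    """Extract the event's data field, serialize it, and embed the resulting text."""
    data = event.get("data")
    if data is None:
        raise ValueError("event has no 'data' field")
    # Simplest preparation: serialize the whole JSON object to a string.
    text = json.dumps(data, sort_keys=True, ensure_ascii=False)
    vector = [float(x) for x in embed(text)]
    return {
        "id": event.get("id"),
        "text": text,
        "vector": vector,
        "dimensions": len(vector),
    }


def fake_embed(text: str) -> list:
    """Placeholder only: NOT a real embedding, it just returns three numbers."""
    return [float(len(text)), float(text.count("a")), 0.0]


event = {"id": "evt-1", "data": {"BusinessPartner": "1003783"}}
record = vectorize_event(event, fake_embed)
print(record["dimensions"], record["text"])
```

With Sentence Transformers installed, the placeholder could be swapped for SentenceTransformer("all-MiniLM-L6-v2").encode, which should return a 384-dimensional vector for a single string. A client for SAP AI Core or another hosted model would plug in the same way.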
Add a comment in this discussion with:
SAP solution note — I will share how I solved this using SAP AI Core (Generative AI Hub) in the comments below
Some food for thought:
yesterday
Week 3 Submission
(In Week 1, we got the events routing into Solace from SAP, and in Week 2, we successfully consumed them.)
This week (Week 3), I enhanced the Python consumer to vectorize those events with a locally running embedding model (Ollama, nomic-embed-text) and push them to the HANA Cloud vector DB.
Python Output
Vectorized Data on HANA Cloud
I also went ahead and consumed the vectorized data in Claude using MCP.
2 hours ago
Hello,
Thanks for the challenge.
As suggested, I tried doing it locally using the SAP BAS trial, but storage was an issue, so I'm doing it with the Cohere API, which was available to use for free.
Below are the steps performed.
1. Received the event
2. Converted the JSON into a string
3. Sent it to an embedding model (Cohere API)
4. Got back the vector and logged it to the data store
an hour ago
Week 3 with SAP AI Core:
Step 1:
Step 2: