Monday - last edited Monday
Welcome to Week 3! We're halfway through the challenge and the pipeline is taking shape. You can produce events, get them to a broker, and consume them from code or an integration platform. Now we get to the part that ties this challenge to AI: vectorization.
Links:
- May's developer challenge blog post: https://community.sap.com/t5/integration-blog-posts/may-2026-developer-challenge-from-events-to-inte...
- Week 1: Getting familiar with the events
- Week 2: Connecting to the broker and consuming events
- Week 3: Vectorizing the event payload
This week, we take the data field from the Business Partner event payload and convert it into a vector embedding — a numerical representation of the content that captures its semantic meaning. This is the step that will later allow us to do similarity searches and power a RAG application.
An embedding is a list of floating-point numbers — a vector — that represents the meaning of a piece of text in a high-dimensional space. Text with similar meaning ends up close together in that space. This is what makes semantic search possible: instead of matching exact keywords, you match meaning.
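To make "close together" concrete, here is a minimal sketch of cosine similarity, one common way to compare two embeddings. The three vectors are made up for illustration and have only three dimensions; real embeddings come from a model and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, NOT produced by any real model.
customer_in_lisbon = [0.9, 0.1, 0.2]
client_in_porto = [0.8, 0.2, 0.3]
invoice_overdue = [0.1, 0.9, 0.1]

# The first pair points in nearly the same direction, the second does not.
print(cosine_similarity(customer_in_lisbon, client_in_porto))
print(cosine_similarity(customer_in_lisbon, invoice_overdue))
```

A score near 1 means the vectors point in almost the same direction; a score near 0 means they are unrelated.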
To generate embeddings, you need an embedding model. You pass in a piece of text, and it returns a vector. For our purposes, we'll be embedding the content of the data field of our Business Partner events — typically after converting the JSON object to a string or extracting the most relevant fields.
Embedding models in SAP AI Core
For example, from this event payload:
{
  "BusinessPartner": "1003783",
  "BusinessPartnerUUID": "456872b9-b9a2-4b93-894d-dff37abd3070",
  "BusinessPartnerFullName": "Daniela-Anita Macedo",
  "BusinessPartnerCategory": "1",
  "BusinessPartnerGrouping": "BP02",
  "FirstName": "Daniela-Anita",
  "LastName": "Macedo",
  "IsNaturalPerson": "X",
  "CreationDate": "/Date(1518393600000)/",
  "CreatedByUser": "CC0000000002",
  "BusinessPartnerAddress": {
    "Country": "PT",
    "Region": "",
    "CityName": "Quarteira",
    "PostalCode": "1385-831",
    "StreetName": "Travessa de Sousa, 6",
    "HouseNumber": "681",
    "AddressTimeZone": "WEST"
  }
}
You might produce a string like:
BusinessPartner: 1003783. Name: Daniela-Anita Macedo. Category: 1. CityName: Quarteira.
And that string is what you send to the embedding model.
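One way to build that string is sketched below. The function name business_partner_to_text and the choice of four fields are assumptions made for illustration; pick whichever fields matter most for your searches.

```python
def business_partner_to_text(data):
    """Build a short 'Label: value.' string from selected payload fields."""
    address = data.get("BusinessPartnerAddress") or {}
    fields = [
        ("BusinessPartner", data.get("BusinessPartner")),
        ("Name", data.get("BusinessPartnerFullName")),
        ("Category", data.get("BusinessPartnerCategory")),
        ("CityName", address.get("CityName")),
    ]
    # Skip missing or empty values so the text carries no blank labels.
    return " ".join(f"{label}: {value}." for label, value in fields if value)

# Trimmed copy of the example payload above.
data = {
    "BusinessPartner": "1003783",
    "BusinessPartnerFullName": "Daniela-Anita Macedo",
    "BusinessPartnerCategory": "1",
    "BusinessPartnerAddress": {"Country": "PT", "Region": "", "CityName": "Quarteira"},
}

print(business_partner_to_text(data))
# BusinessPartner: 1003783. Name: Daniela-Anita Macedo. Category: 1. CityName: Quarteira.
```

Dropping empty values (such as Region in the example) keeps noise out of the text the model sees.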
👉 Extend your consumer from Week 2 so that, after receiving a Business Partner event, it generates a vector embedding of the event's data field.
Steps:
- Extract the data field and prepare it as a string
Embedding model options:
- text-embedding-3-small_autogenerated or similar are available depending on your setup
If you want to run everything locally without any API keys, HuggingFace Sentence Transformers is an excellent option. A model like all-MiniLM-L6-v2 is small, fast, and produces 384-dimensional embeddings that are more than sufficient for this challenge.
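A minimal sketch of the local route follows, assuming the sentence-transformers package is installed and that the consumed event is a dict with a "data" field and an optional "id" field. The helper names load_local_encoder and embed_event are hypothetical, and the encoder is passed in as a plain function so you can swap in another model without touching the rest of the consumer.

```python
import json

def load_local_encoder(model_name="all-MiniLM-L6-v2"):
    """Return a text -> list[float] function backed by Sentence Transformers.

    Assumes `pip install sentence-transformers`; the model is downloaded
    the first time it is used.
    """
    from sentence_transformers import SentenceTransformer  # imported lazily
    model = SentenceTransformer(model_name)
    return lambda text: model.encode(text).tolist()

def embed_event(event, encoder):
    """Extract the event's data field, turn it into a string, and embed it."""
    data = event["data"]
    # Simplest option: serialize the whole JSON object. Extracting only
    # the most relevant fields is an equally valid choice.
    text = json.dumps(data, sort_keys=True, ensure_ascii=False)
    return {"id": event.get("id"), "text": text, "embedding": encoder(text)}

# Usage inside the Week 2 consumer callback (not executed here):
# encoder = load_local_encoder()
# record = embed_event(event, encoder)
# print(len(record["embedding"]))
```

Load the encoder once at startup rather than per event, since loading the model is much slower than encoding a short string.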
Add a comment in this discussion with:
SAP solution note — I will share how I solved this using SAP AI Core (Generative AI Hub) in the comments below
Some food for thought:
yesterday
Week 3 Submission
(In Week 1, we got the events routing from SAP into Solace, and in Week 2, we successfully consumed them.)
This week (Week 3), I enhanced the Python consumer to vectorize those events with a locally running embedding model (Ollama, nomic-embed-text) and push them to the HANA Cloud vector DB.
Python Output
Vectorized Data on HANA Cloud
I also went ahead and consumed the vectorized data in Claude using MCP.