
Vector Engines Overview

MadhavSharma
Explorer

Well, this is not a question, just my general understanding of the vector engine and my personal notes on it.

Vector Engine

SAP HANA Cloud was already a multi-model database. With the release of the vector engine in SAP HANA Cloud, we can now also store data in the form of vector embeddings.

The database is still multi-model; the vector engine is simply one more addition to it.

MadhavSharma_6-1717135584449.png

 

So, SAP already supported models such as spatial data, graph data, and JSON documents. Now we also have vector embeddings as a type of data, managed by the vector engine.

SAP HANA Cloud does not only store these embeddings; it also performs analytical computations on them, which enables fast queries and lets us provide contextual data to LLMs.

MadhavSharma_7-1717135584461.png

 

We have a new data type to store these types of data namely -> REAL_VECTOR.

A constructor -> TO_REAL_VECTOR.

And two new distance calculating functions -> L2DISTANCE(), COSINE_SIMILARITY().

L2DISTANCE() calculates the Euclidean distance between the two vectors. A smaller distance means the vectors are closer.

COSINE_SIMILARITY() is based on the angle between the two vectors: the smaller the angle, the greater the similarity. The value ranges from -1 to 1, where 1 means the vectors point in the same direction.

These calculations are important because they quantify the relation between two vectors and how closely they match each other.
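To make the two measures concrete, here is a minimal sketch in plain Python. It only illustrates the math behind the two functions. It is not how HANA implements them, and the helper names `l2_distance` and `cosine_similarity` are my own.

```python
import math

def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, a value in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, different length
c = [0.0, 1.0]  # perpendicular to a

print(l2_distance(a, b))        # 1.0
print(cosine_similarity(a, b))  # 1.0
print(cosine_similarity(a, c))  # 0.0
```

Note that `a` and `b` have a non-zero L2 distance but a cosine similarity of 1: cosine similarity ignores vector length and looks only at direction.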

 

We can query the vectors directly from our application layer using normal SQL syntax. Beyond querying, we can perform in-memory computations on them and use them for Retrieval Augmented Generation (RAG), to supply contextual data to the LLMs we use.
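As a rough sketch of what this could look like from a Python application layer, the snippet below only builds the SQL strings. The table `DOC_CHUNKS`, its columns, and the 3-dimensional vector size are hypothetical names I made up for illustration. Actually running the statements would need a HANA client driver (for example `hdbcli`) and a real connection, which is not shown here.

```python
def to_vector_literal(values):
    """Format numbers as the '[x,y,z]' text form passed to TO_REAL_VECTOR."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# Hypothetical table holding text chunks and their embeddings.
CREATE_TABLE_SQL = """
CREATE TABLE DOC_CHUNKS (
    ID INTEGER PRIMARY KEY,
    CHUNK_TEXT NVARCHAR(5000),
    EMBEDDING REAL_VECTOR(3)
)
"""

# The embedding is bound as a string parameter and converted in the database.
INSERT_SQL = "INSERT INTO DOC_CHUNKS VALUES (?, ?, TO_REAL_VECTOR(?))"

def top_k_query(query_vector, k):
    """Build a query returning the k chunks most similar to query_vector."""
    literal = to_vector_literal(query_vector)  # contains only numbers
    return (
        f"SELECT TOP {int(k)} ID, CHUNK_TEXT, "
        f"COSINE_SIMILARITY(EMBEDDING, TO_REAL_VECTOR('{literal}')) AS SCORE "
        "FROM DOC_CHUNKS ORDER BY SCORE DESC"
    )

print(top_k_query([0.1, 0.2, 0.3], 3))
```

Ordering by `COSINE_SIMILARITY` descending puts the best matches first. With `L2DISTANCE` the order would be ascending instead, since a smaller distance means a closer match.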

 

What are vector embeddings?

These are mathematical representations of objects in a multidimensional vector space. They are generated by AI embedding models, which work well with unstructured content such as the text of PDF or Word documents.

This converts the unstructured data into a format that can be processed quickly.

Now that we have vector embeddings, we can find matching records using the distance in the vector space. Related documents end up close to each other, so the distance between them is small.

 

MadhavSharma_8-1717135584492.png

 

 

From this figure we can understand the whole use case.

We have uploaded some PDF files to the cloud database. The text is split into chunks, each chunk is converted into a vector embedding by an embedding model, and the embeddings are stored in the vector engine in the HANA DB. This data is now accessible throughout the DB.
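The chunking step can be sketched as below. This is one simple approach (fixed-size character windows with overlap) that I am assuming for illustration; the figure does not say which chunking strategy is used, and the function name and default sizes are my own.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a chunk boundary still appears whole in one of the chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk would then be sent to an embedding model, and the resulting vector stored next to the chunk text.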

A user comes and queries for the word “DATABASE”. The query string is itself converted to a vector, and a semantic search compares that vector with the vectors stored in the DB. We might not have the exact word in our DB, so instead of an exact match the search looks for the closest stored entry according to the distance measure. Therefore ‘FISH’ is not considered, and the string ‘TABLE’ is returned. This result is then passed to the LLM as context for generating the answer.
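Here is a toy version of that lookup. The three-dimensional vectors are hand-picked by me so that the example works; they are not real embeddings, which would come from an embedding model and have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors standing in for the stored embeddings.
stored = {
    "TABLE": [0.9, 0.8, 0.1],
    "FISH": [0.1, 0.2, 0.9],
}

# Made-up vector standing in for the embedding of the query "DATABASE".
query_vector = [0.8, 0.9, 0.2]

def nearest(query, items):
    """Return the key whose vector is most similar to the query vector."""
    return max(items, key=lambda name: cosine_similarity(query, items[name]))

print(nearest(query_vector, stored))  # TABLE
```

The word "DATABASE" is never stored, but its vector points in nearly the same direction as the vector for "TABLE", so that entry wins.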

Hence we have used the vector engine to send contextual data from our own DB to the LLM, so that it gives accurate results with fewer hallucinations. This is the concept of Retrieval Augmented Generation (RAG), which supplements the dataset used to train the LLM with an external data source.
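The last step, handing the retrieved data to the LLM, is mostly a matter of building the prompt. A minimal sketch follows, assuming the retrieved chunks are already available as strings. The prompt wording and the function name are my own, and the actual LLM call is left out because it depends on the provider.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Combine the retrieved chunks and the user question into one prompt."""
    context = "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is not sufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt(
    "Where is the data stored?",
    ["The data is stored in a TABLE.", "Tables live in a schema."],
)
print(prompt)
```

Telling the model to stay within the supplied context is what ties the answer back to our own data rather than to whatever the model remembers from training.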

Accepted Solutions (0)

Answers (1)

Kangkana
Product and Topic Expert

Please post articles like this in the Blog Posts section.
Otherwise this thread is counted in the unanswered-questions reporting.