Technology Blog Posts by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
MartinKolb
Product and Topic Expert

The popular LangChain framework makes it easy to build powerful AI applications. For many of these scenarios, it is essential to use a high-performance vector store.
With HANA Vector Engine, the enterprise-grade HANA database, which is known for its outstanding performance, enters the field of vector stores. With the LangChain integration for HANA Vector Engine, it is now easier than ever to build highly scalable AI applications.
This blog will guide you through six easy steps that show how to build a chat-based application using RAG (Retrieval Augmented Generation) techniques together with HANA Vector Engine in LangChain.


Building a Demo Application

The demo application lets end users ask questions about technical information that is spread across many pages of a website. As an example, we use the documentation of SAP's Cloud Application Programming Model (CAP, https://cap.cloud.sap).

Step 1: Loading the Content of the Website for Further Processing

LangChain provides a very convenient way to load the content of a website into LangChain “Document” objects. The main prerequisite is that the website has a sitemap defined. Most professional websites do have one, and so does the CAP documentation. Thus, loading all pages of the website is just a matter of one line, using LangChain’s “SitemapLoader”:

# Load every page listed in the sitemap into LangChain Document objects
from langchain_community.document_loaders import SitemapLoader

documents = SitemapLoader("https://cap.cloud.sap/docs/sitemap.xml").load()


Step 2: Splitting the Loaded Documents

Search results via vector embeddings tend to degrade if the embedded text is too large. How to determine the ideal sizes is beyond the scope of this blog post. In general, it is a best practice to split the texts into small parts (e.g. 2000 characters). Again, with LangChain, this is a matter of calling “split_documents” on an instance of a “TextSplitter”. In our example we use the “RecursiveCharacterTextSplitter”:

# Split the documents into chunks of at most 2000 characters each
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
splits = text_splitter.split_documents(documents)


Step 3: Creating Embedding Vectors and Loading Them into HANA

For each of the split document parts, a vector embedding is created and then inserted into HANA, along with the text. The vectors are not created by HANA itself; instead, we pass a reference to an embedding model, which is called to create the vectors. In our example we use an embedding model from OpenAI via the LangChain interface of SAP's Generative AI Hub. This process is triggered by calling the standard vector-store interface “from_documents”, which is also available for HANA:

# The embedding model creates the vectors; HanaDB stores them in HANA,
# together with the document texts, in the given table
from langchain_community.vectorstores.hanavector import HanaDB
import gen_ai_hub.proxy.langchain

vectordb = HanaDB.from_documents(
    connection=connection,
    documents=splits,
    embedding=gen_ai_hub.proxy.langchain.OpenAIEmbeddings(),
    table_name="CAP_EMBEDDINGS"
)

The “connection” parameter is a connection to a HANA instance, created with the standard HANA client library “hdbcli”. The “table_name” refers to the relational table that is used to store the vectors and the texts. Other vector stores use the term “collection” for storing vector data. As HANA is a relational database, it stores vector data in a table, where one of the columns holds the embedding vectors.
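For reference, such a connection could be created as follows. This is a minimal sketch; host, user, and password are placeholders for your own HANA Cloud instance:

from hdbcli import dbapi

# Placeholder credentials; replace with your HANA Cloud instance details
connection = dbapi.connect(
    address="<hana-cloud-host>",
    port=443,
    user="<user>",
    password="<password>",
    autocommit=True
)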
The call to “from_documents” first creates embedding vectors for all split document parts by calling the embedding model. Then the data is inserted into HANA. Depending on the amount of text, such a call may take several minutes, but almost all of that time is spent in the embedding model. Storing the vectors in HANA is a very fast operation.

Step 4: Defining a Prompt for a Large Language Model

A prompt definition is simply a text with some “variables”. LangChain fills in values for these variables and passes the resulting text to the LLM for processing:

import langchain.prompts

prompt_template = '''
You are an expert of the SAP Cloud Programming model. You are provided multiple context items that are related to the question to answer.
Use the following pieces of context to answer the question at the end.
```
{context}
```
Question: {question}
'''
PROMPT = langchain.prompts.PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

The variable “context” will contain the best-matching text parts from the vector search based on the given question. And obviously, the variable “question” will contain the question to be answered.
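To illustrate, filling the template manually with the standard “format” call of “PromptTemplate” produces the final prompt text; the values below are just examples:

filled_prompt = PROMPT.format(
    context="<best-matching text chunks from the vector search>",
    question="How can I use CORS in CAP?"
)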

Step 5: Create a RetrievalChain using an LLM

This step puts together all the pieces that were created in the previous steps. We use GPT-4 as the LLM, which is easily accessible via SAP Generative AI Hub's LangChain integration. The main magic is done in LangChain’s implementation of “ConversationalRetrievalChain”:

from langchain.chains import ConversationalRetrievalChain

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=gen_ai_hub.proxy.langchain.ChatOpenAI(proxy_model_name="gpt-4"),
    retriever=vectordb.as_retriever(search_kwargs={'k': 20}),
    combine_docs_chain_kwargs={'prompt': PROMPT}
)

In addition to the LLM, the chain uses the “vectordb” instance that was created in step 3. By calling “as_retriever”, we instruct LangChain to retrieve the 20 (parameter “k”) most similar document splits when performing a similarity search with the entered question against the split documents. The texts found via the similarity search are then added as “context” to the prompt.
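Under the hood, the retriever performs a plain similarity search on the vector store. The equivalent direct call, shown here only for illustration, looks like this:

# Retrieve the 20 most similar document chunks for a query
docs = vectordb.similarity_search("How can I use CORS in CAP?", k=20)
for doc in docs:
    print(doc.page_content[:100])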

Step 6: Ask a Question About the Content in the Vectors

Finally, we can ask questions (typically entered by end users) about the content that we have vectorized. We pass in the question, and after some time (again, it’s the LLM that takes the time, not the super-fast HANA Vector Engine) we get back the answer as text that we can display to the user:

answer = qa_chain.invoke({"question": "How can I use CORS in CAP?" , "chat_history": []})
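The chain returns a dictionary; with the default settings of “ConversationalRetrievalChain”, the generated text is available under the “answer” key:

print(answer["answer"])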


That’s it 😀. Six simple steps to create a RAG-based AI application with HANA Vector Engine.


Conclusion and Outlook

The LangChain integration of HANA Vector Engine combines the widely used LangChain framework with the power and speed of the enterprise-grade HANA database. The current integration covers the Python version of LangChain. Further enhancements and more integration options are already on their way. Stay tuned for further announcements and blog posts.



9 Comments
Phil_from_Madrid
Participant

Hi Martin, great blog. Can't wait to test it myself. The link to the Vector Engine documentation seems to be broken. Can you confirm the link? Kind regards

MartinKolb
Product and Topic Expert

Hi @Phil_from_Madrid , thanks for the reply 👍. And thanks for pointing out the issue with the link. The link is actually correct, but the content will be made publicly available on the release day of the "QRC 1/2024" release of HANA Cloud. I added a comment to make users aware of this. But you can (and should 😉) add the link to your bookmarks/favorites already now.

martagolabek
Newcomer

This implementation unfortunately doesn't work for me. The "chat_history" parameter seems to be required in the `invoke` method. Do you have any suggestions on how to solve this?

MartinKolb
Product and Topic Expert

Hi @martagolabek ,

Thanks for asking. The code fragments are indeed a bit oversimplified to focus on the main aspect.

The “ConversationalRetrievalChain” can maintain a chat history, which I intentionally did not use here. The simplest way is to pass an empty array as chat history to the “invoke” call:

answer = qa_chain.invoke({"question": "How can I use CORS in CAP?" , "chat_history": []})

An alternative would be to pass an instance of “ConversationBufferMemory” to the “ConversationalRetrievalChain”:

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", output_key='answer', return_messages=True)

...

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=gen_ai_hub.proxy.langchain.ChatOpenAI(proxy_model_name="gpt-4"),
    retriever=vectordb.as_retriever(search_kwargs={'k': 20}),
    memory=memory,
    combine_docs_chain_kwargs={'prompt': PROMPT}
)

I will adapt the blog content accordingly.

Cocquerel
Active Contributor

I was using langchain_community.vectorstores.hanavector with the OpenAI embedding model text-embedding-ada-002 to add documents into HANA.
I would like to continue using langchain_community.vectorstores.hanavector, but with the HANA native embedding function VECTOR_EMBEDDING.
Is that possible?

mkemeter
Product and Topic Expert

@Cocquerel : The LangChain project changed its integration model to optimize its internal review process. Due to this, we are about to deprecate our current community integration with LangChain and replace it with an external package on PyPI. Going forward, we will only maintain this external package.

The good news is that this package is already available, and it supports HANA native embeddings out of the box 😊. The package itself can be found here. And there is also a PR with updated documentation (incl. an example for native embedding) on its way to the LangChain documentation.
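For readers who want to try this right away, a minimal sketch of using the langchain-hana package with HANA-internal embeddings might look as follows. The class name HanaInternalEmbeddings and the model ID are assumptions based on the package documentation at the time of writing; please verify them against the current release:

# pip install langchain-hana
from langchain_hana import HanaDB, HanaInternalEmbeddings

# Embeddings are computed inside HANA via the native VECTOR_EMBEDDING function;
# the model ID below is an example and may differ in your environment
embeddings = HanaInternalEmbeddings(internal_embedding_model_id="SAP_NEB.20240715")
vectordb = HanaDB(connection=connection, embedding=embeddings, table_name="CAP_EMBEDDINGS")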

Cocquerel
Active Contributor

@mkemeter That's very good news that this new package now supports HANA native embedding. I will migrate my pipeline to use this new one. Thanks for sharing the information. Is there any plan for this new package to also include document chunking based on the HANA native _SYS_AFL.PAL_TEXTSPLIT procedure?

Cocquerel
Active Contributor

@mkemeter 
Instead of using a stand-alone HANA Cloud instance, would it be possible to use the SAP Generative AI Hub grounding service instead? I mean that the add_documents function of the langchain-hana library would use the grounding API /vector/collections/{collectionId}/documents in the background.

mkemeter
Product and Topic Expert

Right now, the plug-in supports stand-alone HANA Cloud as a vector database. There are no plans yet, that I am aware of, to include Generative AI Hub.