Introduction:
This blog walks through building an AI Sales Assistant on SAP HANA Cloud calculation views. The assistant enables users to ask natural-language questions such as “Which product category had the highest sales in Q2?” and receive contextual answers generated directly from enterprise data.
The solution uses the Retrieval-Augmented Generation (RAG) pattern, in which sales transactions stored in HANA are enriched with embeddings and combined with a Large Language Model (LLM) to deliver precise, business-aware responses. This approach extends the analytical power of HANA Cloud with conversational intelligence, effectively bringing AI integration to your calculation views without moving the data out of HANA.
We’ll cover the concepts, high-level architecture, and detailed implementation steps so you can replicate this setup on your own HANA Cloud system and make the most of your existing calculation views.
High-Level Architecture:
Key Concepts:
Retrieval-Augmented Generation:
RAG is an AI pattern in which a language model (such as GPT) is not used in isolation but is “augmented” with external knowledge retrieved from a database or document store.
It has two main steps:
Retrieval – Find the most relevant documents/data from a knowledge source (e.g., database, embeddings index, vector store).
Generation – Pass both the retrieved data + the user’s query into the LLM so the response is grounded in actual facts instead of just relying on the LLM’s training data.
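The two steps above can be sketched in a few lines of Python. This is a toy illustration with hand-made three-dimensional vectors standing in for real embeddings; in the actual implementation later in this post, the vectors come from OpenAI's embeddings API and the assembled prompt is sent to an LLM.

```python
import numpy as np

# Toy knowledge base: texts with hand-made (fake) embedding vectors
docs = {
    "Q2 sales for Electronics were 1.2M": np.array([0.9, 0.1, 0.0]),
    "Q2 sales for Furniture were 0.4M": np.array([0.7, 0.3, 0.1]),
    "New office opened in Berlin": np.array([0.0, 0.2, 0.9]),
}

def cos(a, b):
    # Cosine similarity: direction of the vectors, ignoring magnitude
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, top_k=2):
    # Step 1 - Retrieval: rank documents by similarity to the query vector
    ranked = sorted(docs, key=lambda t: cos(query_vec, docs[t]), reverse=True)
    return ranked[:top_k]

def build_prompt(question, query_vec):
    # Step 2 - Generation: ground the LLM prompt in the retrieved facts
    context = "\n".join(retrieve(query_vec))
    return f"Based on the following facts:\n{context}\nAnswer: {question}"

# A query vector close to the "sales" documents retrieves those, not the Berlin note
print(build_prompt("Which category sold most in Q2?", np.array([1.0, 0.2, 0.0])))
```

Because the query vector points in the same direction as the two sales facts, only those end up in the prompt, which is what keeps the LLM's answer grounded.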
SAP HANA Cloud:
The foundation of this project is SAP HANA Cloud: a HANA calculation view serves as the data source for the RAG (Retrieval-Augmented Generation) application. With the new HANA Cloud Vector Engine, we can now store and query embeddings directly inside HANA, eliminating the need for an external vector database. For this setup, we create a dedicated embeddings table to persist vector representations of the sales data.
Flask API:
For the backend, I used Flask, a lightweight Python web framework. Flask allows Python code to be exposed as REST APIs or lightweight web applications. In this project, the Flask API connects to the HANA calculation view, retrieves data, and interacts with the OpenAI API. OpenAI generates embeddings for the data, which are then stored in the embeddings table in HANA Cloud. The same API also handles user prompts: it takes queries from the frontend, searches the embeddings table, and returns contextual answers.
Streamlit:
The frontend is built with Streamlit, a Python framework for creating simple yet powerful user interfaces. It provides a chat-like interface where users can enter questions. The prompts are passed to the Flask API, which handles retrieval and reasoning. While Streamlit makes prototyping easy, this frontend could also be replaced with SAP Fiori or any web frontend.
Implementation steps:
Let’s walk through, step by step, how to build this framework on top of existing calculation views.
1. Prerequisites: Python 3.9+, a SAP Business Application Studio (BAS) environment, the Cloud Foundry CLI, an SAP HANA Cloud database instance, and an OpenAI API key.
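The scripts in the following steps read credentials from environment variables (loaded from a `.env` file for local testing). A sketch of the expected `.env`, with placeholder values:

```
HANA_HOST=<your-instance>.hana.prod-us10.hanacloud.ondemand.com
HANA_PORT=443
HANA_USER=<your-db-user>
HANA_PASSWORD=<your-password>
OPENAI_API_KEY=<your-openai-api-key>
```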
2. We need to convert the HANA data into embeddings and store them in a HANA table. Create the embeddings table in HANA Cloud; below is the SQL for that.
CREATE TABLE SALES_EMBEDDINGS (
    TRANSACTION_ID NVARCHAR(50),
    TEXT NVARCHAR(5000),
    EMBEDDING REAL_VECTOR(1536)
);

3. To generate embeddings with an LLM, we need API access. In my case I used a model from OpenAI: create an OpenAI account and register for an API key.
4. Create a Python file (e.g., generate_embeddings.py) in your Flask project to generate embeddings from your calculation view data and store them in the HANA Cloud table created above. Keep your credentials in the .env file for testing. Make sure your BAS dev space is in the same region as your HANA Cloud instance; otherwise the connection fails. This step verifies that we can connect to HANA, convert the calculation view (CV) data into text, and generate embeddings for it. Run this file on its own and check that the data is inserted into the HANA Cloud table.
import os
from hdbcli import dbapi
from openai import OpenAI
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# Connect to SAP HANA Cloud
conn = dbapi.connect(
    address=os.getenv("HANA_HOST"),  # e.g. yourhana.hana.prod-us10.hanacloud.ondemand.com
    port=int(os.getenv("HANA_PORT", "443")),
    user=os.getenv("HANA_USER"),
    password=os.getenv("HANA_PASSWORD")
)
cursor = conn.cursor()
# Fetch sales transactions
cursor.execute("""
    SELECT TRANSACTION_ID, REGION, PRODUCT, SALES_AMOUNT, SALE_DATE
    FROM "schemaname"."cv_sales_test"
""")
rows = cursor.fetchall()
# Generate embeddings and insert into HANA
for row in rows:
    transaction_id, region, product, sales_amount, sales_date = row
    text = f"Transaction {transaction_id} in {region} sold {product} for {sales_amount} on {sales_date}"
    embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    ).data[0].embedding
    # TO_REAL_VECTOR converts a JSON-style array string into a REAL_VECTOR value
    cursor.execute("""
        INSERT INTO "schemaname"."SALES_EMBEDDINGS" (TRANSACTION_ID, TEXT, EMBEDDING)
        VALUES (?, ?, TO_REAL_VECTOR(?))
    """, (transaction_id, text, str(embedding)))
conn.commit()
cursor.close()
conn.close()
print("Embeddings inserted into SALES_EMBEDDINGS table in HANA Cloud.")

Sales embeddings table output:
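A note on reading the vectors back: when the EMBEDDING column is fetched with hdbcli, it arrives as a binary blob. Assuming the fvecs layout used for REAL_VECTOR (a 4-byte little-endian dimension header followed by float32 values; please verify against your HANA Cloud version), a small stdlib-only helper can decode it:

```python
import struct

def decode_real_vector(blob: bytes) -> list[float]:
    # fvecs layout: 4-byte little-endian dimension count, then dim float32 values
    (dim,) = struct.unpack_from("<i", blob, 0)
    return list(struct.unpack_from(f"<{dim}f", blob, 4))

# Round-trip check with a hand-packed 3-dimensional vector
blob = struct.pack("<i3f", 3, 0.5, -1.0, 2.0)
print(decode_real_vector(blob))  # [0.5, -1.0, 2.0]
```

The RAG query in the next step unpacks these blobs the same way before computing cosine similarity.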
5. Now let's implement RAG on the embeddings. Create a new file for the RAG query code. The function connects to HANA Cloud and fetches the stored vector representations of the transactions. The user's question is passed to the OpenAI embeddings API, which converts it into a numerical vector. We then compute the cosine similarity between the query embedding and the stored transaction embeddings. The top results are combined with the user’s original question to form a prompt, which is sent to the LLM to generate a contextual answer.
import os
import json
import struct
import numpy as np
from hdbcli import dbapi
from openai import OpenAI
from sklearn.metrics.pairwise import cosine_similarity
from dotenv import load_dotenv
load_dotenv()
# --------------------------
# Get HANA credentials (VCAP aware)
# --------------------------
def get_hana_credentials(debug=False):
    vcap_services = os.environ.get("VCAP_SERVICES")
    if vcap_services:
        try:
            vcap = json.loads(vcap_services)
            for key in vcap:
                if "hana" in key.lower():  # find any HANA service
                    creds = vcap[key][0]["credentials"]
                    if debug:
                        print("Found HANA credentials from VCAP_SERVICES:", creds)
                    return {
                        "host": creds.get("host"),
                        "port": creds.get("port"),
                        "user": creds.get("user"),
                        "password": creds.get("password")
                    }
        except Exception as e:
            if debug:
                print("Error parsing VCAP_SERVICES:", e)
    # Fall back to environment variables
    creds = {
        "host": os.environ.get("HANA_HOST"),
        "port": os.environ.get("HANA_PORT"),
        "user": os.environ.get("HANA_USER"),
        "password": os.environ.get("HANA_PASSWORD")
    }
    if debug:
        print("Using HANA credentials from environment variables:", creds)
    return creds
# --------------------------
# Run RAG query
# --------------------------
def run_rag_query(user_query, top_k=5, debug=False):
    openai_api_key = os.environ.get("OPENAI_API_KEY")
    if not openai_api_key:
        raise Exception("OPENAI_API_KEY environment variable not set")
    openai_client = OpenAI(api_key=openai_api_key)
    hana_creds = get_hana_credentials(debug=debug)
    conn = None
    cursor = None
    try:
        conn = dbapi.connect(
            address=hana_creds["host"],
            port=int(hana_creds["port"]),
            user=hana_creds["user"],
            password=hana_creds["password"]
        )
        cursor = conn.cursor()
        # Fetch embeddings from HANA
        cursor.execute('SELECT TRANSACTION_ID, TEXT, EMBEDDING FROM "schemaname"."SALES_EMBEDDINGS"')
        rows = cursor.fetchall()
        if not rows:
            context = "No transactions available."
        else:
            # Compute the query embedding
            embedding_resp = openai_client.embeddings.create(
                model="text-embedding-3-small",
                input=user_query
            )
            query_embedding = np.array(embedding_resp.data[0].embedding, dtype=np.float32)
            query_dim = len(query_embedding)
            texts, embedding_list = [], []
            for row in rows:
                text = row[1]
                blob = row[2]
                # REAL_VECTOR blobs use the fvecs layout: a 4-byte dimension
                # header followed by float32 values, so skip the header
                if len(blob) < 4 + query_dim * 4:
                    continue
                embedding = np.array(
                    struct.unpack('f' * query_dim, blob[4:4 + query_dim * 4]),
                    dtype=np.float32
                )
                texts.append(text)
                embedding_list.append(embedding)
            if not embedding_list:
                context = "No valid transactions found."
            else:
                similarities = [cosine_similarity(query_embedding.reshape(1, -1), e.reshape(1, -1))[0][0] for e in embedding_list]
                top_indices = np.argsort(similarities)[-top_k:][::-1]
                context = "\n".join([texts[i] for i in top_indices])
        # GPT completion: summarize the retrieved transactions
        prompt = f"""
You are a helpful sales analyst assistant.
Based on the following transactions:
{context}
Answer the user's question in **natural language**, summarizing key insights if possible.
Question: {user_query}
Please keep your answer concise.
"""
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error in run_rag_query: {str(e)}"
    finally:
        if cursor:
            cursor.close()
        if conn:
            conn.close()
6. Wrap the RAG query code above in a function and use it from Flask. Install Flask if it is not already available. The Flask API now uses the RAG logic directly, so the frontend application can call this API and get a response. Ensure the Flask API works locally and returns the expected results before deploying it to Cloud Foundry.
pip install flask
from flask import Flask, request, jsonify
from hdbcli import dbapi
from openai import OpenAI
import numpy as np
import struct
import os
import json
# Note: run_rag_query (pasted in below) uses sklearn's matrix-style
# cosine_similarity, so import it under its own name rather than defining
# a conflicting local scalar helper.
from sklearn.metrics.pairwise import cosine_similarity
from dotenv import load_dotenv
load_dotenv()
app = Flask(__name__)
# --------------------------
# Get HANA credentials
# --------------------------
def get_hana_credentials():
    vcap_services = os.environ.get("VCAP_SERVICES")
    if vcap_services:
        try:
            vcap = json.loads(vcap_services)
            # Look for the bound HANA service (named "hdi_db" in this setup)
            for service_name, service_defs in vcap.items():
                if service_name.lower() == "hdi_db":
                    creds = service_defs[0]["credentials"]
                    host = creds.get("host")
                    port = creds.get("port")
                    user = creds.get("user")
                    password = creds.get("password")
                    if host and port and user and password:
                        print(f"Using HANA service from VCAP: {service_name}")
                        return {
                            "host": host,
                            "port": port,
                            "user": user,
                            "password": password
                        }
                    else:
                        print("HANA service in VCAP missing user/password. Falling back to manual env vars.")
        except Exception as e:
            print(f"Failed to parse VCAP_SERVICES: {e}")
    # Fall back to manual env vars
    print("Using manual HANA_* environment variables")
    return {
        "host": os.environ.get("HANA_HOST"),
        "port": os.environ.get("HANA_PORT"),
        "user": os.environ.get("HANA_USER"),
        "password": os.environ.get("HANA_PASSWORD")
    }
# --------------------------
# RAG query
# --------------------------
#
# " Your RAG Query here"
# --------------------------
# Health check route
# --------------------------
@app.route("/")
def index():
    return "Flask API is running!"
# --------------------------
# Query route
# --------------------------
@app.route("/query", methods=["POST"])
def query_rag_route():
    data = request.get_json()
    user_query = data.get("query", "")
    if not user_query:
        return jsonify({"error": "Query missing"}), 400
    # Run the RAG query
    results = run_rag_query(user_query, top_k=5, debug=False)
    # Return in the format the Streamlit frontend expects
    return jsonify({"results": results})
if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 5000)))

Once you run the Flask API code, it starts the REST API server. Opening the root URL should return "Flask API is running!".
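Before deploying, you can exercise the route in-process with Flask's built-in test client, without a real HANA or OpenAI connection. This is a self-contained sketch in which `run_rag_query` is a stub; in the real app the route delegates to the actual RAG function:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_rag_query(user_query, top_k=5, debug=False):
    # Stub standing in for the real RAG logic during local testing
    return f"stub answer for: {user_query}"

@app.route("/query", methods=["POST"])
def query_rag_route():
    data = request.get_json()
    user_query = data.get("query", "")
    if not user_query:
        return jsonify({"error": "Query missing"}), 400
    return jsonify({"results": run_rag_query(user_query)})

# Exercise the route in-process; no running server needed
with app.test_client() as client:
    resp = client.post("/query", json={"query": "Top region by sales?"})
    print(resp.status_code, resp.get_json())
```

This checks the request/response contract (the `{"query": ...}` payload and `{"results": ...}` reply) that the Streamlit frontend relies on.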
http://<your-BAS-workspace-url>:5000/query

7. Deploy the Flask app on SAP BTP Cloud Foundry, and you will see a URL assigned to it. Deploying on Cloud Foundry removes the dependency on the BAS dev space, so the frontend app can always call the Flask API. Bind the app to the HANA service instance so that Cloud Foundry injects the credentials into the app's environment (VCAP_SERVICES) and the running code can connect to HANA. This ensures the Flask API has HANA credentials at runtime. After binding, check the environment and the logs.
Check for services and push your app to CF.
cf services
cf push "app name" -f manifest.yml

To bind the app to the HANA service:
cf bind-service "app name" "hana service"
cf restage "app name"

Verify the environment and logs:
cf env "your app name"
cf logs "your app name" --recent

Test it using curl and see the response in the results section.
8. For the UI, I used Streamlit on my local machine. From the command line, run 'pip install streamlit requests', then run the program file that contains the UI code.
Your browser will open at:
http://localhost:8501
Type in any question and it will return a contextual response like below.
Key Advantages:
Leverage existing SAP data assets: reuse existing HDI containers and calculation views.
Secure by design: access to HANA data remains within the SAP security framework.
Scalable and extensible: the architecture can support further use cases such as Finance, HR, and Logistics.
Enhanced insights: combines structured data retrieval with natural-language summarization.
Conclusion:
By integrating SAP HANA Cloud with a Large Language Model using the Retrieval-Augmented Generation (RAG) approach, we’ve transformed standard calculation views into an AI-powered sales assistant. This demonstrates how modern AI capabilities can be layered on top of established enterprise data systems, opening the door to smarter, faster, and more interactive business decision-making.
Thank you,
Saravanan Chinnaswamy.