
OpenAI’s LLM (GPT-3.5) was last updated in January 2022, so it may lack information about anything that happened after that date. For instance, if you ask GPT-3.5 about the most recent Cricket World Cup, it might not provide accurate information:
An LLM's baseline knowledge can have gaps, depending on its training data. If you ask an LLM to write about a recent trend or event, it won't know what you're talking about, and its responses will be mixed at best and problematic at worst. So, what is the workaround? Is there a way to give the LLM additional information that it was not trained on? Retrieval-Augmented Generation (RAG) is the answer!
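RAG is a technique that supplies an LLM with relevant data it was not trained on. In the first phase, documents are split into pieces and stored, as vector embeddings, in a vector store.
The next phase in RAG is processing user input. When a user’s question is to be answered by the LLM, a similarity search is performed against the vector store, and the question and all the similar document pieces are placed into the prompt that is sent to the AI model.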
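To make the "placed into the prompt" step concrete, here is a minimal plain-Java sketch with no Spring involved. The class name `RagPromptSketch`, the method `buildPrompt`, and the `Question:` label are made up for illustration, and the hard-coded pieces stand in for whatever a real similarity search would return.

```java
import java.util.List;

final class RagPromptSketch {

    // Joins the retrieved document pieces and places them ahead of the user's question.
    static String buildPrompt(String question, List<String> similarPieces) {
        String context = String.join(System.lineSeparator(), similarPieces);
        return "Based on the following: " + context
                + System.lineSeparator()
                + "Question: " + question;
    }

    public static void main(String[] args) {
        // Hypothetical pieces standing in for the result of a similarity search.
        List<String> pieces = List.of("first retrieved piece", "second retrieved piece");
        System.out.println(buildPrompt("What do the pieces say?", pieces));
    }
}
```

The model only ever sees this combined text, which is why the quality of the retrieved pieces matters so much.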
The philosophy behind the Spring Framework is to take care of the intricate and burdensome aspects of enterprise-grade applications, so that developers can concentrate on business functionality while Spring manages the underlying infrastructure. The spring-ai project aims to streamline the development of AI applications without unnecessary complexity. The project was started on the conviction that the coming wave of generative AI applications will not be limited to Python and will become pervasive across many programming languages.
Enough theory! How do we put all this into practice? It’s time to say hello to spring-ai.
Let’s open our favorite IDE (IntelliJ in my case), create a regular spring-boot project, and add these dependencies to bring AI magic to the application:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-hanadb-store-spring-boot-starter</artifactId>
</dependency>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai-version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
To talk to ChatGPT, we need to generate an OpenAI API key and configure it in the application.properties file, together with the embedding model, the vector store table, and the HANA datasource:
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.embedding.options.model=text-embedding-ada-002
spring.ai.vectorstore.hanadb.tableName=CRICKET_WORLD_CUP
spring.ai.vectorstore.hanadb.topK=5
spring.datasource.driver-class-name=com.sap.db.jdbc.Driver
spring.datasource.url=${HANA_DATASOURCE_URL}
spring.datasource.username=${HANA_DATASOURCE_USERNAME}
spring.datasource.password=${HANA_DATASOURCE_PASSWORD}
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.HANAColumnStoreDialect
Let's create the CricketWorldCup entity class:
package com.sap.isbn.cdet.rag.cricketworldcup;

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.extern.jackson.Jacksonized;
import org.springframework.ai.vectorstore.HanaVectorEntity;

@Entity
@Table(name = "CRICKET_WORLD_CUP")
@Data
@Jacksonized
@NoArgsConstructor
public class CricketWorldCup extends HanaVectorEntity {

    @Column(name = "content")
    private String content;
}
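Create the CRICKET_WORLD_CUP table in SAP HANA Cloud. The EMBEDDING column has 1536 dimensions, matching the size of the vectors produced by text-embedding-ada-002: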
CREATE TABLE CRICKET_WORLD_CUP (
    _ID VARCHAR2(255) PRIMARY KEY,
    CONTENT CLOB,
    EMBEDDING REAL_VECTOR(1536)
)
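Before an embedding can land in the EMBEDDING column, it has to be handed to the database in some textual form. Here is a small sketch of that step, assuming the vector is rendered as a bracketed, comma-separated string such as `[0.5,0.0,...]` and then converted by `TO_REAL_VECTOR` on the SQL side. The class `VectorLiteralSketch` and its method are hypothetical helpers, not part of spring-ai.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class VectorLiteralSketch {

    // Renders an embedding as "[v1,v2,...]" text.
    // Assumption: this is the string form passed to TO_REAL_VECTOR.
    static String toVectorLiteral(double[] embedding, int expectedDimensions) {
        if (embedding.length != expectedDimensions) {
            throw new IllegalArgumentException(
                    "Expected " + expectedDimensions + " dimensions but got " + embedding.length);
        }
        return Arrays.stream(embedding)
                .mapToObj(Double::toString)
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        // A tiny 3-dimensional vector for readability; the table above expects 1536.
        System.out.println(toVectorLiteral(new double[] {0.5, 0.25, 0.0}, 3));
    }
}
```

Checking the length up front gives a clearer error than letting the database reject a vector whose size does not match REAL_VECTOR(1536).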
Next, we create the CricketWorldCupRepository class:
package com.sap.isbn.cdet.rag.cricketworldcup;

import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import jakarta.transaction.Transactional;
import org.springframework.ai.vectorstore.HanaVectorRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public class CricketWorldCupRepository implements HanaVectorRepository<CricketWorldCup> {

    @PersistenceContext
    private EntityManager entityManager;

    // Inserts one document piece together with its embedding.
    @Override
    @Transactional
    public void save(String tableName, String id, String embedding, String content) {
        String sql = String.format("""
                INSERT INTO %s (_ID, EMBEDDING, CONTENT)
                VALUES(:_id, TO_REAL_VECTOR(:embedding), :content)
                """, tableName);
        entityManager.createNativeQuery(sql)
                .setParameter("_id", id)
                .setParameter("embedding", embedding)
                .setParameter("content", content)
                .executeUpdate();
    }

    // Deletes the rows with the given ids and returns the number of rows removed.
    @Override
    @Transactional
    public int deleteEmbeddingsById(String tableName, List<String> idList) {
        String sql = String.format("""
                DELETE FROM %s WHERE _ID IN (:ids)
                """, tableName);
        return entityManager.createNativeQuery(sql)
                .setParameter("ids", idList)
                .executeUpdate();
    }

    // Empties the table and returns the number of rows removed.
    @Override
    @Transactional
    public int deleteAllEmbeddings(String tableName) {
        String sql = String.format("""
                DELETE FROM %s
                """, tableName);
        return entityManager.createNativeQuery(sql).executeUpdate();
    }

    // Returns the topK rows whose embeddings are most similar to the query embedding.
    @Override
    public List<CricketWorldCup> cosineSimilaritySearch(String tableName, int topK, String queryEmbedding) {
        String sql = String.format("""
                SELECT TOP :topK * FROM %s
                ORDER BY COSINE_SIMILARITY(EMBEDDING, TO_REAL_VECTOR(:queryEmbedding)) DESC
                """, tableName);
        return entityManager.createNativeQuery(sql, CricketWorldCup.class)
                .setParameter("topK", topK)
                .setParameter("queryEmbedding", queryEmbedding)
                .getResultList();
    }
}
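If you are curious what the ORDER BY COSINE_SIMILARITY(...) DESC query combined with TOP is doing conceptually, here is an in-memory sketch in plain Java. It is only an illustration of the idea, not how HANA computes it internally; the class `CosineTopKSketch`, its method names, and the toy two-dimensional vectors are all made up.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class CosineTopKSketch {

    // Cosine similarity: dot product divided by the product of the vector lengths.
    // Note: a zero vector would yield NaN; real data is assumed to be non-zero.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Sorts the rows by similarity to the query, highest first, and keeps the first k ids.
    static List<String> topK(Map<String, double[]> rows, double[] query, int k) {
        return rows.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> cosineSimilarity(e.getValue(), query))
                        .reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, double[]> rows = Map.of(
                "a", new double[] {1, 0},
                "b", new double[] {0, 1},
                "c", new double[] {1, 1});
        System.out.println(topK(rows, new double[] {1, 0.1}, 2));
    }
}
```

Pushing this work into the database, as the repository does, avoids loading every embedding into the application just to rank them.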
Finally, we create our CricketWorldCupResource class that acts as the REST controller:
package com.sap.isbn.cdet.rag.cricketworldcup;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.openai.OpenAiAudioSpeechClient;
import org.springframework.ai.openai.OpenAiAudioTranscriptionClient;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.HanaCloudVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ResourceLoader;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.CrossOrigin;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

@RestController
@Slf4j
@CrossOrigin
public class CricketWorldCupResource {

    private final VectorStore hanaCloudVectorStore;
    private final ChatClient chatClient;
    private final ResourceLoader resourceLoader;
    private final OpenAiAudioTranscriptionClient openAiAudioTranscriptionClient;
    private final OpenAiAudioSpeechClient openAiAudioSpeechClient;

    @Autowired
    public CricketWorldCupResource(ResourceLoader resourceLoader,
                                   OpenAiAudioTranscriptionClient openAiAudioTranscriptionClient,
                                   OpenAiAudioSpeechClient openAiAudioSpeechClient,
                                   ChatClient chatClient,
                                   VectorStore hanaCloudVectorStore) {
        this.resourceLoader = resourceLoader;
        this.openAiAudioTranscriptionClient = openAiAudioTranscriptionClient;
        this.openAiAudioSpeechClient = openAiAudioSpeechClient;
        this.chatClient = chatClient;
        this.hanaCloudVectorStore = hanaCloudVectorStore;
    }

    // Removes all stored embeddings from the CRICKET_WORLD_CUP table.
    @PostMapping("/ai/hana-vector-store/cricket-world-cup/purge-embeddings")
    public ResponseEntity<String> purgeEmbeddings() {
        int deleteCount = ((HanaCloudVectorStore) this.hanaCloudVectorStore).purgeEmbeddings();
        log.info("{} embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount);
        return ResponseEntity.ok().body(String.format("%d embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount));
    }

    // Reads the uploaded PDF, splits it into smaller documents, and stores their embeddings.
    @PostMapping("/ai/hana-vector-store/cricket-world-cup/upload")
    public ResponseEntity<String> handleFileUpload(@RequestParam("pdf") MultipartFile file) throws IOException {
        Resource pdf = file.getResource();
        Supplier<List<Document>> reader = new PagePdfDocumentReader(pdf);
        Function<List<Document>, List<Document>> splitter = new TokenTextSplitter();
        List<Document> documents = splitter.apply(reader.get());
        log.info("{} documents created from pdf file: {}", documents.size(), pdf.getFilename());
        hanaCloudVectorStore.accept(documents);
        return ResponseEntity.ok().body(String.format("%d documents created from pdf file: %s",
                documents.size(), pdf.getFilename()));
    }

    // Finds the documents most similar to the question and sends them, with the question, to the LLM.
    @GetMapping("/ai/hana-vector-store/cricket-world-cup")
    public Map<String, String> hanaVectorStoreSearch(@RequestParam(value = "message") String message) {
        var documents = this.hanaCloudVectorStore.similaritySearch(message);
        var inlined = documents.stream().map(Document::getContent).collect(Collectors.joining(System.lineSeparator()));
        var similarDocsMessage = new SystemPromptTemplate("Based on the following: {documents}")
                .createMessage(Map.of("documents", inlined));
        var userMessage = new UserMessage(message);
        Prompt prompt = new Prompt(List.of(similarDocsMessage, userMessage));
        String generation = chatClient.call(prompt).getResult().getOutput().getContent();
        log.info("Generation: {}", generation);
        return Map.of("generation", generation);
    }
}
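The upload endpoint splits the PDF into smaller documents before embedding them. To give a feel for why and how, here is a deliberately simplified chunker in plain Java. It cuts on a fixed number of words, which is not the algorithm TokenTextSplitter uses (that one works on tokens); the class `ChunkingSketch` and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkingSketch {

    // Splits text into chunks of at most maxWordsPerChunk whitespace-separated words.
    static List<String> splitIntoChunks(String text, int maxWordsPerChunk) {
        if (maxWordsPerChunk <= 0) {
            throw new IllegalArgumentException("maxWordsPerChunk must be positive");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start += maxWordsPerChunk) {
            int end = Math.min(start + maxWordsPerChunk, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(splitIntoChunks("one two three four five", 2));
    }
}
```

Smaller chunks keep each embedding focused on one topic and keep the final prompt within the model's context limit, at the cost of more rows in the vector store.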
Let's use a REST client to verify the output:
Notice that our REST endpoint returned a response identical to the one GPT-3.5 gave earlier (see the first image).
All good so far. But how do we augment the LLM's response with more recent information on cricket, so that it answers more accurately? For that, let’s get some unstructured data about cricket from Wikipedia: go to the Wikipedia page for the Cricket World Cup and download it as a PDF file (under the Tools section).
Let's hit another REST endpoint to convert this PDF file into vector embeddings and store them in our CRICKET_WORLD_CUP table in the database:
Now ask the same question again: Who won the ICC men's cricket world cup in 2023?
Voilà! The LLM is now able to give more accurate results. Spring-AI is a powerful tool for working with LLMs in a platform-agnostic way.
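If you would rather ask the question from code than from a REST client, here is a minimal sketch using the JDK's built-in HTTP classes. It only builds the request; the base URL `http://localhost:8080` is an assumption about where the application runs, and the class `AskQuestionSketch` is made up for this example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class AskQuestionSketch {

    // Assumption: the spring-boot application runs locally on the default port.
    static final String BASE_URL = "http://localhost:8080";

    // Builds a GET request for the search endpoint with the question URL-encoded.
    static HttpRequest buildRequest(String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        URI uri = URI.create(BASE_URL + "/ai/hana-vector-store/cricket-world-cup?message=" + encoded);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("Who won the ICC men's cricket world cup in 2023?");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` while the application is running.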