Technology Blogs by SAP
navyakhurana
Product and Topic Expert

Introduction

Navigating the vast amount of information in the business world requires both precision and efficiency. Extracting essential insights by summarizing extensive textual or latent data can significantly enhance decision-making and operational strategies.

Langchain offers a robust framework to streamline this process, especially when integrated with custom models and Generative AI Hub via the CAP LLM Plugin for anonymizing sensitive data.

Blog post series on Leveraging Dynamic Application Generation and GenAI for Real-Time Business Insights:

  1. Business Value & overview of the entire use-case/application (click here)
  2. Comprehensive guide for setting up the application (click here)
  3. Langchain Summarization Techniques & Leveraging GenerativeAI Hub via Custom Model (this blog)

In this blog post, we will explore the technical details of using Langchain’s summarization chain, incorporating a custom model class, and integrating seamlessly with Generative AI Hub via the cap-llm-plugin. By understanding these tools and techniques, businesses can enhance their intelligence efforts and make more informed decisions.

btp-summarization.png

In the above diagram, we use Langchain's summarization chain with its three techniques (stuff, map-reduce, and refine) to generate summaries. The chain supports various chat models such as OpenAI, Anthropic, and Llama. However, for handling business data, we need to connect to the Generative AI Hub to use LLMs. To achieve this, we implement a custom chat model that uses the "cap-llm-plugin" library to connect directly to the Generative AI Hub.

Let’s get started!

Summarization Chain and Its Techniques

Summarization plays a crucial role in natural language processing (NLP) by distilling extensive volumes of text into concise summaries. Langchain's summarization chain supports three techniques: Stuff, Map-Reduce, and Refine.
Each technique offers unique advantages and limitations, making it suitable for different use cases.

Let’s delve deeper into each of these techniques:

  1. Stuff:
    The Stuff method takes a list of documents, inserts them all into a single prompt, and passes that prompt to an LLM for summarization.

    stuff.png

    Pros & Cons:
    • Straightforward Approach: Stuff is a straightforward approach where we include all the data given in the prompt to be passed as context to the language model.
    • Simple Usage: You only need to make a single call to the LLM as the LLM has access to all the data at once.
    • Not good for large documents: Most LLMs have a maximum context length, and for large docs or many docs, this approach becomes impractical.

  2. Map-Reduce:
    The Map-Reduce method simplifies the task of summarizing a document by breaking it into two steps, map and reduce.
    • The document is first divided into small, manageable chunks based on either functional requirements or the maximum number of tokens allowed by the specific LLM.
    • In the Map step, each chunk is summarized individually using an LLMChain.
    • In the Reduce step, these individual summaries are combined into one cohesive final summary.

      map.png
      reduce.png


      Pros & Cons:
      • Good for Large Documents: Efficiently divides and summarizes large documents. It also overcomes token limit constraints by iteratively reducing chunks.
      • More Processing Time: Involves multiple LLM calls, which may increase processing time.

  3. Refine:

    The chain updates a rolling summary by iterating over the documents in a sequence. In each iteration, the current document and the previously generated summary are passed as prompt for summarization.

    Fig 2.3. Refine

    Pros & Cons:

    • Increases Relevance: Has the potential to incorporate more relevant context, potentially resulting in less loss of information compared to Map-Reduce.
    • Can lead to slow processing: Good things take time, and so does refine. It involves a larger number of LLM calls, and because each call depends on the previous summary, the calls can’t be parallelized like the map step of Map-Reduce.
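
To make the difference between the three techniques concrete, here is a minimal sketch in plain JavaScript. It does not use Langchain at all: `createFakeLlm` is a hypothetical stand-in for a real chat model, the prompts are simplified, and the function names are illustrative only. The reduce step is shown as a single call, whereas a real map-reduce chain may need several reduce rounds when the partial summaries are themselves too long.

```javascript
// Hypothetical stand-in for a chat model: records every prompt it receives
// and returns a numbered placeholder instead of a real summary.
function createFakeLlm() {
  const prompts = [];
  const llm = async (prompt) => {
    prompts.push(prompt);
    return `summary#${prompts.length}`;
  };
  return { llm, prompts };
}

// Stuff: put every document into one prompt and make a single LLM call.
async function stuffSummarize(llm, docs) {
  const text = docs.map((d) => d.pageContent).join("\n\n");
  return llm(`Summarize the following:\n\n${text}`);
}

// Map-Reduce: summarize each chunk independently (map), then summarize the
// joined partial summaries (reduce). The map calls do not depend on each
// other, so they can run in parallel.
async function mapReduceSummarize(llm, docs) {
  const partials = await Promise.all(
    docs.map((d) => llm(`Summarize the following:\n\n${d.pageContent}`))
  );
  return llm(`Combine these summaries into one:\n\n${partials.join("\n\n")}`);
}

// Refine: walk the documents in order and carry a rolling summary forward.
// Each call needs the previous result, so the calls are strictly sequential.
async function refineSummarize(llm, docs) {
  let summary = null;
  for (const d of docs) {
    const prompt =
      summary === null
        ? `Summarize the following:\n\n${d.pageContent}`
        : `Existing summary: ${summary}\n\nRefine it using:\n\n${d.pageContent}`;
    summary = await llm(prompt);
  }
  return summary;
}

// Usage: count how many LLM calls each technique makes for three chunks.
async function main() {
  const docs = ["first", "second", "third"].map((pageContent) => ({ pageContent }));
  for (const summarize of [stuffSummarize, mapReduceSummarize, refineSummarize]) {
    const { llm, prompts } = createFakeLlm();
    await summarize(llm, docs);
    console.log(`${summarize.name}: ${prompts.length} LLM call(s)`);
  }
}
main();
```

For three chunks, the sketch makes one call for stuff, four for map-reduce (three map calls plus one reduce call), and three sequential calls for refine, which mirrors the trade-offs listed above.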

How to Implement Summarization Chain?

Summarization chains can be a game-changer for efficiently extracting key insights by summarizing large volumes of text. LangChain provides implementation support for summarization chains in both its Python and JavaScript libraries.

Langchain offers three different chains for the previously mentioned summarization techniques: StuffDocumentsChain, MapReduceDocumentsChain, and RefineDocumentsChain. It also provides a convenience method, “loadSummarizationChain”, which returns the corresponding chain based on the type.

Let’s explore how to implement a summarization chain specifically tailored for SAP environments.

Creation of Custom Models

To efficiently manage business data within SAP, connecting to the Generative AI Hub to utilize LLM chat models is essential. The Generative AI Hub offers a robust platform for accessing and leveraging advanced large language models, enhancing our ability to process and summarize extensive business data effectively.

Problem Statement
The summarization chain provided by Langchain connects directly to LLM chat models such as OpenAI, Anthropic, Llama, and others. However, in the SAP BTP context, we need to go through the Generative AI Hub to access these LLMs. The solution is to create a custom model that connects to the Generative AI Hub and use it in Langchain’s summarization chains.

Solution Overview
For this scenario, I’ll create a custom chat model in JavaScript that connects to the Generative AI Hub using the cap-llm-plugin library. This custom model will enable us to make LLM calls via the Generative AI Hub.

Below is an example of how to create a custom chat model:

const { SimpleChatModel } = require("@langchain/core/language_models/chat_models");
const cds = require("@sap/cds");

class GenAIHubChatModel extends SimpleChatModel {
  constructor(params) {
    super(params);
  }

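  // Called by Langchain with the conversation so far; must return the reply text.
  async _call(messages) {
    try {
      // Resolve the cap-llm-plugin service and forward the converted messages
      // to the Generative AI Hub.
      const capllm = await cds.connect.to("cap-llm-plugin");
      const chatResponse = await capllm.getChatCompletion({
        messages: this._convertToChatMessages(messages),
      });
      return chatResponse.content;
    } catch (err) {
      // Note: the error text is returned as if it were model output, so callers
      // cannot tell a failed call from a real summary without checking the logs.
      console.error(err);
      return "Error in calling GenAI Hub Chat Model";
    }
  }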

  _convertToChatMessages(messages) {
    const chatMessages = [];
    messages.forEach((message) => {
      switch (message.toDict().type) {
        case "human":
          chatMessages.push({
            role: "user",
            content: message.toDict().data.content,
          });
          break;
        case "system":
          chatMessages.push({
            role: "system",
            content: message.toDict().data.content,
          });
          break;
        case "ai":
          chatMessages.push({
            role: "assistant",
            content: message.toDict().data.content,
          });
          break;
      }
    });
    return chatMessages;
  }

  _llmType() {
    return "GenAI Hub Chat Model";
  }

  async *_streamResponseChunks(messages, options, runManager) {
    throw new Error("Streaming of response is not supported!");
  }
}

module.exports = { GenAIHubChatModel };

Explanation

  1. Class Creation: We create a custom chat model class GenAIHubChatModel by extending the SimpleChatModel class from @langchain/core/language_models/chat_models.

  2. _call Method: This method takes a list of messages and call options (such as stop sequences) and returns a string. It connects to the cap-llm-plugin and retrieves chat completion responses.

  3. _llmType Method: This method returns a string identifying the LLM type for logging purposes.

  4. _convertToChatMessages Method: This method converts messages to a format recognized by the chat model, identifying if the message is sent by a user (human), system, or AI (assistant). In our scenario, since we will be sending data to summarize in the form of a file, it will always be sent by a human.

  5. _streamResponseChunks Method: This method handles large responses from APIs and allows for streaming response chunks. In our case, streaming is not supported, so it simply throws an error.
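
As an illustration of the role mapping described in point 4, the same conversion can be sketched with a lookup table instead of a switch statement. This is only a sketch: `ROLE_BY_TYPE`, `convertToChatMessages`, and `fakeMessage` are hypothetical names, and `fakeMessage` is a plain-object stand-in that mimics only the `toDict()` shape the class above relies on, so the snippet runs without Langchain installed. Like the original switch, it silently skips message types it does not know.

```javascript
// Lookup table from Langchain message type to chat-completion role.
const ROLE_BY_TYPE = new Map([
  ["human", "user"],
  ["system", "system"],
  ["ai", "assistant"],
]);

// Assumes each message exposes toDict() returning { type, data: { content } }.
// Messages with an unmapped type are dropped, as in the switch-based version.
function convertToChatMessages(messages) {
  return messages
    .map((message) => message.toDict())
    .filter(({ type }) => ROLE_BY_TYPE.has(type))
    .map(({ type, data }) => ({
      role: ROLE_BY_TYPE.get(type),
      content: data.content,
    }));
}

// Hypothetical plain-object stand-in for a Langchain message.
const fakeMessage = (type, content) => ({
  toDict: () => ({ type, data: { content } }),
});

const converted = convertToChatMessages([
  fakeMessage("system", "You are a summarizer."),
  fakeMessage("human", "Summarize this file."),
  fakeMessage("tool", "This type is not mapped and is skipped."),
]);
console.log(converted);
```

A table keeps the mapping in one place, which makes it easier to add a role later or to decide that unknown types should raise an error instead of being skipped.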

To learn more about the cap-llm-plugin library, refer to its documentation.
To create a custom chat model, you can also follow Langchain’s documentation on “How to create a custom chat model class”.

Sample Use Case:

For hands-on experience with these summarization chains, using a custom model in a Node.js-based CAP application and accessing LLMs via GenAI Hub, you can refer to the following repository: cap-summarization-chain.

Tips & Tricks:

  • Although Langchain uses a default prompt in each of its summarization chains, you can customize it to your specific needs. You can create your own prompt and pass it as configuration, as shown below:

const { RecursiveCharacterTextSplitter } = require("@langchain/textsplitters");
const { loadSummarizationChain } = require("langchain/chains"); // SummarizationChainParams is a TypeScript type, not a runtime export
const { PromptTemplate } = require("@langchain/core/prompts");
const fs = require("fs");
const utilGenAIHub = require("./utilGenAIHub");

const stuffConfig   = { type:"stuff", 
                        prompt:new PromptTemplate({inputVariables: ["text"], template: 'Write a short and concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'}) };
const refineConfig  = { type:"refine", 
                        refinePrompt:new PromptTemplate({inputVariables: ["existing_answer","text"], template: 'Your job is to produce a final summary\nWe have provided an existing summary up to a certain point: \"{existing_answer}\"\nWe have the opportunity to refine the existing summary\n(only if needed) with some more context below.\n------------\n\"{text}\"\n------------\n\nGiven the new context, refine the original summary\nIf the context isnt useful, return the original summary.\n\nREFINED SUMMARY:'}),
                        questionPrompt:new PromptTemplate({inputVariables: ["text"], template: 'Write a short and concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'}) };
const reduceConfig  = { type:"map_reduce", 
                        combinePrompt:new PromptTemplate({inputVariables: ["text"], template: 'Write a short and concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'}),
                        combineMapPrompt:new PromptTemplate({inputVariables: ["text"], template: 'Write a short and concise summary of the following:\n\n\n"{text}"\n\n\nCONCISE SUMMARY:'}) };

const summarize = async function(textDocuments, type) {
  let configParams;
  switch (type) {
    case "stuff":
      configParams = stuffConfig;
      break;
    case "refine":
      configParams = refineConfig;
      break;
    case "map_reduce":
      configParams = reduceConfig;
      break;
  }
  
  const chatModel     = new utilGenAIHub.GenAIHubChatModel({});
  const summaryChain  = loadSummarizationChain(chatModel, configParams);
  const textSummary   = await summaryChain.invoke({ input_documents: textDocuments });
  return textSummary;
}

const getTextDocuments = async function(filePath) {
  const textContent   = fs.readFileSync(filePath, "utf8");
  const textSplitter  = new RecursiveCharacterTextSplitter({ chunkSize: 3000 });
  const textDocuments = await textSplitter.createDocuments([textContent]);
  return textDocuments;
}

module.exports = { summarize, getTextDocuments };

  • Langchain’s summarization chain requires its input in Document format. A Document is an object with three properties: pageContent, metadata, and an optional id.

    Example:

{
    "pageContent": "The Marvel Cinematic Universe (MCU) is a media franchise and shared universe centered around a series of superhero films and television series produced by Marvel Studios. It features interconnected stories based on characters from Marvel Comics, including iconic heroes like Iron Man, Captain America, and Spider-Man.",
    "metadata": { "lines": { "from": 1, "to": 3 } }
}

  • When working with documents, you might need to transform them to fit your application better. A common transformation is splitting a long document into smaller chunks that can be processed by map-reduce or refine summarization chain. Langchain offers several built-in document transformers to facilitate this, allowing you to split, combine, filter, and manipulate documents as needed.


    In the above sample use-case “RecursiveCharacterTextSplitter” is used. For more details, refer to Text Splitters by Langchain.
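
To show what splitting produces, here is a deliberately simplified sketch of a splitter that cuts text into fixed-size character windows and wraps each window in a Document-shaped object. It is not how RecursiveCharacterTextSplitter works internally (that splitter tries to break on separators such as paragraphs and sentences first). The function name `splitIntoDocuments`, its `chunkOverlap` parameter, and the `start`/`end` metadata fields are assumptions made for this illustration.

```javascript
// Split text into fixed-size character windows, optionally overlapping, and
// wrap each window in a Document-shaped object ({ pageContent, metadata }).
function splitIntoDocuments(text, chunkSize = 3000, chunkOverlap = 0) {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("Require chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const documents = [];
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    documents.push({
      pageContent: text.slice(start, end),
      metadata: { start, end }, // character offsets of this chunk in the source
    });
    if (end === text.length) break; // the last window reached the end of the text
  }
  return documents;
}

// Usage: 10 characters, windows of 4 with an overlap of 1.
const chunks = splitIntoDocuments("abcdefghij", 4, 1);
console.log(chunks.map((d) => d.pageContent)); // → [ 'abcd', 'defg', 'ghij' ]
```

The overlap repeats a little context at each chunk boundary, which can help map-reduce or refine avoid losing a sentence that straddles two chunks, at the cost of slightly more tokens per call.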

Conclusion:

In this blog post we explored how to streamline business insights using Langchain’s advanced summarization techniques, custom models, and Generative AI Hub integration. By leveraging Langchain’s summarization chain—encompassing stuff, map-reduce, and refine techniques—and connecting with Generative AI Hub via the CAP LLM Plugin, businesses can efficiently extract and secure essential insights from extensive data. This approach not only enhances decision-making and operational strategies but also ensures the effective anonymization of sensitive information, empowering organizations to make more informed and strategic decisions.

Finally, I would like to extend my heartfelt gratitude to my colleagues whose insights, support, and collaboration have been invaluable throughout this PoC. Special thanks to @Ajit_K_Panda for his guidance, @Aryan_Raj_Sinha for his valuable contributions to the PoC development, and @PVNPavanKumar for their leadership support.

5 Comments
vedant_gupta
Product and Topic Expert

Well documented summary of the summarization methods. Nice to see examples with code.

keshavrai
Advisor

Insightful 

mrajasekarana
Product and Topic Expert

Any Recommendation of file size or pages in Document  since it's mentioned not good for large documents? 

navyakhurana
Product and Topic Expert

@mrajasekarana 

When using Langchain's Stuff technique for summarization, file size or document length becomes a crucial factor due to the token limit constraints of large language models (LLMs). The exact recommendation on file size or number of pages depends largely on the token limit of the specific LLM being used.

  • Token Limits/Document size: Popular LLMs like GPT-3.5 have token limits of 4,000 to 16,000. At the lower end, this covers roughly 2,000-3,000 words, or 4-6 pages (at 500 words per page), of text that you can summarize.
  • Stuff works well for short documents of around 4-6 pages or documents that can fit within the LLM’s token limit (i.e., 4,000 to 16,000 tokens).

But if you're keen on using the Stuff technique for larger documents, you can use models with larger context windows, such as:

  • 128k token OpenAI gpt-4o
  • 200k token Anthropic claude-3-5-sonnet-20240620

These models can handle significantly more text in a single prompt, allowing for summarization of documents spanning tens or even hundreds of pages without chunking using Stuff. 

For larger documents, it's better to switch to the Map-Reduce or Refine techniques, which can handle large volumes by chunking the document and processing it in stages.

rayyavu
Explorer

Hi @navyakhurana ,

Thanks for the blog.

What about the "map-rerank" which separates texts into batches, feeds each batch to LLM, returns a score of how fully it answers the question, and comes up with the final answer based on the high-scored answers from each batch