Technology Blog Posts by SAP
RobinP

Overview
This post shows how to stream LLM output from the SAP Cloud SDK for AI orchestration module into a CAP service and directly forward it to the browser over WebSockets.
It covers:

  • Enabling WebSocket support in CAP
  • Defining a service with actions and events
  • Streaming LLM output and emitting it via WebSockets
  • A minimal frontend using the native WebSocket API

Here's a quick demo that shows the end result:

[streaming_demo.gif: animated demo of the story text streaming into the browser]

You can also explore the full example project on GitHub.

Prerequisites

  • A CAP project, Node.js and a package manager
  • A frontend (any framework) and, optionally, the SAP Approuter
  • Access to an LLM via the Generative AI Hub in SAP AI Core (SAP AI Foundation), consumed through the orchestration module of the SAP Cloud SDK for AI

Backend
CAP does not provide WebSocket support out of the box, but the community library @cap-js-community/websocket makes it easy. Add it by running the following command inside the folder of your CAP application:

npm add @cap-js-community/websocket

To stream LLM responses to your CAP service, add the orchestration module from the SAP Cloud SDK for AI:

npm add @sap-ai-sdk/orchestration

Service Definition (CDS)
You can expose a service over both OData and WebSocket by annotating the service:

  • Annotate the service with:
@protocol: ['odata', 'websocket'] // @ws for WebSocket only
  • Define an event to broadcast over the WebSocket.
    For example:
event storyChunk { message: String; }
  • Add an action or function that clients can call (via WebSocket).
    For example:
function generateStory(topic : String) returns String;


The client then passes topic as a parameter when calling the function.
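Put together, the full service definition might look like this (the service name StoryService is illustrative):

```cds
@protocol: ['odata', 'websocket'] // or @ws for WebSocket only
service StoryService {
  // event broadcast to connected clients, one text chunk at a time
  event storyChunk {
    message : String;
  }
  // called by clients (also over WebSocket) to start generation
  function generateStory(topic : String) returns String;
}
```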

Key points:

  • Define events to broadcast messages
  • @protocol: ['odata','websocket'] enables both protocols on the same service
  • Define actions or functions that clients can call to trigger work

Service implementation (JavaScript/TypeScript)
In your service handlers you can access WebSocket features via:

  • req.context.ws (when handling a request)
  • cds.context.ws (when in a wider context)

LLM streaming with the SAP Cloud SDK for AI orchestration module:

The orchestration module allows you to stream the output from an LLM. It also makes it easy to switch to another model of any provider later.

Flow:

  1. Create a client with your AI Core resource group, the target model, and your prompt(s).
  2. Use the stream method of the orchestration client rather than the single-shot completion method:
const response = await orchestrationClient.stream({ messages: messages });
  3. Iterate over the stream and, for each chunk of text received, emit your CDS event:
for await (const chunk of response.stream) {
  req.context.ws.service.emit('storyChunk', { message: chunk });
}

Notes:

  • Use req.context.ws.service.emit("<eventName>", {...}) to broadcast messages
  • If you need low-level socket access, req.context.ws.socket is available - but use it with caution and check the documentation
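The streaming loop can be sketched as a small framework-agnostic helper (forwardChunks and emitFn are illustrative names, not part of either SDK). It iterates any async-iterable stream of text chunks and forwards each one to an emit callback; plain-string chunks are assumed here, so check the SDK documentation for the exact chunk shape and its text accessor:

```javascript
// Sketch: forward each text chunk from an async-iterable stream to an emitter.
// In a CAP handler, emitFn would wrap req.context.ws.service.emit.
async function forwardChunks(stream, emitFn, eventName = "storyChunk") {
  let fullText = "";
  for await (const chunk of stream) {
    if (!chunk) continue; // skip empty chunks
    fullText += chunk;
    emitFn(eventName, { message: chunk });
  }
  return fullText; // e.g. used as the return value of generateStory
}
```

In the handler, something like `return forwardChunks(response.stream, (e, d) => req.context.ws.service.emit(e, d))` would both stream chunks to clients and return the complete story.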

Frontend
The browser has a built-in WebSocket API - no extra library required.

Steps:

  1. Construct the URL and open the connection (React example, where socket is a useRef):
const protocol = window.location.protocol === "https:" ? "wss://" : "ws://";
const host = window.location.host;
socket.current = new WebSocket(protocol + host + "/api/ws/story"); // adjust path if needed

Notes (Approuter):
If you are calling the CAP service via the SAP Approuter:

  • The service might be exposed under a path prefix, e.g. /api/..., depending on what you specified in your routes.
  • Make sure to enable WebSockets in your xs-app.json:
  "websockets": {
    "enabled": true
  }
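For context, a minimal xs-app.json combining the WebSocket switch with such a route might look like this (the route source, destination name, and authentication type are assumptions — adjust them to your setup):

```json
{
  "websockets": {
    "enabled": true
  },
  "routes": [
    {
      "source": "^/api/(.*)$",
      "target": "/$1",
      "destination": "cap-backend",
      "authenticationType": "xsuaa"
    }
  ]
}
```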

  2. Set up handlers:

  • onopen: optional, to confirm the connection is established and to trigger your initial action/function call if needed
  • onmessage: required, to receive messages; parse the JSON payload emitted by CAP and append the text chunk to your UI
  • onerror: recommended, to surface connection issues
  • onclose: recommended, to finalize the UI state when the stream ends

Example in React:
socket.onopen = () => {
  console.log("WebSocket connected!")
};

socket.onmessage = (event) => {
  const receivedMessage = JSON.parse(event.data);
  if (receivedMessage.event === "storyChunk") {
    console.log("Chunk:", receivedMessage.data.message);
  }
};

socket.onerror = (err) => console.error("WebSocket error", err);
socket.onclose = () => console.log("WebSocket closed");
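To actually start generation from the client (e.g. inside onopen), the function call is sent as a JSON message. The envelope below assumes the { event, data } wire format that mirrors the messages received in onmessage — verify the exact format against the @cap-js-community/websocket documentation; buildActionMessage is an illustrative helper, not part of the library:

```javascript
// Build the JSON envelope for calling a CDS action/function over the socket.
function buildActionMessage(name, params) {
  return JSON.stringify({ event: name, data: params });
}

// e.g. inside onopen:
// socket.current.send(buildActionMessage("generateStory", { topic: "a space journey" }));
```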

Summary

What we've achieved

We’ve seen how to integrate LLM output streaming into a CAP service by combining the SAP Cloud SDK for AI orchestration module with community WebSocket support, enabling real-time delivery of AI-generated content directly to a browser frontend.

Steps to achieve

  • Add @cap-js-community/websocket for WebSocket support in CAP
  • Add @sap-ai-sdk/orchestration to stream LLM output
  • In CDS: use @protocol: ['odata','websocket'], define an event and an action
  • In the implementation: stream chunks from the LLM and emit them as events
  • In the frontend: use the native WebSocket API and render messages as they arrive