The Problem: In most SAP organizations, helpdesk teams manually categorize incoming tickets—assigning the right SAP module, setting priority levels, and flagging escalations. It's repetitive work that slows down response times.
The Solution: Replace this manual triage with an AI agent.
In this guide, I'll show you how to build an agent using Google GenKit and SAP-RPT-1 that automatically analyzes ticket descriptions and predicts module, priority, and escalation status. We'll explore why a simple LLM prompt isn't enough—and how a tool-based architecture makes this production-ready.
Note: This is a PoC-level implementation. Since most AI + SAP examples focus on CAP integrations, I wanted to explore a different orchestration framework. Let's see what GenKit brings to the table!
# Create and enter the project directory
mkdir my-genkit-app
cd my-genkit-app
# Initialize a Node.js project
npm init -y
# Set up the source directory and main file
mkdir src
touch src/index.ts
# Install and configure TypeScript
npm install -D typescript tsx
npx tsc --init

# Install Genkit CLI globally
npm install -g genkit-cli
# Install local project dependencies
npm install genkit @genkit-ai/google-genai dotenv

Create a .env file in the root directory:
# .env
GEMINI_API_KEY=your-actual-gemini-api-key-here
SAP_RPT_API_KEY=your-actual-rpt-sandbox-api-key-here

Get Your API Keys:
Why GenKit? It is a lightweight, model-agnostic orchestration framework from Google. It's perfect for implementing AI-powered experiences, provides built-in observability, and works across any cloud provider.
A Quick Note on SAP-RPT-1
Before we dive in, I need to give props to whoever at SAP decided to create the RPT-1 sandbox. Seriously—this person deserves a promotion.
Historically, the biggest barrier to experimenting with SAP's AI/ML tools has been access. Everything was locked behind enterprise licenses, complex provisioning, or buried in AI Core documentation. You'd spend more time getting credentials than actually building anything.
SAP-RPT-1 flips this. You can log in to the sandbox with SSO or email, get an API key in 1 minute, and start making predictions.
What is SAP-RPT-1?
RPT-1 (Relational Pre-Trained Transformer) is SAP's first open-source AI model, specifically designed for tabular data prediction. Unlike general-purpose LLMs that are trained on text, RPT-1 understands structured data—think spreadsheets, database tables, CSV files.
For our use case (ticket classification), this is perfect. We have historical tickets with columns like SAP_MODULE, PRIORITY, DESCRIPTION, and ESCALATION. RPT-1 can look at patterns in this structured data and predict missing values with surprising accuracy.
The model works by taking a set of context rows (your historical data) plus a query row in which the fields you want predicted are marked with [PREDICT] placeholders; it then fills in those placeholders based on the patterns it finds in the context.

Think of it as a specialized AI that's really good at one thing: understanding how values in one column relate to values in other columns.
Why Use RPT-1 Instead of Just Gemini?
You might wonder: "Why bother with RPT-1 when Gemini can read tables too?" Valid question. Here's why:
In practice, we'll see that Gemini + RPT-1 as a tool beats Gemini-only approaches by a factor of 36x in token efficiency while improving accuracy.
Before we start, a quick word on GenKit's core concept: Flows.
A Flow is a structured AI action with defined input/output schemas. You specify what goes in and what should come out, and GenKit handles the rest. This gives you type-safety, built-in UI for testing, and observability out of the box.
Basic structure:
export const myFlow = ai.defineFlow(
  {
    name: 'FlowName',
    inputSchema: z.object({ ... }),
    outputSchema: z.object({ ... }),
  },
  async (input) => {
    // Your logic here
    return output;
  }
);

That's all you need to know for now. Let's build.
File: src/index.ts
// src/index.ts - Attempt 1: Naive LLM-only approach
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';
import { googleAI } from '@genkit-ai/google-genai';

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', { temperature: 0.8 }),
});

// Input schema
const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description'),
});

// Output schema
const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean(),
});

// Define the AI Flow
export const ticketPredictFlow = ai.defineFlow(
  {
    name: 'TicketPredictFlow',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert.
Based on ${input.description} predict output schema fields (SAP Module, Priority).
If we need to act quickly raise escalation flag.`;

    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);

async function main() {}
main().catch(console.error);

Run it:
genkit start -- npx tsx --watch src/index.ts

GenKit comes with a built-in developer UI where you can test flows, debug tool calls, and trace execution in real time. Navigate to localhost:4000 in your browser.
Test in Genkit UI:
{"description": "Can't login to the development system"}

Result:

sapModule: "BASIS"
priority: "Critical"

The LLM lacks business context—it doesn't know that "can't login to DEV" is typically Medium priority, not Critical.
The next logical step was to provide historical context to the LLM. I generated a 1,000-ticket history.json file to include in the prompt.
But here's something that surprised me: JSON is actually not optimal for LLM prompts.
I know, I know—JSON is everywhere, it's the standard for APIs, we use it without thinking. But when you're feeding data into an LLM context window, JSON is incredibly wasteful. All those quotes, brackets, commas, and structural overhead? Those are tokens. And tokens cost money.
Think about it: {"name": "value"} uses way more tokens than just name: value. Multiply that by 1,000 rows of ticket data, and you're burning through your token budget on syntax instead of actual content.
Enter TOON (Token-Oriented Object Notation). It's a format specifically designed for LLM prompts—strips away JSON's bloat while keeping data readable. The savings? 30-60% fewer tokens for the same data.
For our use case where we're feeding 1,000 historical tickets on every request, this actually matters.
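To make the overhead concrete, here is a small illustration in TypeScript. Note this is not TOON's actual output format, just a hand-rolled "header plus rows" layout in the same spirit, to show why stripping JSON's structural syntax shrinks the payload:

```typescript
// Rough illustration of JSON's structural overhead (not TOON's real format).
// We compare a JSON array of tickets against a compact header-plus-rows layout.
const tickets = [
  { TICKET_ID: "TKT_001", SAP_MODULE: "BASIS", PRIORITY: "Medium" },
  { TICKET_ID: "TKT_002", SAP_MODULE: "FI", PRIORITY: "Critical" },
];

// JSON repeats every key name, quote, brace, and comma for every row.
const asJson = JSON.stringify(tickets);

// Compact form: one header line, then one comma-separated line per row.
const header = Object.keys(tickets[0]).join(",");
const rows = tickets.map((t) => Object.values(t).join(","));
const asCompact = [header, ...rows].join("\n");

console.log(asJson.length, asCompact.length); // the compact form is noticeably shorter
```

The savings grow with row count, because the keys are written once instead of once per row, which is exactly the effect TOON exploits.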
First, install TOON for token optimization:
npm install @toon-format/toon

Create a sample src/history.json file:
[
  {
    "TICKET_ID": "TKT_001",
    "SAP_MODULE": "BASIS",
    "PRIORITY": "Medium",
    "DESCRIPTION": "Cannot login to development system",
    "ESCALATION": false
  },
  {
    "TICKET_ID": "TKT_002",
    "SAP_MODULE": "FI",
    "PRIORITY": "Critical",
    "DESCRIPTION": "Payment posting failed in production",
    "ESCALATION": true
  },
  {
    "TICKET_ID": "TKT_003",
    "SAP_MODULE": "MM",
    "PRIORITY": "Low",
    "DESCRIPTION": "Purchase order print layout issue",
    "ESCALATION": false
  }
]

File: src/index.ts (Updated)
// src/index.ts - Attempt 2: LLM + Historical Context (inefficient)
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';
import { googleAI } from '@genkit-ai/google-genai';
import { encode } from '@toon-format/toon';
import historicalDataJson from './history.json' with { type: 'json' };

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', { temperature: 0.8 }),
});

// Convert historical data to TOON format (saves 30-60% tokens)
const historicalDataToon = encode(historicalDataJson);

const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description'),
});

const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean(),
});

export const ticketPredictFlowWithHistory = ai.defineFlow(
  {
    name: 'TicketPredictFlowWithHistory',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert.
Historical ticket data (TOON format):
${historicalDataToon}

Based on the historical patterns above and this new ticket description: "${input.description}"
Predict the SAP Module, Priority, and whether escalation is needed.`;

    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);

async function main() {}
main().catch(console.error);

Result:
⏱️ Time: 12.72 seconds
🪙 Tokens: 22,993 input tokens

Even with TOON optimization, this approach proved to be highly inefficient. Feeding 1,000 tickets to the LLM on every request is fundamentally the wrong architecture—no amount of token optimization can fix that. It's slow, expensive, and doesn't scale.
After the token disaster in Attempt 2, I had to rethink the architecture. The problem wasn't the prompt—it was the fundamental approach.
The Key Insight: Separation of Concerns
Instead of making the LLM do everything (understanding context + making predictions), what if we split responsibilities?
This is analogous to how you'd structure a real team. You don't ask a senior architect to also manually check every ticket priority. The architect decides "we need a priority prediction for this ticket," then delegates to a specialist who has the historical knowledge.
How SAP-RPT-1 API Actually Works
RPT-1 is designed specifically for this pattern. Here's the mental model:
You send a POST request to https://rpt.cloud.sap/api/predict with a body containing:

- Context rows: your historical tickets, with every column filled in
- A query row with [PREDICT] placeholders for fields you want the model to fill in

Example query row:
{
  "TICKET_ID": "TKT_NEW",
  "SAP_MODULE": "[PREDICT]",    // ← RPT-1 will predict this
  "PRIORITY": "[PREDICT]",      // ← RPT-1 will predict this
  "DESCRIPTION": "Can't login", // ← We provide this
  "ESCALATION": "[PREDICT]"     // ← RPT-1 will predict this
}

The model looks at patterns in your context rows and predicts the missing values. It's like asking: "Given all these examples where Description X correlated with Module Y and Priority Z, what should the values be for this new description?"
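The response nesting is easy to trip over when you unwrap it. Here is a small sketch of mapping an RPT-1-style response to a flat object; the shape (`prediction.predictions[0].COLUMN[0].prediction`, with ESCALATION returned as a stringified float) is assumed from what worked against the sandbox in this guide, not an official contract, so verify it against your own responses:

```typescript
// Assumed response shape -- verify against the live sandbox output.
interface RptPrediction {
  [column: string]: { prediction: string }[];
}
interface RptResponse {
  prediction: { predictions: RptPrediction[] };
}

function mapPrediction(data: RptResponse, description: string) {
  const p = data.prediction.predictions[0];
  return {
    description,
    sapModule: p.SAP_MODULE[0].prediction,
    priority: p.PRIORITY[0].prediction,
    // ESCALATION comes back as a stringified float ("1.0" / "0.0") here
    needEscalation: p.ESCALATION[0].prediction === "1.0",
  };
}

// Example with a mocked response:
const mock: RptResponse = {
  prediction: {
    predictions: [{
      SAP_MODULE: [{ prediction: "BASIS" }],
      PRIORITY: [{ prediction: "Medium" }],
      ESCALATION: [{ prediction: "0.0" }],
    }],
  },
};
const result = mapPrediction(mock, "Can't login");
```

Isolating this mapping in one function keeps the tool code readable and gives you a single place to fix things if the API shape changes.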
Why This Architecture Wins
Compare the two approaches:
Attempt 2 (Bad):

User Request → LLM with 1,000 tickets in prompt → Prediction
                         ↑
           (22,993 tokens, 12.7s, $$$)

Attempt 3 (Good):

User Request → LLM decides: "I need a prediction"
             → LLM calls RPT-1 tool (sends 1,000 tickets to specialized API)
             → RPT-1 returns prediction
             → LLM formats response
                         ↑
           (639 tokens, 9.3s, $)

The historical data never enters the LLM's context window. It goes directly to the specialized prediction model that was built for this exact job.
GenKit's Role: Tools
GenKit makes this pattern easy through its "Tools" concept. A tool is just a function with a name, a description, and typed input/output schemas that the LLM can choose to call when it decides it needs that capability.
When you give the LLM access to tools, it automatically decides when to use them based on the prompt. You'll see this in the GenKit UI trace—the LLM literally "thinks": "I need to predict ticket properties, I should call the predictTicket tool."
This is the magic of agentic AI: the LLM becomes a coordinator, not a do-everything oracle.
Now let's implement this. We'll create a GenKit tool that wraps the RPT-1 API call:
File: src/index.ts (Final Version)
// src/index.ts - Final: LLM Orchestrator + SAP-RPT-1 Tool
import { googleAI } from '@genkit-ai/google-genai';
import { genkit, z } from 'genkit';
import * as dotenv from 'dotenv';

dotenv.config();

const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.5-flash', {
    temperature: 0.8,
  }),
});

// ===== SCHEMAS =====
const TicketInputSchema = z.object({
  description: z.string().describe('Ticket description')
});

const TicketSchema = z.object({
  description: z.string(),
  sapModule: z.enum([
    "FI", "CO", "MM", "SD", "PP", "WM", "EWM",
    "HCM", "BW", "CRM", "PM", "QM", "BASIS",
    "SECURITY", "ADMIN"
  ]),
  priority: z.enum(["Critical", "High", "Medium", "Low"]),
  needEscalation: z.boolean()
});

// ===== SAP-RPT-1 TOOL =====
export const predictTicketWithSAPRPT1 = ai.defineTool({
  name: 'predictTicket',
  description: 'Predicts SAP service ticket properties using SAP-RPT-1 AI model',
  inputSchema: TicketInputSchema,
  outputSchema: TicketSchema
}, async (input) => {
  try {
    // Load historical data (in production, this could come from a database)
    const historicalDataJson = (await import('./history.json', {
      with: { type: 'json' }
    })).default;

    // Create prediction row with [PREDICT] markers
    const predictionTicket = {
      TICKET_ID: "TKT_PREDICT",
      SAP_MODULE: "[PREDICT]",
      PRIORITY: "[PREDICT]",
      DESCRIPTION: input.description,
      ESCALATION: "[PREDICT]",
    };

    // Build request: historical context + new ticket
    const body = {
      rows: [...historicalDataJson, predictionTicket],
      index_column: "TICKET_ID",
    };

    // Call SAP-RPT-1 API
    const response = await fetch('https://rpt.cloud.sap/api/predict', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${process.env.SAP_RPT_API_KEY}`,
      },
      body: JSON.stringify(body)
    });

    if (!response.ok) {
      throw new Error(`SAP-RPT API error: ${response.status} ${response.statusText}`);
    }

    const data = await response.json();
    const prediction = data.prediction.predictions[0];

    // Map RPT-1 response to our schema
    return {
      description: input.description,
      sapModule: prediction.SAP_MODULE[0].prediction,
      priority: prediction.PRIORITY[0].prediction,
      needEscalation: prediction.ESCALATION[0].prediction === "1.0"
    };
  } catch (error) {
    console.error('Error calling SAP-RPT API:', error);
    throw new Error(`Failed to get prediction from SAP-RPT: ${error instanceof Error ? error.message : 'Unknown error'}`);
  }
});

// ===== MAIN AGENT FLOW =====
export const serviceTicketAgent = ai.defineFlow(
  {
    name: 'ServiceTicketAgent',
    inputSchema: TicketInputSchema,
    outputSchema: TicketSchema,
  },
  async (input) => {
    const prompt = `Act as Service Desk expert. Analyze ticket description: ${input.description}.
Predict and return JSON with:
- sapModule (FI, CO, MM, SD, etc.)
- priority (Critical, High, Medium, Low)
- needEscalation: true if downtime, data loss, security risk, or critical response needed
Use historical patterns for consistent routing decisions.`;

    // Genkit automatically decides when to call the tool
    const { output } = await ai.generate({
      prompt,
      output: { schema: TicketSchema },
      tools: [predictTicketWithSAPRPT1]
    });

    if (!output) throw new Error('Failed to predict');
    return output;
  },
);

async function main() {}
main().catch(console.error);

Project structure:

my-genkit-app/
├── src/
│ ├── index.ts # Main code
│ └── history.json # Historical tickets
├── .env # API keys (don't commit!)
├── .gitignore # Add .env here
├── package.json
└── tsconfig.json

| Attempt | Time | Input Tokens | Cost |
|---|---|---|---|
| 1: Naive LLM | ~3s | 150 | |
| 2: LLM + History | 12.7s | 22,993 | $$$ |
| 3: LLM + RPT-1 Tool | 9.3s | 639 | $ |
Winner: Attempt 3 - 36x more token-efficient than Attempt 2
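The 36x figure follows directly from the token counts in the comparison:

```typescript
// Sanity-check the token-efficiency claim from the comparison table.
const historyPromptTokens = 22_993; // Attempt 2: history stuffed into the prompt
const toolPromptTokens = 639;       // Attempt 3: history routed through the RPT-1 tool
const ratio = historyPromptTokens / toolPromptTokens;
console.log(Math.round(ratio)); // → 36
```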
Use This Architecture When: you have historical, structured (tabular) data whose column patterns should drive predictions, such as classification, routing, or priority scoring, and that context is far too large to stuff into every prompt.

Don't Use This For: free-form text generation, or tasks with no historical tabular data to learn from; a plain LLM call handles those fine.
Found this helpful? Share your thoughts in the comments or connect with me to discuss AI integration.