I guess the best way to start is with a demonstration:
Recently, I attended an AI development training that covered many aspects of AI integration and development. One of them was a big, yet slightly blurry, topic: how to steer an AI model's behavior so it does what you want (and certainly doesn't do what you don't want). As I am primarily a backend and integration developer, I thought it might be interesting to combine this knowledge with some Fiori/CAP training material. And here it is: as you've seen in the demo, an integrated chat built with native SAPUI5 controls, embedded into the launchpad. My goal was to build something fulfilling these criteria:
The app/chat must be available on the launchpad's main screen (I decided to build a shell plugin).
It must use already available SAPUI5 controls (and it does, with some drawbacks).
It should utilize the integration flow built in CI (SAP Cloud Integration).
The model must be configured to:
Answer only technical and organizational questions.
Operate based on the provided context.
And, as you can see, it is a moderate success. At least, all criteria are met (the question about translation checks whether the model is aware of its role) :). In this blog, I'll describe the general application architecture and prompt engineering. My idea is to create a blog series, with separate posts describing each segment of this development: the front-end app, the middleware (CI), and the backend (ABAP).
You can find the simplified architecture in the diagram below:
This solution uses CI as its main processing unit, which is responsible for:
Creating system prompts (more about it later).
Fetching context for the GPT model: HANA apps, team structure, and members.
Calling the OpenAI API and passing the parsed response back to the plugin.
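To make the last step concrete, here is a rough sketch of the request body such an integration flow could assemble before calling the OpenAI Chat Completions API. The function name, the context fields, and the model name are my assumptions for illustration, not the actual implementation:

```javascript
// Hypothetical sketch: assemble a Chat Completions request from the
// system prompt, the fetched context, and the user's chat question.
function buildChatRequest(systemPrompt, context, userQuestion) {
  return {
    model: "gpt-4o-mini", // any chat-capable model would work here
    messages: [
      // System message: role definition, rules, and the fetched context
      { role: "system", content: `${systemPrompt}\n\nContext:\n${context}` },
      // The user's question, forwarded from the shell plugin
      { role: "user", content: userQuestion }
    ]
  };
}

const request = buildChatRequest(
  "You are an assistant answering only technical and organizational questions.",
  "Team: Integration; Apps: HANA app A, HANA app B",
  "Who maintains HANA app A?"
);
console.log(request.messages.length); // 2
```

The key point is that the system message is rebuilt in the middleware on every call, so the user never sees (or controls) it.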
Each call can be described by the following sequence:
Which... is not entirely true, as there are some mocks 🙂 But we'll get to that part later. Basically, this is a very inefficient solution, both in terms of performance and cost. This, of course, can be improved, and I hope it will be the subject of a follow-up post (for example, we could detect the purpose of the question earlier and, based on that, fetch only the necessary context).
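That optimization could be as simple as a keyword-based topic check that decides which context to fetch before the model is called. The topics and keywords below are made up for illustration; a real version might use a cheap classification call instead:

```javascript
// Hypothetical sketch: classify the question first, then fetch only the
// context that topic needs instead of fetching everything on every call.
const CONTEXT_TOPICS = [
  { topic: "apps", keywords: ["app", "hana", "application"] },
  { topic: "team", keywords: ["team", "member", "who", "structure"] }
];

function detectTopics(question) {
  const q = question.toLowerCase();
  return CONTEXT_TOPICS
    .filter(t => t.keywords.some(k => q.includes(k)))
    .map(t => t.topic);
}

console.log(detectTopics("Who is in the integration team?")); // ["team"]
console.log(detectTopics("List the HANA apps"));              // ["apps"]
```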
Prompts are a very important part of GPT model interaction, and prompt engineering is a broad topic (which I only scratch the surface of). Generally, it is the main method developers can use to influence the model's behavior, and the only one used in this case.
The general prompt structure can be found in the above picture (thanks, aidevs!). Not all of it needs to be used, and I also skip one part. However, this is something the model is fed before each request (chat question) goes to the API. This part is handled by CI (which works a little like the backend for this solution). As developers, we want to control what the model does and what the user can see, so prompt manipulations should be handled away from the user (in this case, away from the shell plugin). I'll also focus on that in the CI-related blog entry.
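To give a feel for it, here is a sketch of how such a system prompt could be assembled in the middleware. The section names follow a common role/rules/context pattern; the exact structure and wording used in the real solution may differ:

```javascript
// Hypothetical sketch: build the system prompt from fixed sections plus
// the context fetched for this particular request.
function buildSystemPrompt(context) {
  return [
    "## Role",
    "You are an internal assistant embedded in the SAP launchpad.",
    "## Rules",
    "- Answer only technical and organizational questions.",
    "- Use only the provided context; if the answer is not there, say so.",
    "- Politely refuse unrelated requests (e.g. translations).",
    "## Context",
    context
  ].join("\n");
}

const prompt = buildSystemPrompt("Team: Integration; Members: A, B, C");
console.log(prompt.startsWith("## Role")); // true
```

Because this string is built server-side on every call, the rules stay fixed even if the user tries to talk the model out of its role.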
This is only an introduction and a demonstration. In upcoming blog posts, I'll focus on and explain each element of this simple solution: