Technology Blogs by SAP
CAP (the SAP Cloud Application Programming Model) has become the de-facto standard for cloud-native development on BTP. Over the years, it has grown in stability, functionality and usability. With just a few annotations, an entire Fiori Elements app is generated from a data model. With search annotations, a global search can look for keywords in any structured database field.
However, sometimes it helps to search in unstructured data as well, e.g. images or other attached documents.



You have a data model with some master data and a hierarchical structure of related entities, e.g.:

  • AidRequest

    • CompanyDetails

    • FinancialData

    • Attachments

You already have a CAP project with a data schema and a data service, e.g.:

  • db/

    • schema.cds

  • srv/

    • MainService.cds

Your attachment entity uses either HANA as data storage or the Object Store service (as explained in a separate blog post).


Technical Architecture

You can customise the architecture to your needs. Scanning a document with OCR is a task of unknown duration. If you expect large documents, you may want to use batch jobs that run during low-traffic times so that regular operations are not affected. If you expect smaller documents or need the search capability as quickly as possible, event hooks may be the better fit.

In the following, a job-based architecture is used. A dedicated OData Function endpoint retrieves unscanned documents and starts a scan.

This text scan is performed with SAP Document Information Extraction (DOX for short). DOX uses OCR to scan a document and retrieve its text.

The resulting text is stored in a shadow table (let's call it AttachmentText; it has a 1:1 relationship with the Attachment entity). With the new DOX service, you have plenty of options for what to extract.

This blog post uses the basic functionality of storing the entire text.


Service Setup

Luckily, SAP BTP provides you with a booster to jumpstart the development. Just trigger the booster for DOX, and you'll automatically see the entitlements, subscription and service instances. In addition, you can now bind the DOX service instance to your CAP service.


Creating your endpoint

As mentioned earlier, this blog post uses a job-based architecture. However, if you prefer an event-based architecture, you can move the following code fragments into an AFTER handler of the Attachment CREATE event.
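
For the event-based variant, a minimal sketch could look like the following. It assumes the entity is exposed as 'Attachments' (a hypothetical name) and that callOCR and persistOCROutput are the functions developed in the rest of this post; the helper registerScanOnCreate is made up for illustration.

```javascript
// Sketch: trigger the OCR scan right after an attachment has been created.
// 'Attachments' is an assumed entity name; callOCR and persistOCROutput are
// the functions used elsewhere in this post and are passed in here.
function registerScanOnCreate(srv, callOCR, persistOCROutput) {
  srv.after('CREATE', 'Attachments', (attachment) => {
    // Do not await: the scan can take a while and should not block the response.
    Promise.resolve(callOCR(attachment, persistOCROutput)).catch((err) => {
      console.log('OCR scan failed for', attachment.AttachmentId, err.message);
    });
  });
}
```

The scan is deliberately not awaited, so the CREATE request returns immediately and the text becomes searchable a little later.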

For my code, I have used the following npm packages:

  • cfenv

  • axios

  • form-data

  • aws-sdk

To encapsulate the S3 data access, a small utility class can look as follows:
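const AWS = require('aws-sdk');

module.exports = class S3ServiceAccess {
  constructor(objectStoreCredentials) {
    // Assumption: the key names below follow the Object Store (S3) service binding
    const credentials = new AWS.Credentials(
      objectStoreCredentials.access_key_id,
      objectStoreCredentials.secret_access_key
    );
    this._s3 = new AWS.S3({
      region: objectStoreCredentials.region,
      credentials: credentials,
      apiVersion: '2006-03-01',
      s3ForcePathStyle: true
    });
    this._bucket = objectStoreCredentials.bucket;
  }

  // Reads one object from the bucket and resolves to its content as a Buffer
  getObjectAsBuffer(objectKey) {
    const params = {
      Bucket: this._bucket,
      Key: objectKey
    };
    return this._s3.getObject(params).promise().then((data) => data.Body);
  }
};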


With this in place, a function can be declared in the service CDS, like:
service EnrichmentService {
  function scanAttachmentText(batchSize: Integer) returns Boolean;
}


The "scanAttachmentText" method is as follows:
srv.on('scanAttachmentText', async (req) => {
  const attachments = await getAttachmentsToScan(req);
  // batchSize is the function parameter declared in the service CDS
  const limit = req.data.batchSize || 5;
  if (attachments && attachments.length > 0) {
    for (let i = 0; i < Math.min(attachments.length, limit); i++) {
      // stagger the calls: start one scan every 10 seconds
      setTimeout(() => {
        callOCR(attachments[i], persistOCROutput);
      }, i * 10 * 1000);
    }
  }
  return true;
});


In the next step, you create the function "getAttachmentsToScan", which retrieves the attachment rows that have not been scanned yet.
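
The selection logic behind such a function can be sketched in plain JavaScript. This is only an in-memory illustration: in a real CAP handler you would express the same condition as a database query. The helper name selectUnscannedAttachments, the element name Text and the sample rows are assumptions made for this example.

```javascript
// Sketch of the selection logic behind getAttachmentsToScan.
// Assumption: attachments carry an AttachmentId, and every scanned attachment
// has a row with the same AttachmentId in the shadow table AttachmentText.
function selectUnscannedAttachments(attachments, attachmentTexts) {
  const scannedIds = new Set(attachmentTexts.map((row) => row.AttachmentId));
  return attachments.filter((attachment) => !scannedIds.has(attachment.AttachmentId));
}

// Made-up sample data
const attachments = [
  { AttachmentId: 'a1', OriginalFileName: 'balance-sheet.pdf' },
  { AttachmentId: 'a2', OriginalFileName: 'registration.png' },
  { AttachmentId: 'a3', OriginalFileName: 'invoice.pdf' }
];
const attachmentTexts = [{ AttachmentId: 'a2', Text: 'Commercial register' }];

const unscanned = selectUnscannedAttachments(attachments, attachmentTexts);
console.log(unscanned.map((a) => a.AttachmentId)); // [ 'a1', 'a3' ]
```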

The most exciting part is the callOCR function. Before you create it, you need the following definitions in your module:
const cfenv = require('cfenv');
const mlService = cfenv.getAppEnv().getService('default_sap-document-information-extraction');

const axios = require('axios').default;
const FormData = require('form-data');

// AWS Object Store specific
const S3ServiceAccess = require('./util/S3ServiceAccess');
const objectstore = cfenv.getAppEnv().getService('sf-objectstore');
const objectstoreCredentials = objectstore.credentials;
const S3Service = new S3ServiceAccess(objectstoreCredentials);

Then you can write your callOCR function:

  1. Get a token to call the API (fetchDestinationServiceToken)

  2. Retrieve the file as buffer (S3Service.getObjectAsBuffer)

  3. Start a job in DOX

  4. Wait for completion and retrieve the results. Be aware that DOX offers a callback mechanism: the following code snippets do NOT represent best practice and are meant for simplistic setups only. Please use the callback architecture instead.

  5. Persist results

This can look as follows:
async function callOCR(bpItem, cb) {
  try {
    const destinationServiceToken = await fetchDestinationServiceToken();
    const fileBuffer = await S3Service.getObjectAsBuffer(bpItem.AttachmentId);
    const jobsUrl = mlService.credentials.endpoints.backend.url + '/document-information-extraction/v1/document/jobs';
    const authHeaders = { 'Authorization': 'Bearer ' + destinationServiceToken };

    // Job options for DOX; the header and line item field lists are left empty here
    const jobOptions = {
      extraction: { headerFields: [], lineItemFields: [] },
      clientId: 'default',
      documentType: 'invoice',
      receivedDate: '2020-02-17',
      enrichment: {
        sender: { top: 5, type: 'businessEntity', subtype: 'supplier' },
        employee: { type: 'employee' }
      }
    };

    // Multipart request with the options and the file (axios and form-data, as required above)
    const form = new FormData();
    form.append('options', JSON.stringify(jobOptions));
    form.append('file', fileBuffer, { filename: bpItem.OriginalFileName });

    const jobResponse = await axios.post(jobsUrl, form, {
      headers: { ...form.getHeaders(), ...authHeaders }
    });
    const jobId = jobResponse.data.id;

    // wait for final job outcome, 70 seconds
    setTimeout(async () => {
      try {
        const textResponse = await axios.get(jobsUrl + '/' + jobId + '/pages/text', { headers: authHeaders });
        const jsonOutput = textResponse.data;
        // pages -> lines -> word boxes -> content, joined into one string
        const ocrStrings = Object.values(jsonOutput.results)
          .map(page => page.map(line => line.word_boxes))
          .flat(2)
          .map(item => item.content)
          .join(" ");
        cb(ocrStrings, bpItem, fileBuffer);
      } catch (error) {
        console.log("error for file: ", bpItem.AttachmentId);
        cb("", bpItem, fileBuffer);
      }
    }, 70 * 1000);
  } catch (error) {
    // TODO Handle Error
    console.log("OCR call failed for file: ", bpItem.AttachmentId, error.message);
  }
}
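
If the callback architecture is not an option yet, polling the job status is a middle ground between it and a fixed 70-second wait. The following is a minimal sketch, not part of the original code: waitForJob and fetchStatus are hypothetical names, and the status values 'DONE' and 'FAILED' are assumptions about what the DOX job endpoint returns.

```javascript
// Sketch: poll a job status instead of waiting a fixed 70 seconds.
// fetchStatus is a hypothetical function that resolves to the job status;
// the status values 'DONE' and 'FAILED' are assumptions about the DOX API.
async function waitForJob(fetchStatus, { intervalMs = 5000, maxAttempts = 24 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === 'DONE') return true;
    if (status === 'FAILED') return false;
    // any other status: wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false; // gave up after maxAttempts
}
```

In callOCR, fetchStatus would be a small function that reads the job resource with the same bearer token, and the text would only be requested once waitForJob resolves to true.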


async function fetchDestinationServiceToken() {
  let destinationServiceToken = "";

  try {
    const uaaServiceResponse = await axios(getDestinationServiceTokenOptions());
    if (uaaServiceResponse.status === 200) {
      // axios exposes the parsed OAuth response body as "data"
      destinationServiceToken = uaaServiceResponse.data.access_token;
    }
  } catch (uaaServiceError) {
    // TODO Handle Error
    console.log('uaaServiceError', uaaServiceError.message);
  }

  return destinationServiceToken;
}

function getDestinationServiceTokenOptions() {
  const uaaDestinationCredentials = mlService.credentials.uaa.clientid + ':' + mlService.credentials.uaa.clientsecret;
  // Basic authentication with the client credentials of the DOX service binding
  const destinationServiceTokenHeaders = {
    'Authorization': 'Basic ' + Buffer.from(uaaDestinationCredentials).toString('base64'),
    'Content-Type': 'application/x-www-form-urlencoded'
  };

  const destinationServiceTokenOptions = {
    method: 'POST',
    url: mlService.credentials.uaa.url + '/oauth/token',
    headers: destinationServiceTokenHeaders,
    params: { client_id: mlService.credentials.uaa.clientid, grant_type: 'client_credentials' }
  };

  return destinationServiceTokenOptions;
}
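
The flattening of the OCR result in callOCR is easier to follow in isolation. The sketch below runs it on a made-up sample; the response shape (pages, then lines, then word_boxes with a content field) is inferred from the expression used in callOCR, and extractText is a hypothetical helper name.

```javascript
// Sketch: flatten a "pages/text" style response into one search string.
// The sample is made up; its shape (pages -> lines -> word_boxes -> content)
// is inferred from the flattening expression in callOCR.
function extractText(jsonOutput) {
  return Object.values(jsonOutput.results)
    .map((page) => page.map((line) => line.word_boxes))
    .flat(2)
    .map((box) => box.content)
    .join(' ');
}

const sample = {
  results: {
    1: [
      { word_boxes: [{ content: 'Aid' }, { content: 'Request' }] },
      { word_boxes: [{ content: '2020' }] }
    ],
    2: [{ word_boxes: [{ content: 'Balance' }, { content: 'Sheet' }] }]
  }
};

console.log(extractText(sample)); // Aid Request 2020 Balance Sheet
```

The call to flat(2) removes two levels of nesting (pages and lines), so that only the word boxes remain before their content is joined.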


Annotate searchable fields

Now you can persist the results of the OCR scan in the database and annotate the resulting text field as searchable.
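
The persistOCROutput callback used above is not shown in this post. A minimal sketch, under the assumption that the shadow table has the elements AttachmentId and Text, could look like this. The factory createPersistOCROutput and the injected insertRow function are hypothetical; in a CAP service, insertRow would wrap the actual database insert.

```javascript
// Sketch of a persistOCROutput callback with the signature used by callOCR:
// (ocrText, bpItem, fileBuffer). insertRow is a hypothetical function that
// writes one row to the shadow table AttachmentText.
function createPersistOCROutput(insertRow) {
  return async function persistOCROutput(ocrText, bpItem) {
    const row = {
      AttachmentId: bpItem.AttachmentId, // 1:1 link to the attachment
      Text: ocrText                      // assumed element name for the OCR text
    };
    await insertRow(row);
    return row;
  };
}
```

Because callOCR passes an empty string when the scan fails, such a row also marks the attachment as processed, so it is not picked up again by the next job run.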


I hope you enjoyed this blog post and you made your CAP application more powerful.