Technology Blogs by Members
MioYasuatke
Active Contributor

Introduction


Document Information Extraction is a service on SAP BTP that uses machine learning to extract information from business documents. You upload a document such as an invoice or a purchase order and receive the extracted fields in return.

The purpose of this blog post is to demonstrate how to integrate Document Information Extraction with a UI5 application. We will upload an invoice and display the extracted information in the app.

The code is available on GitHub as always.

Application behavior


When you upload an invoice PDF, the app posts the file to Document Information Extraction. Next, press the "Refresh" button until the extraction job finishes. Finally, the extracted data is displayed on the screen.

You can download sample invoices from the following tutorial page.
Use Machine Learning to Extract Information from Documents with Document Information Extraction Tria...



 

Prerequisites for running the application



  • An instance of Document Information Extraction and its service key (you can run the booster to create them automatically)

  • A destination pointing to the Document Information Extraction API



Destination

Property            Value
Name                doc-info-extraction
Type                HTTP
URL                 "url" in the service key + /document-information-extraction/v1
Proxy Type          Internet
Authentication      OAuth2ClientCredentials
Client ID           "uaa.clientid" in the service key
Client Secret       "uaa.clientsecret" in the service key
Token Service URL   "uaa.url" in the service key + /oauth/token
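
As a rough illustration of how the destination values are derived from the service key, here is a minimal sketch. The service key fields ("url", "uaa.clientid", "uaa.clientsecret", "uaa.url") are the ones named in the table; the property keys of the result (ProxyType, clientId, tokenServiceURL and so on) are my assumption about how a destination export file names them, and the hosts and credentials are placeholders.

```typescript
// Relevant part of a Document Information Extraction service key,
// using the field names quoted in the destination table.
interface ServiceKey {
  url: string;
  uaa: { clientid: string; clientsecret: string; url: string };
}

// Destination properties; the key names are an assumption, so compare
// them with a destination exported from your own subaccount.
interface DestinationProperties {
  Name: string;
  Type: "HTTP";
  URL: string;
  ProxyType: "Internet";
  Authentication: "OAuth2ClientCredentials";
  clientId: string;
  clientSecret: string;
  tokenServiceURL: string;
}

// Drop trailing slashes so the appended path never produces "//".
const trimSlash = (s: string): string => s.replace(/\/+$/, "");

function buildDestination(key: ServiceKey): DestinationProperties {
  return {
    Name: "doc-info-extraction",
    Type: "HTTP",
    URL: `${trimSlash(key.url)}/document-information-extraction/v1`,
    ProxyType: "Internet",
    Authentication: "OAuth2ClientCredentials",
    clientId: key.uaa.clientid,
    clientSecret: key.uaa.clientsecret,
    tokenServiceURL: `${trimSlash(key.uaa.url)}/oauth/token`,
  };
}

// Usage with placeholder values (not real hosts or credentials)
const destination = buildDestination({
  url: "https://dox.example.com/",
  uaa: { clientid: "my-client", clientsecret: "my-secret", url: "https://auth.example.com" },
});
console.log(destination.URL);
console.log(destination.tokenServiceURL);
```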


API used for the application


The application uses two API endpoints of Document Information Extraction. You can find the API documentation here.

POST /document/jobs

This endpoint uploads a file together with options that tell the service what type of document you are uploading and which fields you want back. Instead of listing individual fields, you can also specify a template, which you need to define beforehand. For more information, please refer to the documentation.

The following screenshot shows a request executed from Postman. The Content-Type header is set to multipart/form-data.
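
For readers without Postman at hand, the same request can be sketched in code. This is only an illustration: buildJobRequest is a hypothetical helper, the form field names "file" and "options" are taken from the UI5 code later in this post, and the file content is dummy data.

```typescript
// Build the init object for fetch() when calling POST /document/jobs.
function buildJobRequest(
  file: Blob,
  fileName: string,
  options: object,
): { method: "POST"; body: FormData } {
  const body = new FormData();
  body.append("file", file, fileName);
  // The options travel as a JSON string in their own form field.
  body.append("options", JSON.stringify(options));
  // No Content-Type header is set here: given a FormData body, fetch adds
  // "multipart/form-data" together with the required boundary parameter.
  return { method: "POST", body };
}

// Usage with a dummy in-memory file (placeholder bytes, not a real invoice)
const jobRequest = buildJobRequest(
  new Blob(["%PDF-1.4"], { type: "application/pdf" }),
  "invoice.pdf",
  {
    clientId: "default",
    documentType: "invoice",
    extraction: { headerFields: ["documentNumber"], lineItemFields: [] },
  },
);
console.log(jobRequest.body.get("options"));
```

The returned object can be passed as the second argument of fetch, with the job endpoint URL as the first.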



GET /document/jobs/{id}

As you can see in the picture above, the POST request returns an id, which you can use to retrieve the extraction results. At first, the status may be "RUNNING".


After some time (say, 10 seconds), the status will become "DONE" and you will get extraction results.
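
To make the shape of the result more concrete, here is a minimal sketch of the parts of the response that the app reads. The property names (status, extraction, headerFields, lineItems, name, value) are the ones used in the UI5 code of this post; the "FAILED" status is my assumption, and the sample values are made up.

```typescript
// One extracted field; only name and value are used by this post's code.
interface ExtractedField {
  name: string;
  value: string | number;
}

// Response of GET /document/jobs/{id}, reduced to what the app reads.
// "PENDING", "RUNNING" and "DONE" appear in this post; "FAILED" is an assumption.
interface JobResult {
  id: string;
  status: "PENDING" | "RUNNING" | "DONE" | "FAILED";
  extraction?: {
    headerFields: ExtractedField[];
    lineItems: ExtractedField[][];
  };
}

// True while the job has not reached a final state and should be queried again.
function isInProgress(job: JobResult): boolean {
  return job.status === "PENDING" || job.status === "RUNNING";
}

// Collapse a list of { name, value } fields into a plain object.
function toRecord(fields: ExtractedField[]): Record<string, string | number> {
  const record: Record<string, string | number> = {};
  for (const field of fields) {
    record[field.name] = field.value;
  }
  return record;
}

// Made-up sample data for illustration only
const sampleJob: JobResult = {
  id: "hypothetical-job-id",
  status: "DONE",
  extraction: {
    headerFields: [
      { name: "documentNumber", value: "INV-001" },
      { name: "grossAmount", value: 120.5 },
    ],
    lineItems: [[{ name: "description", value: "Widget" }, { name: "quantity", value: 2 }]],
  },
};

if (!isInProgress(sampleJob) && sampleJob.extraction) {
  console.log(toRecord(sampleJob.extraction.headerFields));
  console.log(sampleJob.extraction.lineItems.map(toRecord));
}
```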


 

 

UI5 code


The key parts are as follows.

  1. Uploading a file to Document Information Extraction

  2. Retrieving extraction results


I used the ts-app (TypeScript) template of generator-ui5.
Please note that the app needs to be deployed to BTP to function.

1. Uploading a file to Document Information Extraction


When you press the "Upload" button, the app gets the selected file and posts it to the /document/jobs endpoint. After a successful upload, you get an id, which you'll use later to fetch the extraction results.
	public async handleUploadPress(): Promise<void> {
		if (this._jobId) {
			// a previous job exists: ask before discarding its data
			MessageBox.confirm((this.getResourceBundle() as ResourceBundle).getText("confirmText"), {
				onClose: async (oAction: string) => {
					if (oAction === "OK") {
						this._resetData()
						await this._uploadImage()
					}
				}
			})
		} else {
			await this._uploadImage()
		}
	}

	private async _uploadImage(): Promise<void> {
		// prepare form data
		const oFileUploader = this.byId("fileUploader") as FileUploader
		// oFileUpload is the control's internal <input type="file"> element (not public API)
		const oUploadedFile = oFileUploader.oFileUpload.files[0] as File
		if (!oUploadedFile) {
			// nothing selected yet
			return
		}
		const blob = new Blob([oUploadedFile], { type: oUploadedFile.type })

		const formData = new FormData()
		formData.append("file", blob, oUploadedFile.name)

		const options = (this.getOwnerComponent().getModel("options") as JSONModel).getData() as Options
		formData.append("options", JSON.stringify(options))

		// call Document Information Extraction (DIE) and remember the job id
		const response = await this._postToDie(formData)
		this._jobId = response.id;

		// enable refresh button
		(this.getView().getModel("viewModel") as JSONModel).setProperty("/refreshEnabled", true)
	}

	private async _postToDie(formData: FormData): Promise<{ id: string }> {
		const dieUrl = this._getbaseUrl() + "/document/jobs"
		// Content-Type is not set explicitly; fetch derives it from the FormData body
		const response = await fetch(dieUrl, {
			method: "POST",
			body: formData
		})
		if (!response.ok) {
			throw new Error(`Upload failed with HTTP status ${response.status}`)
		}
		return response.json()
	}

	private _getbaseUrl(): string {
		const appId = this.getOwnerComponent().getManifestEntry("/sap.app/id")
		const appPath = appId.replaceAll(".", "/")
		// jQuery.sap.getModulePath is deprecated; sap.ui.require.toUrl(appPath) is its replacement
		const appModulePath = jQuery.sap.getModulePath(appPath) as string
		// path prefix under which the deployed app reaches the doc-info-extraction destination
		return appModulePath + "/doc-info-extraction"
	}
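
The base URL above only resolves once the app runs on BTP, because something has to forward requests under /doc-info-extraction to the destination. In an HTML5 application this is typically done by a route in xs-app.json. The following sketch models such a route as a TypeScript object and shows how the path would be rewritten; the route values and the rewritePath helper are assumptions for illustration and are not taken from the sample repository.

```typescript
// Route entry modelled on the xs-app.json format of the application router.
// The concrete values are assumptions, not the author's configuration.
interface Route {
  source: string;      // regular expression matched against the request path
  target: string;      // replacement path; $1 refers to the first capture group
  destination: string; // name of the BTP destination to forward to
}

const dieRoute: Route = {
  source: "^/doc-info-extraction/(.*)$",
  target: "/$1",
  destination: "doc-info-extraction",
};

// Return the path that is forwarded to the destination,
// or undefined if the route does not match the request path.
function rewritePath(route: Route, requestPath: string): string | undefined {
  const pattern = new RegExp(route.source);
  return pattern.test(requestPath) ? requestPath.replace(pattern, route.target) : undefined;
}

// Path as the app requests it, relative to the app's own root
const forwardedPath = rewritePath(dieRoute, "/doc-info-extraction/document/jobs");
console.log(forwardedPath);
```

With such a route, the forwarded path is appended to the destination URL, so the request ends up at the /document/jobs endpoint of the service.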

To post a job to Document Information Extraction, an "options" object is required, as described in the "API used for the application" section. For this sample app, the options are configured as below.
{
	"clientId": "default",
	"extraction": {
		"headerFields": [
			"documentNumber",
			"purchaseOrderNumber",
			"documentDate",
			"dueDate",
			"grossAmount",
			"currencyCode"
		],
		"lineItemFields": [
			"description",
			"quantity",
			"unitOfMeasure",
			"unitPrice",
			"netAmount"
		]
	},
	"documentType": "invoice"
}
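
The upload code casts the model data to an Options type that is not shown in this post. A minimal sketch of what such a type could look like, derived only from the JSON above, together with a hypothetical runtime check (isOptions), might be:

```typescript
// Type for the "options" form field, derived from the sample JSON in this post.
interface Options {
  clientId: string;
  documentType: string;
  extraction: {
    headerFields: string[];
    lineItemFields: string[];
  };
}

// Narrowing check for data loaded at runtime, for example from a JSON model.
function isOptions(value: unknown): value is Options {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const extraction = v.extraction as Record<string, unknown> | undefined;
  const isStringArray = (a: unknown): boolean =>
    Array.isArray(a) && a.every((s) => typeof s === "string");
  return (
    typeof v.clientId === "string" &&
    typeof v.documentType === "string" &&
    typeof extraction === "object" &&
    extraction !== null &&
    isStringArray(extraction.headerFields) &&
    isStringArray(extraction.lineItemFields)
  );
}

// Usage: validate parsed JSON before sending it to the service
const loadedOptions: unknown = JSON.parse(
  '{"clientId":"default","documentType":"invoice","extraction":{"headerFields":["documentNumber"],"lineItemFields":["description"]}}',
);
console.log(isOptions(loadedOptions));
```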

 

2. Retrieving extraction results


When you press the "Refresh" button on the screen, the app fetches the extraction status from the /document/jobs/{id} endpoint. If the job is done, the extracted fields are stored in the "invoice" model and displayed on the UI.

* In a real-world scenario, it would be preferable to retrieve the results automatically, rather than having the user refresh the page.
	public async onRefresh(): Promise<void> {
		const response = await this._getStatus()
		if (response.status === "DONE") {
			this._setInvoiceData(response.extraction)
			const viewModel = this.getView().getModel("viewModel") as JSONModel
			viewModel.setProperty("/refreshEnabled", false)
			viewModel.setProperty("/editable", true)
		} else if (response.status === "PENDING" || response.status === "RUNNING") {
			// the job has not finished yet: ask the user to try again later
			MessageToast.show((this.getResourceBundle() as ResourceBundle).getText("pendingText"))
		}
	}

	private async _getStatus(): Promise<any> {
		const dieUrl = this._getbaseUrl() + "/document/jobs/" + this._jobId
		const response = await fetch(dieUrl, {
			method: "GET"
		})
		return response.json()
	}

	private _setInvoiceData(extractedData: any): void {
		// turn a list of { name, value } fields into a plain object
		const toObject = (fields: Item[]): Record<string, unknown> =>
			fields.reduce((acc, curr) => {
				acc[curr.name] = curr.value
				return acc
			}, {} as Record<string, unknown>)

		const invoice = {
			// set header
			header: toObject(extractedData.headerFields as Item[]),
			// set items: one object per line item
			items: (extractedData.lineItems as Item[][]).map(toObject)
		};

		(this.getView().getModel("invoice") as JSONModel).setData(invoice)
	}
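
As noted above, a real-world app would rather fetch the results automatically. One possible approach is simple polling, sketched below under my own assumptions: pollUntilFinished is a hypothetical helper, the interval and attempt limit are arbitrary defaults, and any status other than "PENDING" or "RUNNING" is treated as final.

```typescript
interface JobStatus {
  status: string;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Call getStatus repeatedly until the job leaves PENDING/RUNNING
// or the maximum number of attempts is reached.
async function pollUntilFinished<T extends JobStatus>(
  getStatus: () => Promise<T>,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await getStatus();
    if (job.status !== "PENDING" && job.status !== "RUNNING") {
      return job; // "DONE" or any other final status
    }
    // wait before the next attempt, but not after the last one
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job still in progress after ${maxAttempts} attempts`);
}

// Usage with a stub that reports RUNNING twice and then DONE
let stubCalls = 0;
const stubGetStatus = async (): Promise<JobStatus> => ({
  status: ++stubCalls < 3 ? "RUNNING" : "DONE",
});

pollUntilFinished(stubGetStatus, 10).then((job) => console.log(job.status, stubCalls));
```

In the controller, the stub would be replaced by a call such as () => this._getStatus(), started right after the upload succeeds.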

 

Closing


In this blog post, I have demonstrated how to upload a document to Document Information Extraction service using UI5. I hope this post will help you implement your own scenarios, such as using custom document types.


