Earlier this year, the “Scene Text” service was made available in AI Business Services and was part of the Q2 AI in Business Technology Platform (BTP) release highlights. I finally got some time to explore how the service works. This blog post serves as a quick guide for anyone wishing to set it up on their BTP subaccount. The developer guide for this service is not out yet, so a few parts took me some trial and error to figure out. If you are out exploring today, I hope this saves you that time. The material I found handy is referenced at the very bottom of the post.
SAP AI Business Services provides pre-trained machine learning models tailored for business scenarios. The Document Information Extraction service, or DOX as it is typically called, uses machine learning to process a wide range of document types. SAP AI Business Services have had Large Language Models (LLMs) under the hood for many years, long before the term became all the rage. Last month we announced the DOX Premium Edition, which includes the latest in Generative AI to jump past the need for annotations and even training.
In Q2 this year, we added the ability to extract text not just from documents like PDFs, but also from images. The setup is similar to that of other DOX models, as described in the tutorial. The key difference is that, depending on whether the text you wish to extract is in an image or not, you can choose between two types of OCR engine: “Document” or “Scene Text”.
i) Extracting Container Seal IDs on freight containers
ii) Extracting number plates from vehicles (detailed blog from a previous use case where a custom model was set up to work in collaboration with SAP Yard Logistics)
iii) Extracting digital meter readings for Utilities (detailed blog from a previous use case where a custom model was set up to work in collaboration with SAP S/4HANA Utilities)
DOX Application / Subscription
DOX Instance
Note: I found this a little confusing when I first started out, although it may seem obvious to the initiated. When you set up the entitlements for Document Information Extraction, you will find the following service plans. Scene Text can only be set up with the blocks_of_100 instance plan. For this blog post I also set up the application plan, because I use the DOX UI application, but you can do without it if you prefer to do everything with API calls.
# | Name | Type | Description |
---|---|---|---|
1 | default | Instance | Service plan intended for personal exploration |
2 | blocks_of_100 | Instance | Service plan intended for productive usage |
3 | default (Application) | Application | Service plan intended for GUI based usage |
Select Schema Configuration
Select instance
Create Schema
Enter Schema Details
Activate Schema
Create Template
Enter details for your template, linking the schema you created in the previous step.
Enter template details
Click activate to start using the template.
Activate template
Add document
Select image
Confirm image
Image ready (see use cases section for a closer view of picture)
<URL from BTP service key> + '/document-information-extraction/v1/document/jobs/' + <job ID per the DOX UI Application> + '/pages/text'
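For anyone scripting this instead of using an API client, the concatenation above can be sketched in Python. This is a minimal sketch, not an official client: the function names are my own, and sending the token as a `Bearer` header is an assumption based on the usual OAuth set up for BTP services. The request is only prepared here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_pages_text_url(service_url: str, job_id: str) -> str:
    """Join the service URL and the job ID into the /pages/text endpoint."""
    base = service_url.rstrip("/")  # avoid a double slash if the key URL ends in "/"
    return (
        f"{base}/document-information-extraction/v1/document/jobs/"
        f"{quote(job_id, safe='')}/pages/text"
    )


def build_pages_text_request(service_url: str, job_id: str, access_token: str) -> Request:
    """Prepare (but do not send) a GET request for the extracted text.

    Assumption: the service accepts an OAuth access token as a Bearer header.
    """
    return Request(
        build_pages_text_url(service_url, job_id),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

To actually make the call, the prepared request can be passed to `urllib.request.urlopen`, using the URL from your own service key and the job ID shown in the DOX UI application.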
Set up Authorisation
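If you would rather fetch the token in code, the authorisation step can be sketched as an OAuth client-credentials request. This is a rough sketch under assumptions: that your service key exposes a UAA URL, a client ID and a client secret (as BTP service keys commonly do), and that the token endpoint is `/oauth/token` on that UAA URL. The function name and parameter names are hypothetical, so check them against your own key.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(uaa_url: str, client_id: str, client_secret: str) -> Request:
    """Prepare (but do not send) a client-credentials token request.

    Assumption: the token endpoint is <uaa_url>/oauth/token and accepts
    HTTP Basic authentication with the client ID and secret.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return Request(
        uaa_url.rstrip("/") + "/oauth/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` should return a JSON body whose `access_token` field (the standard OAuth 2.0 name) is the value to use when calling the service.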
Call response
Blog: What’s new - AI in BTP Q2
Help Doc: DOX Set up with Schema
Tutorial: Extract fields from documents