Application Development and Automation Discussions
Join the discussions or start your own on all things application development, including tools and APIs, programming models, and keeping your skills sharp.

May Developer Challenge - SAP AI Services

noravonthenen
Developer Advocate
13,217

!!! THIS CHALLENGE IS CLOSED !!!

CHECK OUT WEEK 2 OF THIS CHALLENGE

CHECK OUT WEEK 3 OF THIS CHALLENGE

CHECK OUT WEEK 4 OF THIS CHALLENGE

CHECK OUT WEEK 5 OF THIS CHALLENGE

Welcome to week 1 of the May Developer Challenge on AI at SAP! This month’s challenge covers two SAP AI Services: Document Information Extraction and Data Attribute Recommendation. To participate, post a screenshot of your solution as a reply in the discussion for the corresponding week.

SAP AI Services help you implement custom use cases by providing powerful algorithms specifically tailored to business problems.

Document Information Extraction:

  • The Document Information Extraction service is available in two editions: the original Base Edition and the new genAI-based Premium Edition. The Premium Edition uses a large language model via generative AI hub on SAP AI Core to extract information from all kinds of documents.
  • With Document Information Extraction you can extract information from PDF files and from single-page JPEG, PNG and TIFF files.
  • Supported document types are: invoice, paymentAdvice, purchaseOrder, businessCard, deliveryNote, resume and birthCertificate. You can also create your own schema to process other document types.
  • You can also extract OCR results directly to process the raw text from your document files, and use the classification capabilities to sort your documents into three classes: invoice, purchase order and payment advice.
  • You can also enrich your extracted data with your metadata.
  • You can access the Document Information Extraction service via the UI, via swagger/client calls and the Python SDK.
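
Before uploading anything, the file types and document types listed above can be checked locally. The following is only a minimal sketch and not part of the service or its SDK: `check_upload` is a hypothetical helper, and treating `.jpg` and `.tif` as equivalents of JPEG and TIFF is an assumption.

```python
from pathlib import Path
from typing import List

# File types and standard document types as listed in the post above.
# ".jpg" and ".tif" are assumed aliases for JPEG and TIFF.
SUPPORTED_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".png", ".tiff", ".tif"}
STANDARD_DOCUMENT_TYPES = {
    "invoice", "paymentAdvice", "purchaseOrder", "businessCard",
    "deliveryNote", "resume", "birthCertificate",
}


def check_upload(filename: str, document_type: str) -> List[str]:
    """Hypothetical pre-upload check; returns a list of problems (empty if none)."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if document_type not in STANDARD_DOCUMENT_TYPES:
        problems.append(
            f"'{document_type}' is not a standard document type; "
            "a custom schema is needed"
        )
    return problems
```

For a recipe, the second check would always report that a custom schema is needed, which matches the week 1 task below.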

Data Attribute Recommendation:

  • With Data Attribute Recommendation you can train your own model to classify data records. You can also tackle more complex classification problems, such as hierarchical classification of products, and predict missing data records.
  • Data Attribute Recommendation can be used via swagger/client calls as well as the AI API Python SDK and SAP AI Launchpad.
  • If you want to access Data Attribute Recommendation via Postman, you can download this Postman Collection.

Weekly Challenges

Week 1 Challenge – DOX UI

This week you will use the UI of the Document Information Extraction service to extract information from your favorite recipe. The UI is a good way to try out your use case and get a feel for the capabilities of the service. For productive use cases you would call the APIs or implement a workflow using the Python SDK. For example, such a workflow could process documents straight out of your mailbox, save the extracted information in the system and structure you need, and trigger any follow-up workflows.

For this week’s challenge, use the UI to extract the header fields “recipe name” and “portions” and the line item fields “quantity” and “ingredient” from your chosen recipe. To do this, you need to create a custom schema. Make sure the recipe is in one of the supported languages.

When creating a custom schema, choose the Setup Type auto to use the LLM/genAI-based Premium Edition. In the description field, tell the large language model what you are referring to, e.g. “the name of the recipe”.
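
To plan the schema before clicking through the UI, it can help to write the fields down as plain data. This is only a sketch of the idea: the classes, the field names such as `recipeName`, the data types and all descriptions except “the name of the recipe” are assumptions for illustration, not the service's actual schema format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaField:
    name: str
    data_type: str    # assumed type label, e.g. "string" or "number"
    description: str  # the hint the large language model gets for this field


@dataclass
class RecipeSchema:
    header_fields: List[SchemaField] = field(default_factory=list)
    line_item_fields: List[SchemaField] = field(default_factory=list)


# Hypothetical plan for the week 1 recipe schema.
recipe_schema = RecipeSchema(
    header_fields=[
        SchemaField("recipeName", "string", "the name of the recipe"),
        SchemaField("portions", "number", "the number of portions the recipe yields"),
    ],
    line_item_fields=[
        SchemaField("quantity", "string", "the amount of the ingredient, including its unit"),
        SchemaField("ingredient", "string", "the name of the ingredient"),
    ],
)


def missing_descriptions(schema: RecipeSchema) -> List[str]:
    """Return the names of fields that have no description for the LLM."""
    all_fields = schema.header_fields + schema.line_item_fields
    return [f.name for f in all_fields if not f.description.strip()]
```

Calling `missing_descriptions(recipe_schema)` returns an empty list here, since every field carries a description.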

noravonthenen_0-1714546116599.png

  1. Get a free trial account and run DOX booster: https://developers.sap.com/tutorials/cp-aibus-dox-booster-key.html
  2. Get the Document Information Extraction UI: https://developers.sap.com/tutorials/cp-aibus-dox-ui-sub.html
  3. Create a custom schema: https://developers.sap.com/tutorials/cp-aibus-dox-ui-gen-ai.html
  4. OPTIONAL: Create a template and add your document to the template (improves performance for future recipes)
  5. Upload your favorite recipe to extract the name, portions, quantity and ingredients. Make sure your recipe PDF is only 1 or 2 pages long, otherwise you will quickly reach the limit (50 pages) of the trial plan. Try not to use the entire 50-page quota, because we will need it next week as well!
  6. Submission: share a screenshot of the extraction results and the document and write a comment to share your experience using the UI in the discussion below.
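
To keep an eye on the 50-page trial limit from step 5, the pages used so far can be tallied with a few lines. This is a rough sketch that assumes every uploaded page counts once against the quota; check the free tier and trial limits linked below for the exact rules. `pages_left` is a hypothetical helper.

```python
TRIAL_PAGE_LIMIT = 50  # trial plan limit mentioned in step 5


def pages_left(uploaded_page_counts, limit=TRIAL_PAGE_LIMIT):
    """Return how many pages remain, assuming each uploaded page counts once."""
    used = sum(uploaded_page_counts)
    return max(limit - used, 0)


# Three uploads of 2, 1 and 2 pages leave 45 pages for the coming weeks.
print(pages_left([2, 1, 2]))
```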

Example Screenshot:

noravonthenen_1-1714546116619.png

Additional information:

Processing a ©Pokémon Card in 90 seconds with Document Information Extraction powered by generative AI: https://community.sap.com/t5/technology-blogs-by-sap/processing-a-pok%C3%A9mon-card-in-90-seconds-wi...

Be aware of limits that apply in free tier and trial accounts: https://help.sap.com/docs/document-information-extraction/document-information-extraction/free-tier-...

How to improve your results: https://help.sap.com/docs/document-information-extraction/document-information-extraction/best-pract...

In this “2-min of” video I describe the technical aspects of the Base Edition of the service (which does not use an LLM) behind the scenes.

48 REPLIES
Read only

geek
Participant
12,010

Some positive results:

geek_0-1714578966010.png

Some less so:

geek_1-1714579036201.png

Read only

PieterB
Participant
12,003

Here is my result:

PieterB_0-1714581860857.png

Not yet a 100% correct result, but looking forward to the next challenges to learn more about the AI services

Read only

satya-dev
Participant
11,893

Read restaurant name and address from image

satyadev_0-1714632948515.png

 

Read only

M-K
Active Participant
11,816

Here is my result:

recipe.jpg

Interestingly, some of the ingredients were highlighted in the instructional text and not in the list; however, they were all correct.

 

Read only

Vitaliy-R
Developer Advocate
0 Kudos
7,184

I got the same: if some ingredient was mentioned earlier in the text, then it would be bound-boxed there, but still matched with the line item listing the same ingredient's quantity.

Read only

Alpesa1990
Participant
11,778

My submission.

Alpesa1990_0-1714666469893.png

In my case (Spanish language), the AI couldn't separate the quantity and the ingredients... But it's so close...

 

Read only

IanStubbings
Active Participant
11,680

My recipe. All good.

 

genai-dev-challenge.png

Read only

jasperdebie
Explorer
11,549

Interesting to see how it extracts data with the minimum of information:

jasperdebie_0-1714729782802.png

There are some small mistakes, such as the quantity being merged into the ingredient field instead of being placed in the quantity field, even after changing the type of Quantity. The highlights are almost fully correct.

Read only

Ruthiel
Product and Topic Expert
11,504

Hello @noravonthenen!

Thanks for this wonderful content!

I am mesmerised by this tool and the results of it!

Ruthiel_0-1714748181676.png

  • The unit of the time-related field surprised me: I had minutes and hours in the recipe, yet all values were correctly converted to minutes.
  • I could distinguish the main ingredients and the quantities independently of the unit of measure for each line item!
Read only

Vitaliy-R
Developer Advocate
0 Kudos
7,245

It is interesting to see how the process translated 6h 5m into `365` value of the total time.
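
The arithmetic behind that value is 6 × 60 + 5 = 365 minutes. As a small illustration of the same conversion (not how the service does it internally, which is not described in this thread), a hypothetical helper `to_minutes` could look like this:

```python
import re


def to_minutes(duration: str) -> int:
    """Convert a duration such as '6h 5m' into total minutes."""
    hours = re.search(r"(\d+)\s*h", duration)
    minutes = re.search(r"(\d+)\s*m", duration)
    total = 0
    if hours:
        total += int(hours.group(1)) * 60
    if minutes:
        total += int(minutes.group(1))
    return total


print(to_minutes("6h 5m"))  # 6 * 60 + 5 = 365
```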

Read only

johna69
Product and Topic Expert
11,429

Is it May, Mai or M-AI challenge 😉

Nearly right:

 

Screenshot 2024-05-03 at 2.07.03 PM.png

Read only

10,862

@johna69 LOVE the M-AI challenge comment 😄 

Read only

narendran_nv
Explorer
11,290

Not bad, though on the first attempt it wasn't able to identify any of the ingredients in the document. I then marked them explicitly (only the first 5), and on my second run of the same document it tried to map exactly those lines.

narendran_nv_0-1714790861585.png

 

Read only

0 Kudos
7,288

Out of curiosity, what does your schema look like? I am curious why it is using decimal points in your results, as I do not think I've seen them in anyone else's results.

Read only

gphadnis2000
Participant
11,243

Interesting how Document Information Extraction reads data with minimal effort.

gphadnis2000_0-1714803776196.png

 

Read only

thomas_mller13
Participant
0 Kudos
11,222

Is this AI service using LayoutLM algorithms? In the context of a specific business application, e.g. incoming invoices or delivery notes for a single company, a large language model may be overkill: much more specific information is available about these documents, and they form a very small subset of all documents. A lot of that specific information is not used. What AI model would you suggest in such a case?

Read only

10,863

Hi @thomas_mller13, no this service does not use LayoutLM but there are other algorithms based on layout that are being used. Here is a description of the underlying algorithms of the base edition: this “2-min of” video. The premium edition uses GPT in the background to determine all kinds of other values.

Read only

Sabarinathan_m
Explorer
0 Kudos
11,160

Hi, 

If the document / image has text in it, then the data is extracted. That is working fine.
Can the ingredient or quantity not be determined from a picture that doesn't have any text?
full-sliced-fruits-bread-high-quality.jpg

Thanks

Read only

10,886

Hi @Sabarinathan_m, yes, the service we are using is for extracting text from documents (PDF or images) and identifying what the text is. In our example, that is the title and the ingredients. In a business context it could be an order number, customer, phone number, email and address, line items, total amount, currency and so on. Therefore, feeding in an image without any text does not work with this service. What you are suggesting would be an image recognition and object detection task.

Read only

moh_ali_square
Participant
10,894

Hi, 

I got nice results: the title of the recipe and the ingredients.

SAP_GEN_AI_DOX_food_recipe.png

Read only

Venkat_Vyza
Active Participant
10,649

Thank you for starting this AI Challenge  @noravonthenen 

 

Document Extraction.png

Read only

Nagarajan-K
Explorer
10,601

@noravonthenen - Thanks for the challenge.

Here are my results. Pretty awesome. The Qty field did not detect the 1/2 and 1/4 in the image, but the rest was good.

NagarajanK_0-1715040707234.png

I tried editing the results to make the model learn, but I believe it was not able to detect this field. I then tried converting Quantity to String; it still did not detect the quantity, and both quantity and ingredient were extracted into the ingredient field.
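
When quantity and ingredient end up merged in one field, one possible workaround is to split them after extraction. The sketch below is an assumption-laden illustration, not a feature of the service: `split_line` is a hypothetical helper that only handles a leading whole number, decimal, fraction or mixed fraction such as 1/2 or 1 1/4.

```python
import re
from fractions import Fraction

# Leading quantity: mixed fraction ("1 1/4"), plain fraction ("1/2"),
# or whole/decimal number ("2", "0.25"), followed by the rest of the line.
_QUANTITY = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*(.*)$")


def split_line(line):
    """Split a merged value into (quantity, rest); quantity is None if absent."""
    match = _QUANTITY.match(line)
    if not match:
        return None, line.strip()
    raw, rest = match.groups()
    # "1 1/4" becomes Fraction(1) + Fraction(1, 4).
    quantity = sum(Fraction(part) for part in raw.split())
    return quantity, rest.strip()


print(split_line("1/2 cup sugar"))
print(split_line("1 1/4 cups flour"))
```

Units stay with the ingredient text in this sketch; separating them would need a list of known units, which depends on the recipe language.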

NagarajanK_1-1715040906621.png

 

 

Read only

Hira
Participant
10,423

Hi @noravonthenen ,

I tried my all-time favorite recipe and was able to read all ingredients successfully. I tried to read the steps as well, but with line items the system gets confused.

Can we create sections above the line items to differentiate the data?

Hira_0-1715069392153.png

 

Read only

RAHUL1221
Explorer
10,321

Hey @noravonthenen, thank you for organizing this, it is really great stuff. Most importantly, it is really simple to use. I still remember having to write many lines of Python code to get this done. Can't wait to see more such simple-to-use tools.

______________________________________________________
1. Document that is uploaded for DOX(Homemade pizza).

Homemade-Pizza-Ingredients.jpg

 ___________________________________________________________
2. Image of result.

RAHUL1221_0-1715080542029.png

___________________________________________________
Learnings - It produced the best results when the wording in the image is simple and short; it was able to recognize all my items.

____________________________________________________
- RAHUL1221

Read only

Sai_Nithesh_G
Active Participant
10,232

Hi, Here are my results

sainithesh21_0-1715091167526.png

 

Read only

JoshuaLaw
Explorer
10,184

It worked quite well! Screenshot 2024-05-07 180106.png

 

Read only

CameronWilson
Explorer
10,091

Great toolset to use, especially for when more complex tasks are issued. Great documentation and easy to read. Will definitely use this for future projects.

My submission 

Cameron Wilson May Generative AI Recipe.png

Read only

Bharathi_K
Explorer
9,909

Clearly my recipe didn't have the item, quantity, uom separated. So, everything is taken into ingredient, which is cool.

Bharathi_K_0-1715159839562.png

 

Read only

Jordi_C
Explorer
Read only

Salma_M
Newcomer
9,744

Hi, here is my result.

Salma_M_0-1715183497707.png

I tried my best but did not get a 100% correct result. Looking forward to the next challenges to learn more about the AI services.

 

 

Read only

xavisanse
Active Participant
9,655

Last but not least 🙂 sorry for the delay! I'm a little bit disillusioned with the results. I first tried a Thermomix book without any results. I thought that maybe the engine is not as accurate with Spanish as with English, so I looked for a book of recipes on the internet. I uploaded 4 of them to a new schema and put them in a template, and the only improvement I have seen is that from the second template on it learned the allergens. I'm pretty sure that more conventional formats will improve the result a lot. But maybe I expected a little bit more.

xavisanse_0-1715198769541.png

 

Read only

0 Kudos
7,950

Hi @xavisanse, have you opened the line items and checked the result in there? On the screenshot it looks like it detected the ingredients.

Read only

MioYasutake
Active Contributor
9,412

My submission for week 1.

MioYasutake_0-1715288504450.png

 

Read only

emiliocampo
Explorer
8,345

In my case, the same thing happened as with @Alpesa1990 . I have entered a recipe in Spanish and the service doesn't differentiate well between the ingredient and the quantity.

emiliocampo_0-1715509586435.png

 

Read only

martaseq
Associate
8,191

I got an almost perfect result!

DocInfoExtraction-BrownieRecipe.png

The only information I was unable to extract was the temperature for preheating the oven, which is at the beginning of the instructions. Maybe because I put it as a header field? Maybe because my description was not complete enough?

Anyway, it is a spectacular tool nonetheless!

Read only

acmebcn
Participant
7,866

Is it too late to join this challenge? 😊

 

Read only

0 Kudos
7,117

Definitely not too late! Everyone is always welcome!