
May Developer Challenge - SAP AI Services

noravonthenen
Developer Advocate

!!! THIS CHALLENGE IS CLOSED !!!

CHECK OUT WEEK 2 OF THIS CHALLENGE

CHECK OUT WEEK 3 OF THIS CHALLENGE

CHECK OUT WEEK 4 OF THIS CHALLENGE

CHECK OUT WEEK 5 OF THIS CHALLENGE

Welcome to week 1 of the May Developer Challenge on AI at SAP! The topics of this month’s challenge are two SAP AI Services: Document Information Extraction and Data Attribute Recommendation. To participate, post a screenshot of your solution as a reply in the discussion of the corresponding week.

SAP AI Services help you implement custom use cases by providing powerful algorithms specifically tailored to business problems.

Document Information Extraction:

  • The Document Information Extraction service is available in two editions: the original Base Edition and the new genAI-based Premium Edition. The Premium Edition uses a large language model via the generative AI hub on SAP AI Core to extract information from all kinds of documents.
  • With Document Information Extraction you can extract information from PDF files and from single-page JPEG, PNG and TIFF files.
  • Supported document types are: invoice, paymentAdvice, purchaseOrder, businessCard, deliveryNote, resume and birthCertificate. You can also create your own schema to process other document types.
  • You can also extract OCR results directly to process the raw text from your document files, and use the classification capabilities to classify your documents into three classes: invoice, purchase order and payment advice.
  • You can also enrich your extracted data with your metadata.
  • You can access the Document Information Extraction service via the UI, via Swagger/client calls and via the Python SDK.

Data Attribute Recommendation:

  • With Data Attribute Recommendation you can train your own model to classify data records. You can also tackle more complex classification problems, such as hierarchical classification of products, and predict missing values in data records.
  • Data Attribute Recommendation can be used via Swagger/client calls as well as the AI API Python SDK and SAP AI Launchpad.
  • If you want to access Data Attribute Recommendation via Postman, you can download this Postman Collection.

Weekly Challenges

Week 1 Challenge – DOX UI

This week you will use the UI of the Document Information Extraction service to extract information from your favorite recipe. The UI is great for trying out your use case and getting a feeling for the capabilities of the service. For productive use cases you would call the APIs or implement a workflow using the Python SDK. In production you could, for example, implement a workflow that processes documents straight out of your mailbox, saves the extracted information in the system and structure you need, and triggers other necessary workflows.

For this week’s challenge, use the UI to extract the header fields “recipe name” and “portions” and the line items “quantity” and “ingredient” from your chosen recipe. To do this, you need to create a custom schema. Make sure the recipe is in one of the supported languages.

When creating a custom schema, choose the Setup Type auto to use the LLM/genAI-based Premium Edition. In the description field, provide information that helps the large language model understand what you are referring to, e.g. “the name of the recipe”.
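Before clicking through the UI, it can help to write down the fields and their descriptions. The following is only a planning sketch in plain Python: the class names, field names, data types and the `field_names` helper are assumptions made up for illustration, and this is not the format the service or its SDK uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SchemaField:
    """One field to enter in the schema UI (hypothetical planning structure)."""
    name: str
    description: str  # text that tells the LLM what the field refers to
    data_type: str = "string"


@dataclass
class RecipeSchema:
    """Header fields appear once per document, line item fields once per row."""
    header_fields: List[SchemaField] = field(default_factory=list)
    line_item_fields: List[SchemaField] = field(default_factory=list)


recipe_schema = RecipeSchema(
    header_fields=[
        SchemaField("recipe_name", "the name of the recipe"),
        SchemaField("portions", "the number of portions the recipe yields", "number"),
    ],
    line_item_fields=[
        SchemaField("quantity", "the amount of the ingredient, including its unit"),
        SchemaField("ingredient", "the name of the ingredient"),
    ],
)


def field_names(schema: RecipeSchema) -> Dict[str, List[str]]:
    """Return the planned field names grouped by header and line items."""
    return {
        "header": [f.name for f in schema.header_fields],
        "line_items": [f.name for f in schema.line_item_fields],
    }


print(field_names(recipe_schema))
```

Writing the descriptions out like this makes it easier to spot vague ones before you type them into the description field.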

noravonthenen_0-1714546116599.png

  1. Get a free trial account and run DOX booster: https://developers.sap.com/tutorials/cp-aibus-dox-booster-key.html
  2. Get the Document Information Extraction UI: https://developers.sap.com/tutorials/cp-aibus-dox-ui-sub.html
  3. Create a custom schema: https://developers.sap.com/tutorials/cp-aibus-dox-ui-gen-ai.html
  4. OPTIONAL: Create a template and add your document to the template (improves performance for future recipes)
  5. Upload your favorite recipe to extract the name, portions, quantity and ingredients. Make sure your recipe PDF is only 1 or 2 pages long, otherwise you will quickly reach the limit (50 pages) of the trial plan. And try not to use the entire 50-page quota, because we will need it next week as well!
  6. Submission: share a screenshot of the extraction results and the document and write a comment to share your experience using the UI in the discussion below.

Example Screenshot:

noravonthenen_1-1714546116619.png

Additional information:

Processing a ©Pokémon Card in 90 seconds with Document Information Extraction powered by generative AI: https://community.sap.com/t5/technology-blogs-by-sap/processing-a-pok%C3%A9mon-card-in-90-seconds-wi...

Be aware of limits that apply in free tier and trial accounts: https://help.sap.com/docs/document-information-extraction/document-information-extraction/free-tier-...

How to improve your results: https://help.sap.com/docs/document-information-extraction/document-information-extraction/best-pract...

In this “2-min of” video I describe the technical aspects of the Base Edition of the service (which does not use an LLM) behind the scenes.

48 REPLIES

geek
Participant

Some positive results:

geek_0-1714578966010.png

Some less so:

geek_1-1714579036201.png

PieterB
Explorer

Here is my result:

PieterB_0-1714581860857.png

Not yet a 100% correct result, but I'm looking forward to the next challenges to learn more about the AI services.

satya-dev
Participant

Read restaurant name and address from image

satyadev_0-1714632948515.png

 

M-K
Explorer

Here is my result:

recipe.jpg

Interestingly some of the ingredients were highlighted in the instructional text and not in the list, however they were all correct.

 

Vitaliy-R
Developer Advocate

I got the same: if some ingredient was mentioned earlier in the text, then it would be bound-boxed there, but still matched with the line item listing the same ingredient's quantity.

Alpesa1990
Participant

My submission.

Alpesa1990_0-1714666469893.png

In my case (Spanish language), the AI couldn't separate the quantity and the ingredients... But it's so close...

 

IanStubbings
Active Participant

jasperdebie
Explorer

Interesting to see how it extracts data with the minimum of information:

jasperdebie_0-1714729782802.png

There were some small mistakes: the quantity was not placed separately in the quantity field but merged into the ingredients field, even after changing the type of Quantity. The highlights were almost fully correct.

Ruthiel
Product and Topic Expert

Hello @noravonthenen!

Thanks for this wonderful content!

I am mesmerised by this tool and the results of it!

Ruthiel_0-1714748181676.png

  • The unit of the time-related fields surprised me: I had minutes and hours in the recipe, but all the values were correctly converted to minutes.
  • I could distinguish the main ingredients and the quantities independently of the unit of measure for each line item!

Vitaliy-R
Developer Advocate

It is interesting to see how the process translated 6h 5m into the value `365` for the total time.
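The arithmetic behind that value is 6 × 60 + 5 = 365 minutes. A minimal sketch of the same conversion, assuming durations written like `6h 5m` (the function name `duration_to_minutes` and the accepted format are assumptions for illustration and say nothing about how the service does it internally):

```python
import re


def duration_to_minutes(text: str) -> int:
    """Convert a duration such as '6h 5m', '2h' or '45m' to whole minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


print(duration_to_minutes("6h 5m"))  # 365
```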

johna69
Product and Topic Expert

Is it May, Mai or M-AI challenge 😉

Nearly right:

 

Screenshot 2024-05-03 at 2.07.03 PM.png

@johna69 LOVE the M-AI challenge comment 😄 

narendran_nv
Explorer

Not bad though. On the first attempt it wasn't able to identify any of the ingredients from the document. But I marked them explicitly (only the first 5), and on my second run of the same document it tried to map those exact same lines.

narendran_nv_0-1714790861585.png

 


Out of curiosity, what does your schema look like? I am curious why it is using decimal points in your results, as I do not think I've seen them in anyone else's results.

gphadnis2000
Participant

Interesting how Document Information Extraction reads data with minimal effort.

gphadnis2000_0-1714803776196.png

 

thomas_mller13
Participant

Is this AI service using LayoutLM algorithms? In the context of a specific business application, e.g. incoming invoices or delivery notes for a single company, a large language model may be overkill: much more specific information is available about these documents, and they make up a very small subset of all documents. A lot of that specific information is not used. What AI model would you suggest in such a case?

Hi @thomas_mller13, no, this service does not use LayoutLM, but other layout-based algorithms are being used. The underlying algorithms of the base edition are described in this “2-min of” video. The premium edition uses GPT in the background to determine all kinds of other values.

Sabarim_07
Explorer

Hi, 

If the document / image has text in it, then the data is extracted. That works fine.
Can the ingredients or quantities not be determined from a picture that doesn't contain any text?
full-sliced-fruits-bread-high-quality.jpg

Thanks

Hi @Sabarim_07, yes, the service we are using is for extracting text from documents (PDF or images) and identifying what the text is: the title and ingredients in our example. In a business context that could be an order number, customer, phone number, email, address, line items, total amount, currency and so on. Therefore, feeding in only an image without text does not work with this service. What you are suggesting would be an image recognition and object detection task.

moh_ali_square
Participant

Hi, 

I got nice results: the title of the recipe and the ingredients.

SAP_GEN_AI_DOX_food_recipe.png

Venkat_Vyza
Active Participant

Thank you for starting this AI Challenge  @noravonthenen 

 

Document Extraction.png

Nagarajan-K
Explorer

@noravonthenen - Thanks for the challenge.

Here are my results. Pretty awesome. The Qty field did not detect the 1/2 and 1/4 in the image, but the rest was good.

NagarajanK_0-1715040707234.png

I tried editing to make the model learn, but I believe it was not able to detect this field. I then tried converting the Quantity to String: it still did not detect the qty, and both qty and ingredient were extracted into the ingredient field.

NagarajanK_1-1715040906621.png
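When quantity and ingredient end up merged in one field, as several replies in this thread report, one workaround is to split the string after extraction. This is a minimal sketch that assumes the merged value starts with a number such as `1/2`, `1 1/2`, `2` or `0.25`; the function name `split_quantity` and that input format are assumptions for illustration, and the code is not part of the service or its SDK.

```python
import re
from fractions import Fraction

# Leading quantity: mixed number ("1 1/2"), fraction ("1/2"), or integer/decimal ("2", "0.25").
_QUANTITY = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s+(.*)$")


def split_quantity(line: str):
    """Split a merged 'quantity + ingredient' string into (quantity, remainder).

    Returns (None, stripped line) when the line does not start with a number.
    The unit of measure, if any, stays in the remainder.
    """
    match = _QUANTITY.match(line)
    if match is None:
        return None, line.strip()
    raw, rest = match.groups()
    # A mixed number such as "1 1/2" is the sum of its whitespace-separated parts.
    quantity = sum(Fraction(part) for part in raw.split())
    return quantity, rest.strip()


print(split_quantity("1/2 cup sugar"))
print(split_quantity("1 1/2 cups flour"))
print(split_quantity("salt to taste"))
```

Using `Fraction` keeps values like 1/2 and 1/4 exact instead of turning them into floating-point numbers.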

 

 

Hira
Explorer

Hi @noravonthenen ,

I tried my all-time favorite recipe and was able to read all the ingredients successfully. I tried to read the steps as well, but the system gets confused when they are line items.

Can we create sections above the line items just to differentiate the data?

Hira_0-1715069392153.png

 

RAHUL1221
Explorer

Hey @noravonthenen, thank you for organizing this, it is really great stuff. Most importantly, it is really simple to use; I still remember having to write so many lines of Python code to get this done. Can't wait to see more such simple-to-use tools.

______________________________________________________
1. Document that is uploaded for DOX(Homemade pizza).

Homemade-Pizza-Ingredients.jpg

 ___________________________________________________________
2. Image of result.

RAHUL1221_0-1715080542029.png

___________________________________________________
Learnings: it produced the best result when the wording in the image was simple and short. It was able to recognize all my items.

____________________________________________________
- RAHUL1221

sainithesh21
Active Participant

JoshuaLaw
Explorer

CameronWilson
Explorer

Great toolset to use, especially for when more complex tasks are issued. Great documentation and easy to read. Will definitely use this for future projects.

My submission 

Cameron Wilson May Generative AI Recipe.png

Bharathi_K
Discoverer

Clearly my recipe didn't have the item, quantity, uom separated. So, everything is taken into ingredient, which is cool.

Bharathi_K_0-1715159839562.png

 

Salma_M
Newcomer

Hi, here is my result.

Salma_M_0-1715183497707.png

I tried my best but did not get a 100% correct result. Still, I'm looking forward to the next challenges to learn more about the AI services.

 

 

xavisanse
Active Participant

Last but not least 🙂 sorry for the delay! I'm a little disillusioned with the results. I first tried a Thermomix book without any results. I thought that maybe the engine was not as accurate with Spanish as with English, so I looked for a recipe book on the internet. I uploaded 4 of its recipes with a new schema and put them in a template, and the only improvement I've seen is that from the second template on it learned the allergens. I'm pretty sure that more conventional formats will improve the result a lot. But maybe I expected a little bit more.

xavisanse_0-1715198769541.png

 


Hi @xavisanse, have you opened the line items and checked the result in there? In the screenshot it looks like it detected the ingredients.

MioYasutake
Active Contributor

emiliocampo
Explorer

In my case, the same thing happened as with @Alpesa1990 . I have entered a recipe in Spanish and the service doesn't differentiate well between the ingredient and the quantity.

emiliocampo_0-1715509586435.png

 

martaseq
Associate

I got an almost perfect result!

DocInfoExtraction-BrownieRecipe.png

The only information I was unable to extract was the temperature for preheating the oven, which is at the beginning of the instructions. Maybe because I put it as a header field? Maybe because my description was not complete enough?

Anyway, it is a spectacular tool nonetheless!

acmebcn
Participant

Is it too late to join this challenge? 😊

 


Definitely not too late! Everyone is always welcome!