2024 May 01 7:57 AM - edited 2024 Jun 24 12:58 PM
!!! THIS CHALLENGE IS CLOSED !!!
CHECK OUT WEEK 2 OF THIS CHALLENGE
CHECK OUT WEEK 3 OF THIS CHALLENGE
CHECK OUT WEEK 4 OF THIS CHALLENGE
CHECK OUT WEEK 5 OF THIS CHALLENGE
Welcome to week 1 of the May Developer Challenge on AI at SAP! The topic of this month’s challenge is the SAP AI Services: Document Information Extraction and Data Attribute Recommendation. To participate in the challenge, you just have to post a screenshot of your solution as a reply in the discussion of the corresponding week.
SAP AI Services help you implement custom use cases by providing powerful algorithms specifically tailored to business problems.
Document Information Extraction:
Data Attribute Recommendation:
This week you will use the UI of the Document Information Extraction service to extract information from your favorite recipe. The UI is great for trying out your use case and getting a feeling for the capabilities of the service. For productive use cases you would call the APIs or implement a workflow using the Python SDK. You could then, for example, implement a workflow that processes documents right out of your mailbox, saves the extracted information in the system and structure you need, and triggers other necessary workflows.
For this week’s challenge, use the UI to extract the header fields “recipe name” and “portions” and the line items “quantity” and “ingredient” from your chosen recipe. To do so, you need to create a custom schema. Make sure the recipe is in one of the supported languages.
When creating a custom schema, choose the Setup Type auto to use the LLM/generative-AI-based Premium Edition. In the description field, provide information that helps the large language model understand what you are referring to, e.g. “the name of the recipe”.
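Before typing the schema into the UI, it can help to write it down as plain data. The following is only a minimal sketch in Python: the dictionary layout, the key names, and the `missing_descriptions` helper are illustrative assumptions, not the service's actual schema format or API. It only captures the four fields from this challenge and checks that each one has a description for the large language model.

```python
# Hypothetical, illustrative layout; NOT the service's actual schema format.
RECIPE_SCHEMA = {
    "setup_type": "auto",  # the Setup Type named in the challenge text
    "header_fields": [
        {"name": "recipe_name", "description": "the name of the recipe"},
        {"name": "portions", "description": "the number of portions the recipe yields"},
    ],
    "line_item_fields": [
        {"name": "quantity", "description": "the amount of the ingredient"},
        {"name": "ingredient", "description": "the name of the ingredient"},
    ],
}


def missing_descriptions(schema: dict) -> list:
    """Return the names of all fields that have no (or an empty) description."""
    fields = schema["header_fields"] + schema["line_item_fields"]
    return [f["name"] for f in fields if not f.get("description", "").strip()]


if __name__ == "__main__":
    # An empty list means every field carries a description.
    print(missing_descriptions(RECIPE_SCHEMA))
```

Writing the descriptions down first makes it easier to compare runs when you later reword a description to improve the extraction.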
Example Screenshot:
Processing a ©Pokémon Card in 90 seconds with Document Information Extraction powered by generative AI: https://community.sap.com/t5/technology-blogs-by-sap/processing-a-pok%C3%A9mon-card-in-90-seconds-wi...
Be aware of limits that apply in free tier and trial accounts: https://help.sap.com/docs/document-information-extraction/document-information-extraction/free-tier-...
How to improve your results: https://help.sap.com/docs/document-information-extraction/document-information-extraction/best-pract...
In this “2-min of” video I describe the technical aspects behind the scenes of the base service (without the use of an LLM).
2024 May 01 4:58 PM
Some positive results:
Some less so:
2024 May 01 5:45 PM
Here is my result
Not yet a 100% correct result, but looking forward to the next challenges to learn more about the AI services
2024 May 02 7:57 AM
Read restaurant name and address from image
2024 May 02 4:08 PM - edited 2024 May 02 4:10 PM
2024 May 22 5:05 PM
I got the same: if some ingredient was mentioned earlier in the text, then it would be bound-boxed there, but still matched with the line item listing the same ingredient's quantity.
2024 May 02 5:18 PM
2024 May 02 9:16 PM
My recipe. All good.
2024 May 03 10:55 AM
Interesting to see how it extracts data with the minimum of information:
There were some small mistakes, such as the quantity not being placed separately in the quantity field but merged into the ingredient field, even after changing the type of Quantity. The highlights are almost fully correct.
2024 May 03 4:02 PM
Hello @noravonthenen!
Thanks for this wonderful content!
I am mesmerised by this tool and the results of it!
2024 May 22 5:00 PM
It is interesting to see how the process translated 6h 5m into the value `365` for the total time.
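The value matches a conversion to minutes: 6 × 60 + 5 = 365. How the service arrives at it internally is not something this thread states, so the snippet below is only a sketch of the arithmetic in Python. The function name `duration_to_minutes` and the accepted input format (`<n>h <n>m`) are assumptions made for illustration.

```python
import re


def duration_to_minutes(text: str) -> int:
    """Convert a duration such as '6h 5m', '45m' or '2h' into total minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    # Reject strings that do not fit the pattern or contain neither hours nor minutes.
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


if __name__ == "__main__":
    print(duration_to_minutes("6h 5m"))  # 365
```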
2024 May 03 6:14 PM
2024 May 03 7:08 PM - edited 2024 May 03 9:03 PM
Is it May, Mai or M-AI challenge 😉
Nearly right:
2024 May 06 8:10 AM
@johna69 LOVE the M-AI challenge comment 😄
2024 May 04 3:48 AM
Not bad, though in the first attempt it wasn't able to identify any of the ingredients from the document. I then marked them explicitly (only the first 5), and on my second run of the same document it tried to map those exact same lines.
2024 May 22 4:57 PM
Out of curiosity, what does your schema look like? I am curious why it is using decimal points in your results, as I do not think I've seen them in anyone else's results.
2024 May 04 7:25 AM
2024 May 04 8:46 AM - edited 2024 May 04 9:00 AM
Is this AI service using LayoutLM algorithms? In the context of a specific business application, e.g. incoming invoices or delivery notes for a single company, a large language model is maybe a sort of overkill, since there is so much more specific information available about these documents and they belong to a very small subset of all documents. A lot of specific information is not used. What AI model would you suggest in such a case?
2024 May 06 8:09 AM
Hi @thomas_mller13, no, this service does not use LayoutLM, but other layout-based algorithms are being used. Here is a description of the underlying algorithms of the base edition: this “2-min of” video. The premium edition uses GPT in the background to determine all kinds of other values.
2024 May 07 11:36 AM
Thx
2024 May 04 1:54 PM
Hi,
If the document or image has text in it, then the data is extracted. That is working fine.
Will the ingredient or quantity not be determined from a picture that doesn't have any text?
Thanks
2024 May 06 8:05 AM
Hi @Sabarim_07, yes, the service we are using is for extracting text from documents (PDFs or images) and identifying what the text is: the title and ingredients in our example. In a business context that could be an order number, customer, phone number, email and address, line items, total amount, currency and so on. Therefore, feeding in only an image without text does not work with this service. What you are suggesting would be an image recognition and object detection task.
2024 May 06 8:00 AM
Hi,
I got nice results: the title of the recipe and the ingredients.
2024 May 06 9:40 PM
2024 May 07 1:15 AM
@noravonthenen - Thanks for the challenge.
Here are my results. Pretty awesome. The Qty field did not detect the 1/2 and 1/4 in the image, but the rest was good.
I tried editing to make the model learn, but I believe it was not able to detect this field. I then tried converting Quantity to String; after that it did not detect the qty, but both qty and ingredient were extracted into the ingredient field.
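When the quantity ends up merged into the ingredient field, one possible workaround is to split it off after extraction. The snippet below is a minimal post-processing sketch in Python, not a feature of the service; the function name `split_quantity` and the assumption that the quantity is a leading number, fraction, or mixed number (such as `1/2` or `1 1/2`) are purely illustrative.

```python
import re
from fractions import Fraction

# Leading quantity: mixed number ("1 1/2"), fraction ("1/4"), or decimal/integer ("2", "0.5").
_QUANTITY = re.compile(r"^\s*(\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s+(.*)$")


def split_quantity(line: str):
    """Split a merged value into (quantity, rest); quantity is None if none is found."""
    match = _QUANTITY.match(line)
    if match is None:
        return None, line.strip()
    raw, rest = match.groups()
    # "1 1/2" becomes Fraction(1) + Fraction(1, 2); single values pass through unchanged.
    quantity = sum(Fraction(part) for part in raw.split())
    return float(quantity), rest.strip()


if __name__ == "__main__":
    for value in ["1/2 cup sugar", "1 1/2 cups flour", "2 eggs", "salt"]:
        print(split_quantity(value))
```

Using `Fraction` keeps values such as 1/4 exact until the final conversion to `float`.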
2024 May 07 9:11 AM
Hi @noravonthenen ,
I tried my all-time favorite recipe and was successfully able to read all ingredients. I tried to read the steps as well, but in the case of line items the system gets confused.
Can we create sections above the line items just to differentiate the data?
2024 May 07 12:18 PM
Hey @noravonthenen, thank you for organizing this, as this is really great stuff. Most importantly, this is really simple to use; I still remember having to write so many lines of Python code to get this done. Can't wait to see more such simple-to-use tools.
______________________________________________________
1. Document that is uploaded for DOX(Homemade pizza).
___________________________________________________________
2. Image of result.
___________________________________________________
Learnings - It produces the best result if the wording in the image is simple and short. It was able to recognize all my items.
____________________________________________________
- RAHUL1221
2024 May 07 3:13 PM
Hi, Here are my results
2024 May 07 5:03 PM
It worked quite well!
2024 May 07 10:33 PM
Great toolset to use, especially for when more complex tasks are issued. Great documentation and easy to read. Will definitely use this for future projects.
My submission
2024 May 08 10:18 AM
Clearly my recipe didn't have the item, quantity, uom separated. So, everything is taken into ingredient, which is cool.
2024 May 08 12:09 PM
Done!
2024 May 08 4:52 PM
Hi, here is my result.
I tried my best but did not get a 100% correct result. Still, I am looking forward to the next challenges to learn more about the AI services.
2024 May 08 9:10 PM
Last but not least 🙂 sorry for the delay! I'm a little bit disillusioned with the results. I first tried a Thermomix book without any results. I thought that maybe the engine might not be as accurate with Spanish as with English. So I looked for a book with recipes on the internet. I uploaded 4 of them in a new schema and put them in a template, and the only improvement I've seen is that from the second template on it learned the allergens. I'm pretty sure that more conventional formats will improve the result a lot. But maybe I expected a little bit more.
2024 May 15 8:54 AM
Hi @xavisanse, have you opened the line items and checked the result in there? In the screenshot it looks like it detected the ingredients.
2024 May 09 10:02 PM
My submission for week1.
2024 May 12 11:27 AM
In my case, the same thing happened as with @Alpesa1990. I entered a recipe in Spanish, and the service doesn't differentiate well between the ingredient and the quantity.
2024 May 13 12:18 PM
I got an almost perfect result!
The only information I was unable to extract was the temperature for preheating the oven, which is at the beginning of the instructions. Maybe because I put it as a header field? Maybe because my description was not complete enough?
Anyway, it is a spectacular tool nonetheless!
2024 May 15 5:00 PM
Is it too late to engage in this challenge? 😊
2024 May 23 11:28 AM
Definitely not too late! Everyone is always welcome!