SAP Process Automation: Multiple pages Invoice Data Extraction (Document Template (Training a New Model)) and populate extracted data in Excel Sheet
Introduction: -
In this Blog Post, we will see how SAP Intelligent RPA can help to simplify Invoice data Extraction in some simple steps and get those Extracted Data in an Excel Sheet. I will Extract Data from a PDF that has multiple pages, even though we can use various PDF files in a folder. For this demo, I used a PDF file, which has multiple pages. We will also see, how to split Invoice pages of a PDF and How to Populate Extracted Invoice data in an Excel Sheet.
We need to follow the below steps to create a bot
- Open "Intelligent Robotic Cloud Automation Factory” then click on " Project ", Enter the project name, and create.
Automation Process
- Under the "Artifact" click on the plus icon then click on create then select "Automation" and supply a name for automation.
- Click on the "Setting" icon (Available in the right top corner of the "Intelligent Robotic Process Automation Cloud Studio" Page then Click on Dependencies (If the following dependencies are not available, shown in the below Image then Click on "Add Dependency" and add all dependencies shown below in the Image).
- We will get an empty workflow and then drag and drop the required activities.
- Drag and drop the “Get File Collection " from the Activities and give the Folder path in which the PDF is saved to the input parameters.
- Drag and drop the “For Each” from the Activities. Inside the loop for the first file in the folder, the collection we use is "Split PDF Every X Pages”, It will Split all pages of the first PDF file, and all the split PDF pages will save to the given folder path. Select "Split PDF Every X Pages" and give "inputFilePath", "outputFilepath" and "splitEveryXPages" will be 1.
- inputFilePath: - To give the path of the folder in which the PDF needs to be split is stored.
- outputFilePath: - Provide the path of the folder where the split PDF is going to store.
- Drag and drop the ‘Get File Collection’ and give the path of the folder in which split pdfs are present to the input parameter.
- Drag and Drop "Open Excel Instance" to use all activities related to Excel.
- Drag and Drop " Open Workbook". Once it is opened, use the "Set Active Cell” and give the Path of the Excel sheet.
- Drag and drop "Set active Cell" to set the active cell in the active worksheet. Give the "cellDefinition” and set cell number ‘A1’ as an active cell.
- Drag and drop the "Push Value" activity and set the header Column names which belong to our Extracted Invoice Fields for example "Push Invnum". Repeat this step for all other two fields (invoice date and gross amount).
- Drag and drop "Set active Cell" and set the second row in the first column as the active cell. Give the "cellDefinition" value "A2" (See below Image). It means the next data will populate from the A2 cell.
- Drag and drop the "For Each" activity. In this step, each PDF file will iterate one by one from your split Folder.
The process to create a "Document Template"
- Under the "Artifact" click on the plus icon then click on create, select "Document Template”, and provide a name for automation.
- Give Your Template "Name" and upload one sample Invoice then click on "Next", Select "Invoice" and then Click on "Next".
- Select "Create New" then Write Your Schema "Name".
- Scroll down and select your relevant field in Header Fields Section, for example, "document Number", "document Date", "gross Amount" etc., see in the below two Images. Click on "Add".
- Click "Open in a new tab"
- If "Processing Status" is "DONE", then Click on your Document (here Invoice-7439-combined.pdf) then click on "Annotate".
- Click "Edit" and move the Cursor to Invoice Number then select (here 7439) and select "documentnumber" in the relevant "Field". Repeat this step for all three fields and save.
- Under the "Artifact", Click on your created Template (here my_First_PDF_Template) then click on "Annotate in a new tab" and "Activate". Now your Template is ready to use.
- End of the Process to create "Extract Data (Template)".
Now back to Automation Process
- Drag and Drop "Extract Data (Template)" from activities.
- Select the "Document Schema" that you created.
- Give your "documentPath" (here you need to supply a split PDF Folder path).
- Drag and drop the "Push Values" activity (See the below Image). It will set your extracted value to the relevant Field in the excel sheet Push values activity is used to enter the column headers in the excel file. Set the second row in the first column as an active cell using the "Set active cell" activity. Repeat this step for all other two fields (Invoice date and gross amount).
- Drag and Drop "Get Row to retrieve the row index from a cell address.
- Drag and Drop "Set Active Cell" and provide “Cell Definition”.
- Drag and Drop Log Message to display your Extracted Data in Console Window.
- Drag and drop the "Close workbook" and "Close excel instance" activities to save and close the excel file.
The final automation workflow:
Output: -
- Extracted data will be populated in Excel Sheet.
Link to the running bot:-
https://youtu.be/R27Z7PK3UOE
Conclusion: -
We extracted data from invoices and display it in an excel sheet, using Process Automation features with less coding experience.
Bot building an easy and fast with SAP Process Automation. The new SDKs Activities are easy to use, drag and drop without any coding knowledge. This automation will benefit the roles of Process Executioner, Business Process Lead, and Business Process SME.
The motive of this blog is to inspire customers, RPA developers, and business managers, to consider adopting SAP Process Automation in various automation situations by following easy procedures that supply excellent outcomes.
Hope you enjoyed this article and was helpful to you. Please provide feedback in the comment section and feel free to ask any questions in the SAP Process Automation
Q&A area.
Looking forward to more learning,
robinsh97