TEXT DATA PROCESSING & ANALYTICS

Unstructured Data Processing through Data Services:

Text Data Processing takes unstructured textual data and turns it into something you can analyze and act on.
It helps you deal with information overload by mining very large text corpora and making sense of them without having to read every sentence.
This article covers Text Data Processing in SAP BusinessObjects Data Services, with text analytics as the goal.
The Entity Extraction transform, available as part of Text Data Processing in Data Services, extracts entities, entity relationships and facts from unstructured data for downstream analytics.

Case study:

Vendor information was held in text and email files that mention European countries and cities.
Based on each country name, the country code is identified and matched against LFA1.LAND1 in the Vendor Master table.

File Format:

Create a new File Format with the file type set to “Unstructured text”.
Enter the file name(s) to process. A wildcard such as *.* can also be used in this field.
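
To get a feel for what a pattern like *.* picks up, here is a minimal Python sketch that uses the standard fnmatch module as a stand-in. The file names are made up for illustration, and the wildcard matching in Data Services itself may differ in details such as case sensitivity.

```python
from fnmatch import fnmatchcase


def select_files(names, pattern="*.*"):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory listing, for illustration only.
candidates = ["vendors_uk.txt", "mail_0042.eml", "README", "notes.html"]

matched = select_files(candidates)            # every name that contains a dot
text_only = select_files(candidates, "*.txt")  # a narrower pattern
```

In this sketch *.* skips names without a dot, so a narrower pattern such as *.txt is the safer choice when the folder also holds files that should not be processed.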

Data Flow:

Next, in the data flow, place a Base_EntityExtraction transform after the unstructured file format and link the file format to the transform.

Base Entity Transform:

This transform provides a user-friendly interface with three tabs: Input, Options and Output. It accepts textual input such as plain text, HTML, or XML.

On the Options tab, set the Language value as appropriate (English in my case). Leave the rest of the options as they are.
On the Output tab, select the fields of interest. As a best practice, work with only the fields STANDARD_FORM and TYPE. By default, the output schema of the transform can generate up to 11 fields, and the others can be added if needed.
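
To show why STANDARD_FORM and TYPE are usually enough, here is a small Python sketch that filters extracted rows by entity type. The rows and the type labels COUNTRY and LOCALITY are assumptions made for illustration; check the actual output of your job for the labels your version produces.

```python
# Hypothetical rows shaped like the two selected output fields.
rows = [
    {"STANDARD_FORM": "Germany", "TYPE": "COUNTRY"},
    {"STANDARD_FORM": "Berlin", "TYPE": "LOCALITY"},
    {"STANDARD_FORM": "France", "TYPE": "COUNTRY"},
    {"STANDARD_FORM": "Germany", "TYPE": "COUNTRY"},  # repeated mention
]


def entities_of_type(rows, entity_type):
    """Return the distinct standard forms for one entity type, sorted."""
    return sorted({r["STANDARD_FORM"] for r in rows if r["TYPE"] == entity_type})


countries = entities_of_type(rows, "COUNTRY")
cities = entities_of_type(rows, "LOCALITY")
```

Working from STANDARD_FORM instead of the raw text means that repeated or differently written mentions of the same entity collapse to one value before any lookup.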

Text Analysis - Results:

Next, add a Template Table as the target. Run the job and check the meaningful text data extracted by the transform.

To make the results more meaningful, the next data flow identifies the country codes and matches them against the LFA1 table.
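
The lookup and match can be sketched outside Data Services as follows. This is a rough Python illustration, not the data flow itself: the name-to-code mapping, the sample vendor rows and both helper functions are hypothetical, and it assumes LAND1 holds two-letter country keys such as DE and FR.

```python
# Hypothetical lookup from extracted country names to two-letter country keys.
COUNTRY_CODES = {"germany": "DE", "france": "FR", "italy": "IT"}


def to_country_code(standard_form):
    """Map an extracted country name to its code, or None if unknown."""
    return COUNTRY_CODES.get(standard_form.strip().lower())


def match_vendors(extracted_countries, vendors):
    """Return the vendor rows whose LAND1 matches an extracted country."""
    codes = {to_country_code(name) for name in extracted_countries}
    codes.discard(None)  # drop names that have no known code
    return [v for v in vendors if v["LAND1"] in codes]


# Illustrative rows shaped like LFA1 (LIFNR = vendor number, LAND1 = country key).
vendors = [
    {"LIFNR": "0000100001", "LAND1": "DE"},
    {"LIFNR": "0000100002", "LAND1": "US"},
    {"LIFNR": "0000100003", "LAND1": "FR"},
]

matched = match_vendors(["Germany", "France", "Atlantis"], vendors)
```

Names with no known code are dropped rather than guessed, so an unexpected value in the extracted text cannot produce a false match against the vendor master.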
