Technology Blogs by SAP
Former Member

Configuring Best Record Using Data Services Designer

A data cleansing solution created by Data Cleansing Advisor can be used in a dataflow within Data Services Workbench and Data Services Designer.  The published solution performs both cleansing and matching, but not best record.  If you want best record functionality, there are two options: Data Services Designer and Information Steward’s Match Review tool.  “Match Review with DCA” is the next subject covered in this series.  This section focuses on extending the concepts learned in “Publishing to Data Services Designer” by adding best record functionality to the dataflow that was created.

The data cleansing solution used in the previous example is a simple cleansing and matching dataflow that outputs data to a single target, as depicted below:

Adding Best Record Functionality

The match transform that gets published to Data Services Workbench cannot be modified, other than selecting which output fields to include in the output schema.  Best Record is a post-match process that is usually performed within the same match transform that does the matching.  To add best record functionality to a published data cleansing solution, complete the following steps.

Add a case transform to the dataflow.  The case transform routes matching records (those with a valid integer value for a match group number) to the match transform that will perform the best record calculations.  The unique records (those with a blank match score) are routed to a query transform; because they have no other matching records associated with them, they are already considered best records.
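The routing rule that the case transform applies can be sketched in plain Python. This is only an illustration of the logic, not Data Services syntax: the records are assumed to be dictionaries, the `ID` field is hypothetical, and the sketch tests only MATCH_GROUP_NUMBER (a value that parses as an integer means "matching", anything else means "unique").

```python
def has_match_group(record):
    """Return True when MATCH_GROUP_NUMBER holds a valid integer value."""
    value = record.get("MATCH_GROUP_NUMBER")
    if value is None:
        return False
    try:
        int(str(value).strip())
    except ValueError:
        # Blank or non-numeric values mean the record matched nothing.
        return False
    return True


def route_records(records):
    """Split records into (matching, unique), like the two case outputs."""
    matching, unique = [], []
    for record in records:
        if has_match_group(record):
            matching.append(record)   # goes on to the best record match transform
        else:
            unique.append(record)     # already a best record; goes to the query transform
    return matching, unique


if __name__ == "__main__":
    sample = [
        {"ID": 1, "MATCH_GROUP_NUMBER": 1},
        {"ID": 2, "MATCH_GROUP_NUMBER": ""},
        {"ID": 3, "MATCH_GROUP_NUMBER": 1},
    ]
    matching, unique = route_records(sample)
    print(len(matching), "matching,", len(unique), "unique")
```

In the actual dataflow the same test would be written as case expressions on the transform's output labels.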

The next step is to add a match transform to the dataflow to perform the best record calculations.  The input of this transform should be the matching records that are being routed from the case transform defined earlier.  The image below shows how this best record dataflow should be designed.  A full image of the entire dataflow can be found further below for your reference.

Configuring a match transform purely to do best record calculations is fairly easy if you have Data Services experience.  Examples of the required input fields are displayed below.  MATCH_GROUP_NUMBER is required so that it can be used to form break groups: records are grouped by match group number, and best record is then performed on each of those groups.  The other input fields listed below are used within our best record rules and as the fields that data is posted to when a master or subordinate needs to be updated.
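As a rough illustration of what forming break groups on MATCH_GROUP_NUMBER means, the following Python sketch groups records by that field. It is not how the match transform is implemented; the dictionary-based records and the `ID` field are assumptions made for the example.

```python
from collections import defaultdict


def form_break_groups(records):
    """Group matching records by their MATCH_GROUP_NUMBER value."""
    groups = defaultdict(list)
    for record in records:
        groups[record["MATCH_GROUP_NUMBER"]].append(record)
    # Return a plain dict so missing keys raise instead of creating empty groups.
    return dict(groups)


if __name__ == "__main__":
    matching = [
        {"ID": 1, "MATCH_GROUP_NUMBER": 1},
        {"ID": 3, "MATCH_GROUP_NUMBER": 1},
        {"ID": 7, "MATCH_GROUP_NUMBER": 2},
        {"ID": 9, "MATCH_GROUP_NUMBER": 2},
    ]
    for number, members in form_break_groups(matching).items():
        print(number, [m["ID"] for m in members])
```

Each resulting group is the unit on which the best record rules are evaluated.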

You can use any best record strategy with the Data Cleansing Advisor solution, and it is completely customizable to suit your business requirements.  Once best record is configured, the last step is to determine which output records you want written to your target.  I used a query transform (called “Best_Records”) to keep only records that were either a master record or a unique record.
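To make the idea concrete, here is a minimal Python sketch of one possible best record rule followed by the master-or-unique filter. Everything in it is an assumption for illustration: the first record of each group is treated as the master, the rule simply fills blank master fields from the first subordinate that has a value, and the `ID`, `PHONE`, `EMAIL`, and `IS_MASTER` names are hypothetical. The strategy you configure in the match transform may differ.

```python
def apply_best_record(group, fields):
    """Return a copy of the group's master with blank fields filled in.

    Assumption: the first record in the group is the master.
    """
    master = dict(group[0])
    master["IS_MASTER"] = True  # hypothetical flag, for illustration only
    for field in fields:
        if master.get(field):
            continue  # master already has a value; keep it
        for subordinate in group[1:]:
            if subordinate.get(field):
                master[field] = subordinate[field]  # post value to the master
                break
    return master


def best_records(groups, unique_records, fields):
    """Keep one master per match group plus every unique record."""
    masters = [apply_best_record(group, fields) for group in groups.values()]
    return masters + [dict(record) for record in unique_records]


if __name__ == "__main__":
    groups = {
        1: [
            {"ID": 1, "PHONE": "", "EMAIL": "a@example.com"},
            {"ID": 3, "PHONE": "555-0100", "EMAIL": ""},
        ]
    }
    uniques = [{"ID": 2, "PHONE": "555-0199", "EMAIL": ""}]
    for record in best_records(groups, uniques, ["PHONE", "EMAIL"]):
        print(record)
```

The sketch copies records rather than updating them in place, so the input groups stay unchanged and the subordinates are simply left out of the result.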

Below is the completed best record dataflow using a data cleansing solution published from Information Steward.  The full power and flexibility of Data Services can be used to extend the functionality of a data cleansing solution.

Data Cleansing Advisor Best Practices Blog Series

Determining Duplicates and a Matching Strategy

Publishing to Data Services Designer

Configuring Best Record Using Data Services Designer

Match Review with Data Cleansing Advisor (DCA)

Data Quality Assessment for Party Data

Using Data Cleansing Advisor (DCA) to Estimate Match Review Tasks

Creating a Data Cleansing Solution for Multiple Sources
