Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
cancel
Showing results for 
Search instead for 
Did you mean: 
senthurshree
Associate
Associate
2,753

Introduction

Data must be complete and consistent if it is to provide accurate analytics. If inconsistencies existed in the data, it has always been a tough process to identify the exact erroneous records and to take action.

This new data validation framework provides initial validation rules to check the consistency of your data with the modelling metadata like the view key columns and hierarchy nodes and provides detailed output about the erroneous records in case of inconsistencies.

The goal is subsequently enrich the data validation framework with more validation rules. In the outlook section below, there is link to the roadmap items on the new rules that will get added to this framework. Please add comments on this blog post about other validations that you think might be beneficial for you.

Two Types of Validation Rules have been released.

  1. Key Validation Rules
  2. Hierarchy Validation Rules

Key Validation Rules

The following rules are available in the Keys category:

Rule

Description

Unique Key Values

Each record must have unique primary key values.

For example, if an object has primary key columns ID and Language and a record has primary key values of 1 and EN, then no other record may have this combination of values.

No Null Key Values

No record may have null primary key values.

For example, if an object has primary key columns ID and Language no record may have null values in these columns.

 

Hierarchy Validation Rules

The following rules are available in the Hierarchies category:

Rule

Description

No Multiple Parents

No hierarchy node may have multiple parents.

For example, if a hierarchy node A has a parent node B, then it may not have any other parent.

No Circular Hierarchies

No hierarchy node may have circular parent-child relationships.

For example, if a hierarchy node A has a parent node B, then it may not also have node B as a direct child or other descendant.

 

Running a Data Validation

Key validations are relevant in the realm of view modelling because there they are important during modelling (e.g. when creating associations), but unlike with tables the database does not check that the view data indeed have key properties.

It has always been a difficult process to identify what records cause the inconsistencies in the model. Additionally, not every modeller might have full access to the data. With this new approach CopySQL/ Copy SQLScript introduced as part of the data validation framework, any modeller can see the results of the validation even though they might not be allowed to see all the details of the induvidual rows. 

Even users with limited restrictions are allowed to trigger the validations.

The Data Validation Status button in the view properties panel shows the status of the last data validation, and the Validated On field shows the date and time of the last run.

senthurshree_13-1719499486820.png

 

To validate data (or review the results of a previous validation):

  1. Click the Data Validation Status button as indicated above to open the Data Validation panel. Alternatively, if the Data Preview panel is already open, click the Data Validation tab.

senthurshree_14-1719499733686.png

2. If the panel is empty (or if you want to re-run the validation), click Validate as indicated above.

The time required for data validation depends on the number of records and the complexity of calculations required. You will receive a notification when validation is complete.

3. Review the validation results. There is one result for each rule as shown below,

senthurshree_15-1719499733688.png

 

Reviewing Rules with Errors:

  1. For rules with errors, click the Details link to get further information. The Details dialog explains the rule and lets you copy SQL (or SQLScript) code to generate a list of invalid records.

senthurshree_16-1719499876399.png

2. In the Details dialog, click Copy SQL (or Copy SQLScript) to get the code necessary to generate the full list of records that are not consistent with the rule

senthurshree_17-1719499876400.png

senthurshree_18-1719499876401.png

Viewing Erroneous Records - 'Copy SQL'

Click on 'Copy SQL' and then Create a new SQL view, paste the code into it, and preview the data to see these failing records. Examples below,

a. Unique Key Values - Doing a data preview on the Copied SQL in the New SQL view, shows the records that violate the Unique Key Constraints along with the number of their occurrences.

senthurshree_21-1719500234638.png

b. No Null Values - Doing a data preview on the Copied SQL in the SQL view, shows all the 9 records where NULL values are present in the primary key column 'LeafNode'

senthurshree_22-1719500234640.png

Viewing Erroneous Records - 'Copy SQLScript '

Click on 'Copy SQLScript' and then Create a new SQL view, paste the code into it. Choose the SQLScript (TableFunction) in the view properties and add the columns as well as shown below. Then save and deploy the view and then click on Data Preview.

 

senthurshree_23-1719500365544.png

a. No Multiple Parents - Doing a data preview on the Copied and deployed SQLScript(Table Function) type SQL View, shows the records that have multiple parents.

senthurshree_24-1719500365546.png

You can see that the Tops Child has two Parents Men's Wear and Women's wear which is indicated in the above output.

senthurshree_25-1719500365547.png

b. No Circular Hierarchies - Doing a data preview on the Copied and deployed SQLScript(Table Function) type SQL View, shows the records that have a circular hierarchy relationship

senthurshree_26-1719500365548.png

You can see below the Circular Hierarchy relationship in the dataset as indicated in the above output

senthurshree_27-1719500365550.png

Current Limitations

Currently, some validation rules may be unavailable if:

  1. Your view or any of its sources is protected by a data access control.
  2. Your view or any of its sources contains input parameters.
  3. Any of your view's sources is not replicated to SAP Datasphere.

However, these limitations will be eliminated in the future with updates to the validation framework.

Outlook

As mentioned above, we will be adding more rules/validations to this validation framework. For more details about additional validations coming in Q4 2024, refer to the roadmap item - Additional model validations

  • Referential integrity – for associated entities, system checks on whether for every member in the association source's foreign key there is a respective member in the association target
  • Association consistency – checks to help ensure that associations that use non-key fields still have key-like properties and therefore avoid unintended data duplication

 

Summary

With the data validation framework, you should now be able to validate your metadata & data and correct errors easily based on the information available on the erroneous records.

I would like to thank @MohanambalDuraisamy  for her collaboration in this blog.

 

 

 

 

 

 

 

4 Comments
JulianJuraske
Participant

Hello,

thank you for the detailed explanation!
A nice Feature if it's capabilitys increase and there are no Restricions.

Is it planned to make this Feature (that you did manually) a part of a Taskchain ? 
There is no Point in further increasing the Features of supported Checks if you have to do it manually for every View! No one will ever do that ;).

Any of your view's sources is not replicated to SAP Datasphere.

-> SAP always promotes that the Data has not to persisted again within the DSP.
How come that you release a Feature that does not support the Federate Approach ? 
Is this planned? 

MKreitlein
Active Contributor
0 Kudos

Hello @senthurshree 

thanks for this informative Blog.

One question came to my mind: Why do you have this validation inside a view, and not inside the table itself?

I mean if you have inconsistent data... then it is stored in the table, not the view... so, why placing the "Validate" button there?

Thanks, Martin

0 Kudos

Hi @JulianJuraske,

The validations framework  has been released with a goal of preventing the run time errors and to assist the modeller for an efficient modeling. As informed in the blog, there will be more validations coming up and based on the depth of validation further decisions on including them as part of task chain can be considered. 

Br,

Mona

Hi @MKreitlein ,

At the moment, the main objective of the validation framework is to assist and prevent the modeling run time error and that's why currently the views are considered for validations. But as you mentioned in terms of key and null validations, we can push it to table as well. More information on this will be updated on road map explorer - Additional model validations to be delivered very soon.

Br,

Mona