Former Member


 

Why do you need Data Scrambling?


IT systems often keep sensitive information such as Protected Health Information (PHI), Personally Identifiable Information (PII), and Sensitive Personal Information (SPI). Companies are responsible for adhering to policies that keep them compliant with data protection laws such as GDPR, HIPAA, PIPEDA, and APPI, as well as with ISO 27001, an international standard for security certification that outlines a best-practice framework for managing processes, technology, and people.

It is important to mask sensitive data in non-production environments so that it is not exposed to the users who are authorized to work there, such as developers, solution architects, testers, and others, both internal employees and vendors. These users usually work across the entire landscape that has been refreshed from the production system.

To achieve this goal, there are SAP products that can do data masking, such as SAP TDMS, as well as third-party tools that work at the DB level. Obviously, they involve additional license costs and implementation time.

The solution described here is a custom-built ABAP program that masks the data of table fields directly in the database. Sensitive data can exist in both standard and custom tables, so the list of table fields can vary, but it can be covered with an editable template. The program was developed some years ago and can now be shared as open source. Minimum requirement: NW740.

Test data


For the demo, we can use the well-known tables used in trainings. First, install DST from GitHub using abapGit on one of your sandbox systems. A link to the DST GitHub repository is provided at the end of this post.

Generate some test data for the tables SPFLI, SFLIGHT, and SBOOK using the generator t-code BC_DATA_GEN.

Similarly, we can generate data in the tables STICKET and SNVOICE with the report SFLIGHT_DATA_GEN (run it from SE38).


After the data has been generated, run t-code ZDST.

Template Editor


First, with the help of the template editor, create a template that contains the tables and field names that should be scrambled.

After switching to Change Mode (‘CHANGE / VIEW’ button), let’s add some data. Most probably, in the example below only personal data such as SBOOK-PASSNAME (‘Name of the Passenger’) should be scrambled, but let’s add a few more fields just to extend the demo template.


Double-clicking a Table Name shows the table content that will be used for scrambling. This is handy for checking the data before and after scrambling. This is how SBOOK looks now (table keys plus the fields selected for scrambling).


Please remember that the template can be edited only in sandbox/development systems (client role – C – Customizing).

Keys


What if not all the data of a Table-Field entered in the template should be scrambled? For example, we need to scramble data from the ADR6 and ADRP tables but need to exclude certain groups of persons. For this typical requirement, it is possible to add keys that restrict which records are scrambled.

Going back to our demo template – let’s add keys so that SBOOK-PASSNAME is scrambled only for Business Class passengers (SBOOK-CLASS = ‘C’). To do this, in the Template Editor double-click ‘+’ in the Key column for the required Table/Field line, select ‘Class’, then ‘Apply Selected Items’, add ‘Business Class’ in Dynamic Selection, and save.


Added keys are flagged in the Key column with a key icon.


Double-clicking SBOOK now shows the data of the Business Class passengers that is ready to be scrambled.


There is no limit on the number of tables, fields, and keys that can be added, which makes it possible to create quite complex templates.
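To make the idea of keys more concrete, here is a minimal Python sketch of how a template line with key conditions could select rows. It is only an illustration: the `TemplateEntry` class, its attributes, and the sample rows are hypothetical and say nothing about how DST actually stores or evaluates its template in ABAP.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateEntry:
    """One template line: a table field plus optional key conditions (hypothetical layout)."""
    table: str
    field_name: str
    # Column name -> required value; an empty dict means "all rows".
    keys: dict = field(default_factory=dict)

    def matches(self, row: dict) -> bool:
        # A row is selected only if every key condition holds.
        return all(row.get(col) == val for col, val in self.keys.items())


def select_rows(entry: TemplateEntry, rows: list) -> list:
    """Return the rows that the template entry would scramble."""
    return [row for row in rows if entry.matches(row)]


# Invented sample rows shaped like SBOOK records.
rows = [
    {"BOOKID": "00000001", "CLASS": "C", "PASSNAME": "Anna Example"},
    {"BOOKID": "00000002", "CLASS": "Y", "PASSNAME": "Ben Example"},
]

entry = TemplateEntry("SBOOK", "PASSNAME", keys={"CLASS": "C"})
selected = select_rows(entry, rows)
```

With the key `CLASS = 'C'`, only the first sample row is selected; an entry without keys would select every row.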

Scrambling functionality


Scrambled data has the same length as the original data but is filled with random alphanumeric characters. Because the amount of data to scramble varies, the program can be run in both foreground and background mode.
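The masking rule itself (same length, random alphanumeric characters) can be sketched in a few lines of Python. This is not the DST code, which is written in ABAP; the function name `scramble_value` and the exact character set (uppercase letters and digits) are assumptions made for the illustration.

```python
import secrets
import string

# Assumed character set; the post only says "random alphanumeric characters".
ALPHANUMERIC = string.ascii_uppercase + string.digits


def scramble_value(value: str) -> str:
    """Return random alphanumeric text with the same length as the input."""
    return "".join(secrets.choice(ALPHANUMERIC) for _ in value)


original = "Anna Example"
masked = scramble_value(original)
```

Because every character is replaced independently, the result carries no information about the original value apart from its length.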

Background mode works perfectly fine, but honestly, so far foreground mode has been fast enough for the cases I have used it in, and the amount of data that required masking was not small.

Let’s scramble data in the foreground based on the created template. After it finishes, a pop-up with details on the amount of scrambled data is shown. Double-clicking SBOOK afterwards confirms that the data can no longer be read.

Press "START DATA SCRAMBLING" and confirm.


When scrambling completes, a pop-up with logs is shown. Feel free to close it; the same information is saved in the Application Log.


Double-click SBOOK to check the results. The two table columns from the template, Customer Number (CUSTOMID) and Name of the Passenger (PASSNAME), are scrambled for Business Class passengers (SBOOK-CLASS = ‘C’).


Please bear in mind that DST will not run scrambling on a Production system (client role – P – Production). If you are brave enough, you can try, but most probably it will just show a funny message.
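The two client-role rules mentioned in this post (template editing only with client role C, no scrambling with client role P) can be summarized in a small Python sketch. The enum and function names are hypothetical, the role `T` (Test) is added only for the example, and how DST actually determines the client role is not shown here.

```python
from enum import Enum


class ClientRole(Enum):
    PRODUCTION = "P"
    CUSTOMIZING = "C"
    TEST = "T"  # assumed extra role, used only to illustrate a refreshed test client


def can_edit_template(role: ClientRole) -> bool:
    # Per the post, the template is editable only with client role C.
    return role is ClientRole.CUSTOMIZING


def can_scramble(role: ClientRole) -> bool:
    # Per the post, scrambling never runs with client role P.
    return role is not ClientRole.PRODUCTION
```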

Application Log


All actions performed in DST are saved to the Application Log under object ZDST. The log can be checked by hitting the ‘Application Log’ button or directly from t-code SLG1 using object ZDST.

The log contains details on the scrambling runs as well as on template changes.



Transport


After the template has been created, saved, and tested, it can be transported from the development system to the other systems of the landscape. A transport with the template data can be created by pressing the ‘TRANSPORT’ button.


The template can also be imported into Production. In that case, non-production systems will already have the template data in place after a refresh and will be ready to start scrambling.

Authorizations


DST is meant to be used by two groups of users. One group should be able to edit a template (functional consultants), and the second should be able to run data scrambling after system refreshes from Production (BASIS).

Accordingly, there is a custom authorization object Z_S_DST with activity ‘02’ for editing the template and ‘16’ for executing scrambling.

Giving all SAP consultants and business analysts access to t-code ZDST without the authorizations mentioned above can be handy: ZDST in display-only mode shows which data was scrambled in the current system.
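As a rough illustration of this split, the following Python sketch maps the activities of Z_S_DST to what a user can do in ZDST. The activity values ‘02’ and ‘16’ come from the post; the function name and the action labels are made up for the example, and the real check in DST is of course done in ABAP.

```python
ACTIVITY_EDIT_TEMPLATE = "02"   # activity for template edit, as stated in the post
ACTIVITY_RUN_SCRAMBLING = "16"  # activity for executing scrambling, as stated in the post


def allowed_actions(activities: set) -> set:
    """Map a user's Z_S_DST activities to hypothetical action labels."""
    actions = {"display"}  # anyone with access to t-code ZDST can at least display the template
    if ACTIVITY_EDIT_TEMPLATE in activities:
        actions.add("edit_template")
    if ACTIVITY_RUN_SCRAMBLING in activities:
        actions.add("run_scrambling")
    return actions
```

A functional consultant would hold only ‘02’, a BASIS user only ‘16’, and everyone else would stay in display-only mode.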

Information


Pressing the ‘INFORMATION’ button shows a document with DST details, including prerequisites.

Conclusion


The program has various checks in the UI, such as a warning that a modified template has not been saved when exiting, which makes it intuitive and safe to use.

It is possible to scramble table key fields, but adding them to the template shows a warning pop-up. There is a very small theoretical risk that key fields with different values but the same length receive the same value after scrambling, which can lead to data inconsistency.
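To get a feeling for how small this risk is, the standard birthday approximation can be used. The Python sketch below estimates the chance that at least two of n independently scrambled keys of the same length end up identical. The function name is hypothetical, and the alphabet size of 36 (uppercase letters plus digits) is an assumption, since the post does not state which characters DST uses.

```python
import math


def collision_probability(records: int, key_length: int, alphabet_size: int = 36) -> float:
    """Birthday approximation: chance that at least two of `records` random keys coincide."""
    space = alphabet_size ** key_length  # number of possible scrambled values
    if records > space:
        return 1.0  # more records than possible values: a collision is certain
    # 1 - exp(-n(n-1) / (2 * space)), written with expm1 for accuracy near zero.
    return -math.expm1(-records * (records - 1) / (2 * space))
```

Under these assumptions the estimate stays tiny for long keys and moderate record counts, but it grows quickly for short key fields, which is where the warning matters most.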

DST has been used in a couple of companies for the last few years; so far there has been only good feedback on its functionality and scrambling speed.

 

Thanks for reading this article. I hope you found it interesting and that DST will be useful. Please feel free to reach out to me with any suggestions or if you need additional information.

 

GitHub - Data Scrambling Tool (DST)


 

 
5 Comments
shais
Participant
A nice initiative.

Just to make it clear (I think it should be mentioned in bold): The solution scrambles the data directly in the database tables.

 
Former Member
That is correct, DST scrambles data directly in DB (as it should be).

I suggest playing with test data first, as described in the post. That is a safe way to understand how the tool works. When it comes to scrambling based on a real data template, it would be good to try it on a sandbox first.
this_yash
Participant
The idea is good and essential.

But would functionals/testers have to descramble the data every time they test? Since the SQL Query is just going to pick whatever there is. Or am I missing an important link? 🤔
Former Member
Once table fields have been updated with scrambled values, there is no straightforward way to recover the data. If you need new, unscrambled test data, the test data sets can be regenerated using the programs mentioned in the "Test data" section of the post.
NRK
Discoverer

Nice blog and very well written.

The point about key fields in the Conclusion is very interesting, and it raised a few questions in my mind.

It is possible to scramble table key fields..... but with the same length will get the same value after the scrambling, that can lead to data inconsistency. 

1) What kind of issues can be observed because of data inconsistency?

2) Does scrambling table key fields mean copying the record, deleting it at database level, scrambling the key field, and inserting the new record at database level?

3) In some cases, when scrambling key fields, a new value has to be generated for every record. How is this ensured?

 
