
In this blog, we will describe how SAP Datasphere can handle the "automatic handling of Cutoff Dates" from multiple attributes of the same Entity. More specific, on a recent engagement the customer wanted to migrate its current SAP BW infrastructure to SAP Datasphere one, but there was no direct solution regarding this specific situation.
Business Case
On this scenario, the Employee Entity could be described by many different attributes, such as "English Level" / "Employee Category" / "Division" and so on. During time, it was possible to have different values which could unique identify the Employee Entity.
For example, if we want to describe the Employee Category during time, we can easily conclude that for a period T1 the employee can has the value "Junior" and then for the upcoming or current period T2 can has a different value,such as "Senior". If we extend this, to all the required attributes, we conclude to numerous possible transitions which can overlap among them or not, during time. SAP BW is able to handle this kind of request directly but SAP Datasphere not yet.
Demo on SAP Datasphere
Two local tables created on Data Builder for the Employees attributes regarding "English Level" and "Category". Each one, had the different "picture" of Employee for a specific period. Then, by using the "Analytical View" option of Data Builder we create a sql-script which constructs the final result. Each combination of possible attributes per Employee should be unique per period.
This is a limited demo, on a real scenario the number of the possible Attributes could be dozens. Furthermore, a concern regarding the performance of this implementation came up from the customer. SAP Datasphere is able to provide two approaches on this question. Either by materializing the Result set on a Table either by using the Data Persistence option of Data Builder (materialize the ResultSet in memory).
Following is a generalised sql script which can be adapted to any scenario. Comments are provided on the script how to extend it.
SELECT DISTINCT
EMP_ID,ENGLISH_LEVEL,CATEGORY
,min(x)over(partition BY EMP_ID,ENGLISH_LEVEL,CATEGORY/*HERE I HAVE TO ADD AND ALL THE ADDED DIMENSION ATTRIBUTES */ ) AS VALID_FROM
,min(xx)over(partition BY EMP_ID,ENGLISH_LEVEL,CATEGORY/*HERE I HAVE TO ADD AND ALL THE ADDED DIMENSION ATTRIBUTES */ ) AS VALID_TO
--STEP 4 I APPLY THE RULE USING THE ROW_NUMBER =1 (MIN AND MAX) AND THEN I REDUCE 1 DAY ON THE VALID_TO CLM , EXCEPT THE LAST ONE PERIOD--
FROM (
SELECT
EMP_ID,ENGLISH_LEVEL,CATEGORY,
VALID_FROM,VALID_TO
,CASE WHEN RN_FROM = 1 THEN VALID_FROM END AS x
,CASE WHEN RN_TO = 1 THEN CASE WHEN VALID_TO = TO_DATE('2999-12-31') THEN TO_DATE('2999-12-31') ELSE VALID_TO END END AS xx
--STEP 3 OUR GOAL IS TO GROUP IN ONE ROW "SIMILAR GROUPS". BECAUSE I HAVE THE INFO IN MORE THAN 1 ROW. SO I HAVE TO ADD A ROW_NUMBER
-- ON ON THE VALID_FROM AND VALID_TO COLUMN IN ORDER TO BE ABLE TO TAKE THE MIN AND MAX COUPLE PER GROUPS
---
FROM (SELECT
EMP_ID,ENGLISH_LEVEL,CATEGORY,
VALID_FROM,VALID_TO,
ROW_NUMBER()over(partition BY EMP_ID,ENGLISH_LEVEL,CATEGORY/*HERE I HAVE TO ADD AND ALL THE ADDED DIMENSION ATTRIBUTES */ ORDER BY VALID_FROM) AS RN_FROM,
ROW_NUMBER()over(partition BY EMP_ID,ENGLISH_LEVEL,CATEGORY/*HERE I HAVE TO ADD AND ALL THE ADDED DIMENSION ATTRIBUTES */ ORDER BY VALID_TO desc) AS RN_TO
FROM (
SELECT EMP_ID,ENGLISH_LEVEL,CATEGORY,/* CONTINUE TO ADD THE ATTRIBUTES FROM THE REMAINING DIMENSIONS */
VALID_FROM,VALID_TO
FROM (
SELECT a.EMP_ID,a.VALID_FROM,a.VALID_TO,
b.ENGLISH_LEVEL
,c.CATEGORY
/* CONTINUE TO ADD THE ATTRIBUTES FROM THE REMAINING DIMENSIONS */
from
------------------------STEP 2
(SELECT * FROM
(SELECT
EMP_ID ,DATE_ AS VALID_FROM ,add_days(lead(DATE_)over(PARTITION BY EMP_ID ORDER BY DATE_),-1) AS VALID_TO
--STEP 1 , UNION ALL THE VALID FROM AND VALID TO FROM DIFFERENT_DIMENSIONS
FROM (
SELECT EMP_ID ,VALID_FROM AS DATE_ FROM ENGLISH_DIMENSION
UNION
SELECT EMP_ID ,VALID_TO FROM ENGLISH_DIMENSION
UNION
SELECT EMP_ID ,VALID_FROM FROM CATEGORY_DIMENSION
UNION
SELECT EMP_ID, VALID_TO FROM CATEGORY_DIM ENSION
/* UNION SELECT EMP_ID, VALID_TO FROM DIMENSION_x ...... */
)
)WHERE VALID_TO IS NOT null --EXCLUDE THE LAST DATE (DO NOT CREATE PERIOD)
)a
----------------------STEP 2 , FROM THE PERIOD CREATED I WILL JOIN TO ALL AVAILABLE DIMENSIONS
inner JOIN ENGLISH_DIMENSION b
ON a.EMP_ID = b.EMP_ID
AND a.VALID_FROM BETWEEN b.VALID_FROM AND b.VALID_TO
inner JOIN CATEGORY_DIMENSION c
ON a.EMP_ID = c.EMP_ID
AND a.VALID_FROM BETWEEN c.VALID_FROM AND c.VALID_TO
/* CONTINUE TO ADD THE INNER JOIN FROM THE REMAINING DIMENSIONS
inner JOIN CATEGORY_DIMENSION_x x
ON a.EMP_ID = x.EMP_ID
AND a.VALID_FROM BETWEEN x.VALID_FROM AND x.VALID_TO
...
...
inner join ...
*/
)
)
)
)
ORDER BY VALID_FROM
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
24 | |
23 | |
22 | |
15 | |
12 | |
10 | |
9 | |
7 | |
7 | |
7 |