cancel
Showing results for 
Search instead for 
Did you mean: 

SAC planning model - delete duplicate records

antonclv
Explorer
0 Kudos
691

Hi All

I am wanting to delete duplicate records from a model using a specified key (combination of dimensions).  For example, in the below "BEFORE" image, I have 2 records for Customer = '123'.  I would like to delete the top record where Customer Group = '#'.  I do not want to delete all records Customer Group = '#', because for the likes of Customer = '456', there is only 1 record.

I have tried coding an advanced formula data action, but can't seem to get it right.

Can someone please assist with the correct code needed, or perhaps suggest another way of giving the same result?

Thanks

Anton

 

antonclv_0-1720787798526.png

 

MichalCK
Participant

you need to aggregate on customer group and divide not aggregated value by aggregated. If the result is less than 1 then you have duplicates and you delete Customer Group #

 

VARIABLEMEMBER #CGROUP OF [d/CustomerGroup]

DATA([d/CustomerGroup] = #CGROUP) = RESULTLOOKUP()

IF RESULTLOOKUP() / RESULTLOOKUP([d/CustomerGroup] = #CGROUP) < 1 THEN
	DELETE([d/CustomerGroup] = "#")
ENDIF

 

more or less like that. I've not tested the above code, it should be possible to remove unwanted records. But if it is worth the effort, depends.

antonclv
Explorer
0 Kudos
@MichalCK - thanks for the suggestion, I'll see if that works.
MaxT
Explorer
0 Kudos

@MichalCK, I was aware of that. I tried your suggestion too but then I got this error message instead:

MaxT_0-1721205250035.png

 

MichalCK
Participant
@MaxT, you have to specify a measure either through MEMBERSET [d/Measure] = "XXXX" or directly in RESULTLOOKUPs

Accepted Solutions (0)

Answers (2)

Answers (2)

MaxT
Explorer
0 Kudos

Hi @MichalCK,

Could you show me a code where this worked? I think I've tried all possibly options, including your suggestions but I don't seem to obtain the preferred outcome.

Thanks in advance! 

MichalCK
Participant
0 Kudos
hi @MaxT, I don't have same example as you do. Code I gave you is a template which after some modifications should work
MichalCK
Participant
0 Kudos
Your case does not seem to be complicated. The code I wrote, does aggregate CustomerGroup into variable, then we check if current value divided by aggregated value is lower than 1. This will indicate that there are multiple CustomerGroup values. And then we delete # Customer group. I'm not able to provide you exact solution, as each model is specific and code would need to be adjusted to meet your data model (like eg. measure selection if you have multiple measures).
MaxT
Explorer
0 Kudos
@MichalCK, Could you a snipped of the code where you aggregated a dimension into a variable? Because I could understand how that would make your initial code possible.
MichalCK
Participant
0 Kudos
imagine aggregation as temporary table where value of aggregated column is replaced by #CGROUP, and amount is aggregated to 4000 (in your case). When you divide CustomerGroup AA by #CGROUP you will get value 0.5 which is < 1, and then we delete # Customer Group (this will execute probably few times, but will hit # just once). My code snippet will not help you. Please try to debug it to see what is there in data.
JefB
Active Contributor
0 Kudos

Something like this?

MEMBERSET [d/CUSTOMER] = "123"
MEMBERSET [d/CUSTOMERGROUP] = "#"
MEMBERSET [d/SALESORG] = "1000"
MEMBERSET [d/MEASURE] = "Amount"
DELETE()
antonclv
Explorer
Hi @JefB - thanks for the reply, but I am looking for a more dynamic solution as I have thousands of "duplicate" records in the table where one record has Customer Group = '#' and the other has it populated. The screenshot I gave was just for reference purposes.
deshmukh
Participant
0 Kudos
just trying to understand, cant we handle this in source system if it’s difficult in SAC
JefB
Active Contributor
0 Kudos
Then you can use MichalCK's code example, should work
MaxT
Explorer
0 Kudos

Hi @JefB 

I'm struggling with a similar issue. Therefore I was very happy to read this thread and try your proposed code.

However, when that is tried, it gives the error message that the operator "/" cannot be calculated. I think the reason for that is because this is done on a dimension instead of measure.

Therefore, I was curious if you have any suggestions on how to overcome that issue?

JefB
Active Contributor
0 Kudos
Hi, if you share your exact code it would be easier to recommend a solution. On MichalCK's code above I'm not so sure the division in IF works as is (at least not without round brackets () around it) and I would try to do it with something like IF RESULTLOOKUP()>0 THEN DELETE([d/CustomerGroup] = "#") ENDIF
MaxT
Explorer
0 Kudos

Apologies @JefB, I've tried to code from @MichalCK and to give a snipped of what I've resulted in the error mentioned in my previous message:

MaxT_0-1721202125959.png

This was the closest I came to trying his code.

So indeed the challenge I'm facing is that I have duplicated records where the record is once with Dimension "Deelneming" filled (anything else then "#") and once with it being "#". 

However, I also have cases of Dimension "Deelneming" being "#" where it doesn't contain another value. Those cases I don't want to delete but I only want to delete the cases of the duplicate records.

Hope that makes it a bit more clear, if not please let me know what additionally I need to share to give you a better idea.

MichalCK
Participant
0 Kudos

Hi @MaxT,

there is an issue in your code line 3, please try to use

IF RESULTLOOKUP() / RESULTLOOKUP([d/Deelneming] = #GROOTBOEKREKENING) < 1