As a PaPM developer, I have noticed a common trait across many PaPM environments: a large number of redundant, duplicated functions. The root cause appears to stem from PaPM being an SQL-based tool. Just as most SQL statements have a “FROM” clause, most PaPM functions require “input functions” at either the function or rule level. This means that if several process chains need to use the same set of functions, there are not many good options for reusing existing functions, which leads to a lot of duplication and redundancy. I have started looking for ways to combat this problem, both to reduce development time and to reduce maintenance and support effort.
To help illustrate the point, I will use a simple example. Suppose there is an environment with three processes defined in the Calculation Unit, which I will refer to as “Process1”, “Process2”, and “Process3”. For this exercise, we also have three Writers, “Writer1”, “Writer2”, and “Writer3”, corresponding to each process. Each Calculation Unit process also has a Parameter, P_ACTIVEPROCESS, which is assigned a number equal to the process number (1, 2, 3). The goal of this exercise is to find ways to minimize redundant functions by having one set of “common core” functions that all three processes use, while maintaining the unique “pre-processing” and “post-processing” functions that make each process chain unique.
I can think of two approaches to eliminating redundant functions. Option 1 gets the job done but is far from elegant and risks data cross-contamination if another process starts before the first one has completed. Option 2 is, I think, better, but requires the use of “complex selections” and other capabilities that are not well documented. Below I walk through each option to explain how it could work.
Option 1:
- Create a "common repository" using either a Model BW or Model Table function. This “common repository” will temporarily store the output of each process’s "pre-process" functions.
- Add a Writer function to the end of the “pre-process” functions and set the Writer’s option for "complete destruction of data target content" equal to "Yes".
- Add the new Writer function to Calculation Unit “Process1” as an execution step before the final Writer1 execution step.
- The new writer function will write the “pre-process” data to the “central repository” using the "complete destruction of data target content" option to “flush and fill” the “common repository” each time “Process1” is executed.
- Once the “pre-process” data has been written to the “common repository”, the “common core” functions will be triggered by “Writer1” at the end of “Process1”. This second part of the process extracts the data from the “common repository”, runs it through the “common core” functions, and then writes the results to the process’s own data repository.
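Since PaPM writers ultimately emit SQL, the “flush and fill” behavior of the “complete destruction of data target content” option can be thought of as a delete-then-insert against the “common repository”. The sketch below simulates that in Python with sqlite3; all table and column names (PREPROC_P1, COMMON_REPO, ACCOUNT, AMOUNT) are hypothetical stand-ins, not actual PaPM objects.

```python
import sqlite3

# Illustrative simulation only -- not PaPM code.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE PREPROC_P1 (ACCOUNT TEXT, AMOUNT REAL)")
cur.execute("CREATE TABLE COMMON_REPO (ACCOUNT TEXT, AMOUNT REAL)")
cur.executemany("INSERT INTO PREPROC_P1 VALUES (?, ?)",
                [("A100", 10.0), ("A200", 25.0)])
# A stale row left over from a previous process run.
cur.execute("INSERT INTO COMMON_REPO VALUES ('OLD', 99.0)")

# Writer with "complete destruction of data target content" = Yes:
cur.execute("DELETE FROM COMMON_REPO")                           # flush
cur.execute("INSERT INTO COMMON_REPO SELECT * FROM PREPROC_P1")  # fill

# The "common core" functions would now read only the fresh data.
rows = cur.execute(
    "SELECT ACCOUNT, AMOUNT FROM COMMON_REPO ORDER BY ACCOUNT").fetchall()
print(rows)  # [('A100', 10.0), ('A200', 25.0)]
```

The key point the sketch makes is that the stale row is gone before the new data lands, which is exactly why concurrent processes sharing this one table are dangerous.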
Problems/Observations with Option 1:
- There is a risk that more than one Calculation Process could be running simultaneously. For example, if “Process1” is still running and “Process2” is started, it could potentially result in mixing “Process1” and “Process2” data in the “common repository”.
- This option breaks the process flow into multiple steps and adds an extra data write to each process. With large data volumes, this could cause a performance hit compared to simply replicating the “common core” functions for each process.
Option 2:
- Create a Parameter called P_ActiveProcess, and each Calculation Unit Process would be assigned a value for this parameter, such as 1, 2, 3, that aligns with each process. For example, if “Process1” is executed, P_ActiveProcess would be set to equal 1. The same would apply to “Process2” and “Process3”.
- The first function of the “common core” would likely need to be a JOIN/UNION ALL. It is a “gatekeeper” function that only allows one process’s data to pass. The first FROM source would need a complex selection like “WHERE :P_ACTIVEPROCESS = 1” for “Process1”, and the same style of complex selection would be needed for “Process2” and “Process3”. Even though the UNION ALL join type is used for Processes 2 and 3, a source will not pass data unless the calling process has the matching value in P_ACTIVEPROCESS.
- Once the “gatekeeper” function is in place, each process can be executed; the output of the “pre-process” functions is passed into the “common core” functions, and the data is allowed to pass only if the P_ACTIVEPROCESS numbers match.
- Once the “common core” functions have completed, the data would be written to each process’s unique data repository using its associated Writer.
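The “gatekeeper” can be pictured as the SQL a JOIN/UNION ALL function might generate: each UNION ALL branch carries a complex selection on the parameter, so only the branch whose literal matches P_ACTIVEPROCESS returns rows. The sketch below simulates this with sqlite3; the table names (PREPROC_P1 etc.) are hypothetical, and the actual SQL PaPM generates will differ.

```python
import sqlite3

# Illustrative simulation only -- not PaPM-generated SQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for n in (1, 2, 3):
    cur.execute(f"CREATE TABLE PREPROC_P{n} (ACCOUNT TEXT, AMOUNT REAL)")
    cur.execute(f"INSERT INTO PREPROC_P{n} VALUES ('P{n}_ROW', {n * 10.0})")

# One branch per process; each branch's WHERE clause is the "complex
# selection" that compares the parameter to that process's number.
GATEKEEPER_SQL = """
SELECT ACCOUNT, AMOUNT FROM PREPROC_P1 WHERE :P_ACTIVEPROCESS = 1
UNION ALL
SELECT ACCOUNT, AMOUNT FROM PREPROC_P2 WHERE :P_ACTIVEPROCESS = 2
UNION ALL
SELECT ACCOUNT, AMOUNT FROM PREPROC_P3 WHERE :P_ACTIVEPROCESS = 3
"""

# Running "Process2" sets P_ACTIVEPROCESS = 2, so only its branch passes data.
rows = cur.execute(GATEKEEPER_SQL, {"P_ACTIVEPROCESS": 2}).fetchall()
print(rows)  # [('P2_ROW', 20.0)]
```

Because the WHERE clauses compare the parameter to a constant rather than to a column, each branch either returns all of its rows or none of them, which is what makes the function act as a gate rather than a filter.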
Problems/Observations for Option 2:
- The gatekeeper function works as intended only if the functions in the process chain are run with the correct Process. During testing, I found that if you pick “Process2” as the Process when running Writer1, it has no problem writing Process2’s data to Process1’s data repository. As far as I know, there is no way via a complex selection to determine which Process was selected to run the Writer function, even though it shows up in the “FS_” variables in the output.
- This first issue may not be a showstopper if the process chain is run via My Activities or a BW process variant, since Writer1 should then only be called with Process1. The only way to circumvent this is by running the Writer function manually from My Environments and selecting the wrong Process.
- Another option is to add an environment variable and manually set it equal to the process being run in the early steps of each process. For example, while pre-processing the data for Process1, set a variable such as Y_PROCESS = 1; then, when Writer1 runs, compare the values of Y_PROCESS and P_ACTIVEPROCESS and halt if they don’t match. The “common core” functions would still execute on the wrong data, but at least the results would not be written back to the wrong data repository.
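The guard described in the last bullet amounts to a simple equality check before the write step. PaPM has no Python hook, so the sketch below is only a model of the logic; the function name `guarded_write` and both variable names are hypothetical.

```python
# Illustrative model of the Y_PROCESS vs. P_ACTIVEPROCESS guard -- not PaPM code.
def guarded_write(rows, p_activeprocess, y_process):
    """Refuse to persist results if the Writer was started with the
    wrong Process (i.e., the parameter does not match the variable set
    during pre-processing)."""
    if y_process != p_activeprocess:
        raise RuntimeError(
            f"Halt: Writer run with Process {p_activeprocess}, "
            f"but pre-processing ran for Process {y_process}"
        )
    # In PaPM this is where the Writer would persist to the repository.
    return rows

guarded_write([("A100", 10.0)], p_activeprocess=1, y_process=1)  # passes
# guarded_write([("A100", 10.0)], p_activeprocess=2, y_process=1)  # would halt
```

As noted above, this guard stops the bad write but not the wasted computation: the “common core” work has already run on the wrong data by the time the check fires.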
That sums up most of what I know about reducing duplicate and redundant functions. I would be interested in hearing from other community members. There are probably numerous ways to improve on both Options 1 and 2 above, and likely options I have not thought about. Thanks in advance for your contribution to this topic.