on 2011 Dec 16 6:08 PM
Hello expert,
For BW 3.*, the InfoPackage offers several update methods to choose from.
With the methods "PSA and Data Targets/InfoObjects in Parallel (By Package)" and "PSA and then to Data Target/InfoObject (by Package)", global data does not exist when data is updated from the PSA to the data target. With the method "Only PSA", however, global data is kept for as long as the process that handles the data exists.
So my questions are:
(1) If there is no global data but I still want to do processing that needs it, can I save the individual data packages into an internal table and then do the global processing on that internal table?
(2) How do I choose between "PSA and Data Targets/InfoObjects in Parallel (By Package)" and "PSA and then to Data Target/InfoObject (by Package)"? I don't think the first method is really parallel loading; it is still serial by data package. Are all of these methods parallel only in the sense of loading by partition in the PSA?
(3) Which one performs best?
Many Thanks,
> (1) If there is no global data but I still want to do processing that needs it, can I save the individual data packages into an internal table and then do the global processing on that internal table?
In my experience there is no guaranteed way to share global memory across all data packages. For example, say you have 3 parallel processes and 6 data packages. Most likely data packages 1 and 4 will be processed in the first process, 2 and 5 in the second, and 3 and 6 in the third. If you populated global variables, each process's variables would contain data only from the two data packages handled by that process. Some workarounds people use are:
1) Set the data package size very high so that all data is guaranteed to arrive in a single data package. Not recommended: think about what may happen in exceptional cases such as initial data loads or an extraordinarily high volume of changes in the source. You may very well get a short dump with out-of-memory errors.
2) Stage the data in an interim DSO so that all of it is stored in BW, then do a data mart load to the further targets. At that point you can use Open SQL to query the active data table of the interim DSO. Write-optimized DSOs work particularly well for this.
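To make the point about per-process globals concrete, here is a small Python simulation. It is not BW or ABAP code, only a sketch: the function names are made up for illustration, and the round-robin hand-out of packages is an assumption that mirrors the "most likely" distribution described above. The second function mimics workaround 2 by staging every package first (a plain list stands in for the interim DSO) and only then doing a pass that needs a value from the whole load.

```python
from collections import defaultdict


def packages_seen_per_process(num_packages, num_processes):
    """Map each process to the data packages its own 'global' memory saw.

    Assumption: packages are handed out round-robin. That mirrors the
    "most likely" distribution described above; it is not guaranteed.
    """
    seen = defaultdict(list)
    for package in range(1, num_packages + 1):
        process = (package - 1) % num_processes + 1
        seen[process].append(package)  # globals only accumulate per process
    return dict(seen)


def stage_then_process(packages):
    """Workaround 2 in miniature: stage every package, then do a second pass.

    The list `staged` stands in for the interim DSO; the second pass can use
    a value (here a grand total) derived from the complete load.
    """
    staged = [record for package in packages for record in package]
    grand_total = sum(record["amount"] for record in staged)
    return [dict(record, total=grand_total) for record in staged]


if __name__ == "__main__":
    for process, packages in sorted(packages_seen_per_process(6, 3).items()):
        print(f"process {process} saw data packages {packages}")
```

With 6 packages and 3 processes, no process ever sees more than two packages, which is why a global internal table filled during the update cannot be trusted to hold the full load.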
> (2) How do I choose between "PSA and Data Targets/InfoObjects in Parallel (By Package)" and "PSA and then to Data Target/InfoObject (by Package)"? I don't think the first method is really parallel loading; it is still serial by data package. Are all of these methods parallel only in the sense of loading by partition in the PSA?
"PSA and Data Targets/InfoObjects in Parallel (By Package)" is parallel in the sense that it writes to both the PSA and the data target simultaneously. It is also parallel if there is more than one data package, but only to the degree that you have parallel processes configured (often 3).
> (3) Which one performs best?
"PSA and Data Targets in Parallel" will be faster than "PSA and then into Data Targets" since it writes to both the PSA and the data targets simultaneously. "Only PSA" is slowest, since the subsequent update from the PSA to the data targets is done sequentially in a single background process.
SAP did take a stab at the global memory problem with semantic grouping in DTPs, where you can group the data by predefined keys before it goes in.
The data package option is also not very reliable, because the packaging is done by record count or package size, whichever is reached first.
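A toy Python model of that "whichever limit is reached first" rule shows why you cannot rely on related records landing in the same package. This is only an illustration under stated assumptions: the function names, the `(key, size_kb)` record shape, and the limits are hypothetical, and the real extractor's packaging logic is not reproduced here.

```python
def split_into_packages(records, max_records, max_size_kb):
    """Split (key, size_kb) records into packages, closing a package as soon
    as either the record-count limit or the size limit would be exceeded."""
    packages, current, current_kb = [], [], 0
    for key, size_kb in records:
        too_many = len(current) >= max_records
        too_big = current_kb + size_kb > max_size_kb
        if current and (too_many or too_big):
            packages.append(current)
            current, current_kb = [], 0
        current.append((key, size_kb))
        current_kb += size_kb
    if current:
        packages.append(current)
    return packages


def keys_split_across_packages(packages):
    """Return the keys whose records ended up in more than one package."""
    where = {}
    for index, package in enumerate(packages):
        for key, _ in package:
            where.setdefault(key, set()).add(index)
    return sorted(key for key, indexes in where.items() if len(indexes) > 1)


if __name__ == "__main__":
    records = [("A", 40), ("A", 40), ("A", 40), ("B", 40)]
    # Same data, different limit hit first, different package boundaries.
    by_count = split_into_packages(records, max_records=3, max_size_kb=1000)
    by_size = split_into_packages(records, max_records=3, max_size_kb=100)
    print(keys_split_across_packages(by_count))
    print(keys_split_across_packages(by_size))
```

In this model the same four records keep key A together when the record limit closes the package, but split it across two packages when the size limit is hit first.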
You could use the "Only PSA" option, at the expense of data loading speed, if you still want to look at the entire contents of the load before updating the data targets. The most common scenario is to have an ODS, load it back into itself (a self-loop), and then load upwards from there.