This document provides common design guidelines for SAP BW on HANA.
BW modelling approach using LSA++
- Acquisition Layer
- Propagation Layer
- Architected Data Mart Layer
- Virtual Data Mart Layer
Important prerequisites during data flow design
- The standard data flow template can be used as a starting point.
- A rough idea of data volumes, both per load cycle and in total.
- The level of complexity of the transformations required in the overall design (simple/medium/complex).
Standard data flow templates to be used as the first option
- For a requirement that contains non-cumulative key figures, use the “Standard Data Flow for Non Cumulative Scenario”.
- For all other requirements, use the “Standard Data Flow”.
- The acquisition layer and the virtual data mart layer are always mandatory.
Basic guidelines to build the Acquisition layer
- Each source object (extractor) must have exactly one target object in the acquisition layer.
- All fields available in the source must be mapped and loaded into the target object.
- No transformation logic is applied to fields delivered by the source. The only exception is logic that removes unwanted characters which might otherwise cause data load failures in BW (see the sketch after this list).
- Adding fields is allowed only under the following conditions:
  - The added fields relate to the remaining fields in the record.
  - The added field can be derived using only data from within the source record itself; the derivation logic is not dependent on the business meaning of the data.
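The character cleanup exception can be implemented, for example, with a small helper called from a field or start routine. The following is only a sketch: the class and method names and the permitted character set are assumptions and need to be adapted to the project's own routine templates.

  " Illustrative sketch only: the names and the permitted character set
  " are assumptions, not a standard BW API.
  CLASS lcl_acq_cleanup DEFINITION.
    PUBLIC SECTION.
      CLASS-METHODS clean_value
        IMPORTING iv_raw          TYPE string
        RETURNING VALUE(rv_clean) TYPE string.
  ENDCLASS.

  CLASS lcl_acq_cleanup IMPLEMENTATION.
    METHOD clean_value.
      rv_clean = iv_raw.
      " Characteristic values are typically stored in upper case in BW
      TRANSLATE rv_clean TO UPPER CASE.
      " Replace characters outside the permitted set with spaces so the
      " record no longer fails the data load
      REPLACE ALL OCCURRENCES OF REGEX '[^A-Z0-9 /\._-]' IN rv_clean WITH ' '.
      CONDENSE rv_clean.
    ENDMETHOD.
  ENDCLASS.

In the field routine, the result of lcl_acq_cleanup=>clean_value( ) would simply be assigned to the target field; no further logic is applied in the acquisition layer.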
Basic guidelines to build the Virtual Data Mart layer
- This layer is always mandatory for any development that has reports as output.
- This layer must only contain objects of these types:
  - MultiProviders
  - Virtual Providers
  - Composite Providers
  - Hybrid Providers
  - Transient Providers
Basic guidelines to build the Architected Data Mart layer
- The objects in this layer are populated either directly from the acquisition layer or from the propagation layer. The data loads take place on a schedule determined by business requirements.
- It is modelled with standard DSOs.
- Wherever possible, only one layer of objects should be used in this architectural layer.
- Business transformations are performed in the load to this layer, that is, transformations specific to the individual business requirements of the top-level DSOs/objects that are used for the project's specific reporting requirements.
Basic guidelines to build the Propagation layer
- This layer is used to perform EDW transformations. EDW transformations comprise:
  - Transformations that do not rely on the business meaning of the data.
  - Transformations that apply to all rows, all the time, for all functional areas that use the data.
- The propagation layer will also be used to prepare data on a more frequent schedule than the layers above.
Common guidelines for all layers
- By default, all available navigational attributes should be included (switched on) in the objects of all layers (from the acquisition layer up to and including the virtual data mart layer). The only exceptions are the special security navigational attributes; these are managed by the authorizations team.
Usage of DSOs for Master Data
- For most InfoObjects that require master data, no separate DSO needs to be created to store that master data, except when:
  - The data needs to be propagated upwards with delta capabilities.
  - The master data is delivered by the source(s) in distinct pieces instead of in one coherent source (DataSource or table).
Use of Standard DSOs
A standard DSO is used in the following layers:
- Acquisition Layer – storing source data as received from the source system.
- Propagation Layer – an optional layer that technically combines source data to provide semantics to it.
- ADM Layer – adding business-specific fields. DSOs in this layer form the reporting basis and are included in MultiProviders to report on.
Standard DSO – Change Log Properties
In a HANA-optimized DSO, changes are no longer physically stored in the change log table. The active data table of the DSO is a so-called temporal table, in which records are not physically overwritten. Instead, write operations always insert new versions of the data record into the database. The different versions of a data record carry timestamp-like attributes that indicate their validity.
The most recent versions in the temporal table are called current data, and this is what is shown when you display the active data. The change log is a calculation view that uses these time-stamped records to construct the change history. Periodically deleting the change log is still required to clean up the DSO.
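The fragment below is a purely conceptual illustration of this temporal-table behaviour; the structure, semantic key and timestamp column are invented and do not reflect the tables and columns that BW actually generates.

  " Conceptual illustration only: structure and columns are invented.
  TYPES: BEGIN OF ty_version,
           doc_number TYPE c LENGTH 10,           " semantic key
           amount     TYPE p LENGTH 8 DECIMALS 2, " data field
           valid_from TYPE timestamp,             " version attribute
         END OF ty_version.

  DATA lt_versions TYPE STANDARD TABLE OF ty_version WITH DEFAULT KEY.

  " Every write operation inserts a new version instead of overwriting
  lt_versions = VALUE #(
    ( doc_number = '4711' amount = '100.00' valid_from = '20240101120000' )
    ( doc_number = '4711' amount = '120.00' valid_from = '20240201120000' ) ).

  " Current data corresponds to the most recent version per semantic key;
  " the older versions remain available to reconstruct the change history
  SORT lt_versions BY doc_number ASCENDING valid_from DESCENDING.
  DELETE ADJACENT DUPLICATES FROM lt_versions COMPARING doc_number.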
Standard DSO – Indexes
- Indexes that consist of only one column are no longer required on a HANA database, since data is stored in columns and, by definition, each column acts as an index. Indexes with multiple fields are still effective, however, especially if these columns are used in a SELECT FOR ALL ENTRIES statement (see the sketch below). The HANA in-memory database stores data in both rows and columns; it is this combination of storage approaches that produces the speed, flexibility and performance of the HANA database.
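As an illustration, consider a lookup on the active table of a DSO: a secondary index on the two key columns still supports this access pattern, while separate single-column indexes add nothing. The DSO name ZSD_O01, its active table /BIC/AZSD_O0100 and the field names are assumptions made up for this example.

  " Hypothetical example: DSO ZSD_O01, active table /BIC/AZSD_O0100,
  " fields DOC_NUMBER and S_ORD_ITEM.
  TYPES: BEGIN OF ty_doc_key,
           doc_number TYPE c LENGTH 10,
           s_ord_item TYPE n LENGTH 6,
         END OF ty_doc_key.

  DATA lt_keys   TYPE STANDARD TABLE OF ty_doc_key WITH DEFAULT KEY.
  DATA lt_lookup TYPE STANDARD TABLE OF /bic/azsd_o0100 WITH DEFAULT KEY.

  " lt_keys would typically be filled from the data package in a routine.
  " The guard matters: FOR ALL ENTRIES with an empty table selects everything.
  IF lt_keys IS NOT INITIAL.
    SELECT * FROM /bic/azsd_o0100
      INTO TABLE lt_lookup
      FOR ALL ENTRIES IN lt_keys
      WHERE doc_number = lt_keys-doc_number
        AND s_ord_item = lt_keys-s_ord_item.
  ENDIF.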
Write-Optimized DSO – Technical Settings
Property | Setting | Comment
Type of DSO | Write-optimized |
SID Generation | During Reporting |
Allow Duplicate Data Records | Yes | No unique secondary key is generated for the fields defined in the Semantic Key, allowing multiple records with the same Semantic Key
Check Delta Consistency | No |
Semantic Key | Include the InfoObjects that form the logical key of the data |
Write-Optimized DSO – Usage
Write-optimized DSOs should only be used in the following layers:
- Acquisition Layer – if the DataSource always delivers new records and never changes to existing records, for example Material Movements.
- Corporate Memory Layer – used as the storage object for the Corporate Memory.
Semantically Partitioned Object (SPO) – Technical Settings
- Always change the standard description of the PartProviders to something meaningful.
- Partition criteria:
  - For an InfoCube, all characteristics are allowed; for a DSO, only key fields are allowed.
  - For a DSO, place the partitioning criteria first in the primary key of the DSO.
Semantically Partitioned Objects – Usage
- The SPO is the default object to implement if logical partitioning is required. However, if performance is impeded too much by the additional processing caused by the generated InfoSources, manually built logical partitions are allowed.
- If you use an SPO to feed another SPO:
  - Do not use the SPOs directly as source or target of the transformation, because the DTPs in that case only offer extraction mode Full. Instead, always connect the outbound and inbound InfoSources with the central transformation and create DTPs from PartProvider to PartProvider.
InfoCube – Technical Settings
- Because dimension tables are no longer used, multidimensional modelling is no longer necessary. Use the dimensions for logical grouping of the characteristics.
- Physical (database) partitioning can no longer be set in a BW on HANA environment. It is handled internally by HANA, and four partitions are created automatically:
  - Partition 1 – non-compressed requests
  - Partition 2 – compressed requests
  - Partition 3 – reference points for inventory data
  - Partition 4 – historic movements of inventory data
- Partitions 3 and 4 are always created, regardless of whether inventory/non-cumulative key figures are present.
Transformations – Some guidelines
- The rule type Read Master Data should be used with caution. From a modelling perspective, looking up an attribute and storing it persistently means that you want to report the historical truth: the value of the attribute at that specific point in time (see the lookup sketch after this list). If the requirement is to report historical data against the current value of the attribute, a navigational attribute should be used instead.
- The referential integrity check for master-data-bearing characteristics should always be switched on in the first-level transformation (source is the PSA) if the source system is a non-SAP system.
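To illustrate the first guideline, the sketch below shows a persistent lookup of a master data attribute at load time, which freezes the value that was valid at that moment (the historical truth). It assumes the standard InfoObject 0MATERIAL with its material group attribute; in a real transformation the generated rule type Read Master Data, or the project's routine template, would be used instead of a hand-written SELECT.

  " Illustration only: assumes master data table /BI0/PMATERIAL with
  " attribute field MATL_GROUP; the material number is a made-up example.
  DATA lv_matl_group TYPE /bi0/oimatl_group.

  SELECT SINGLE matl_group
    FROM /bi0/pmaterial
    INTO lv_matl_group
    WHERE material = '000000000000004711'
      AND objvers  = 'A'.

  " lv_matl_group now holds the value as of load time and can be stored
  " in the target. A navigational attribute in the query would instead
  " always show the current value of the material group.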
MultiProviders – Usage
- If SPOs are used, include the SPO in the MultiProvider instead of the individual PartProviders.
- Use the MultiProvider dimensions for logical grouping of characteristics from a query developer’s perspective.
- If the MultiProvider contains base InfoProviders that are logically partitioned, for example by year, query pruning based on metadata restrictions should be used for optimal query performance (for SPOs, query pruning is handled automatically).
- Use ABAP program RSPP_PART_MAINTAIN to enter the partition criteria for a specific MultiProvider.
Composite Providers – Usage
- Composite Providers cannot be used in MultiProviders.
- InfoProviders can be combined via UNION, INNER JOIN and LEFT OUTER JOIN.
- One “initial” InfoProvider with a UNION connection is always needed as the basis.
- INNER versus LEFT OUTER JOIN: an INNER JOIN is always faster, but this might not always fit the requirement. If the performance degradation caused by the OUTER JOIN is too large, it might be worth the effort to model referential integrity during staging.
- Composite Provider versus MultiProvider: always model a MultiProvider if the model only contains UNIONs; the BW OLAP engine then ensures the best execution plan is used.
- Composite Providers can only be used in BW on HANA (and partially with BWA).
- Composite Provider versus InfoSet: if the InfoSet only contains data that can also be modelled in a CompositeProvider, the CompositeProvider is the better (faster) choice.
Data Transfer Process – Technical Settings
- A DTP from the PSA should always use extraction mode Delta, never Full.
- For other DTPs the default extraction mode should also be Delta; there can be requirements that justify a Full, but this should be avoided as much as possible.
- Error handling: use the default “No update, No Reporting”.
- If error handling is required, the semantic grouping in the DTP defines the key of the error stack. If a record has an error, all subsequent records with the same key are written to the error stack and not updated in the target, regardless of whether they contain errors themselves. This ensures that records are not updated out of order when the error stack is used.
DTP to Master Data Object:
- If the source is a DSO, set the flag to handle duplicate records.