Lambda architecture is designed to handle analytics over continuously changing, high-volume data (https://en.wikipedia.org/wiki/Lambda_architecture). Applied to ERP (Enterprise Resource Planning) data, it enables processing and analysis of large volumes of transactional and operational data, catering to both historical and real-time analytical needs. The Batch Layer processes data in batch runs at preset time intervals. In parallel, the Speed Layer handles real-time data streams, providing instant insight into live sales figures, stock levels, and other critical metrics using technologies such as Kafka. The Reporting Layer merges the outputs of the Batch and Speed layers into a unified view of the ERP data for quick access, which is essential for dashboards, ad-hoc reporting, and querying by various business applications.
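The core idea can be sketched in a few lines: the reporting layer serves queries by combining pre-computed batch results with the live deltas held in the speed layer. This is a minimal, framework-free illustration with hypothetical in-memory dictionaries standing in for the two layers (the actual layers in this scenario are HANA Cloud tables, as described below).

```python
# Hypothetical stand-ins for the two layers: the batch view holds totals
# computed up to the last batch run; the speed view holds live deltas that
# arrived after that run.
batch_view = {"1000": 500_000.0, "2000": 120_000.0}   # company code -> revenue
speed_view = {"1000": 2_500.0, "3000": 800.0}

def serve_query(batch: dict, speed: dict) -> dict:
    """Reporting-layer view: union batch results with real-time deltas."""
    merged = dict(batch)
    for key, delta in speed.items():
        merged[key] = merged.get(key, 0.0) + delta
    return merged

print(serve_query(batch_view, speed_view))
# {'1000': 502500.0, '2000': 120000.0, '3000': 800.0}
```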
Lambda Architecture
Here the SAP S/4 HANA system is used as the data source, and SAP Datasphere replicates the data from it into the layers. The setup overview looks like the following.
Lambda Architecture using SAP Datasphere and SAP HANA Cloud
Here, CDS Views of the SAP S/4 HANA system are used as data sources. These must be enabled for data extraction with a valid delta (CDC delta mechanism) setup.
The speed layer consists of 2 components:
The batch layer consists of 2 components:
The reporting layer consists of 2 components:
SAP Datasphere is the main component of this architecture and the orchestrator of the data. Its Replication Flows and spaces concepts are employed to replicate the data from the CDS Views of the SAP S/4 HANA system to the chosen target.
More information on Datasphere spaces can be found at https://learning.sap.com/learning-journeys/explore-sap-datasphere/introducing-sap-datasphere-spaces or https://developers.sap.com/tutorials/data-warehouse-cloud-4-spaces..html .
Currently, within a given Datasphere space, a data source (a CDS View of the SAP S/4 HANA system) can be replicated to only one target. This will change once the planned feature on the roadmap is delivered: https://roadmaps.sap.com/board?range=CURRENT-LAST&PRODUCT=73555000100800002141#Q1%202025;INNO=BBD862... .
Datasphere Spaces for Batch and Speed Layers
Connecting to an SAP S/4 HANA on-premise system requires a Cloud Connector setup. Detailed steps are available here
Pass the connection test
Follow the steps in the blog series to use an SAP BTP-compliant Kafka and connect it to SAP Datasphere: https://community.sap.com/t5/technology-blogs-by-sap/sap-datasphere-replication-flows-blog-series-pa... .
6. Run the Replication Flow and check the data in the target (here, the Kafka topic)
Check the data (the Kafka topic is created and the data is replicated into it)
Note: How you inspect a topic's details (via UI or CLI) differs depending on the chosen Kafka distribution.
Data is read from the Kafka topic, transformed and enriched, and the results are finally written to the corresponding "speed layer" table in the HANA Cloud database.
While this can be done in many ways, a simple BTP app is depicted here; the detailed steps involved, such as data cleansing, transformation, and enrichment (lookups etc.), depend on the scenario. The COMPANYCODE example chosen here is master data and not suitable for such steps, so they are skipped.
Repeat the connection steps from the "Speed Layer" section. As mentioned earlier, this is needed until the feature supporting multiple targets for the same data source is released: https://roadmaps.sap.com/board?range=CURRENT-LAST&PRODUCT=73555000100800002141#Q1%202025;INNO=BBD862... .
Provision an instance of HANA Cloud, Data Lake Files in the HANA Cloud Cockpit (the Relational Engine is not required)
Create a connection to HANA Cloud, Data Lake Files using the steps provided https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/356e41e880e54255891b702d2a... or https://community.sap.com/t5/technology-blogs-by-members/exporting-tables-from-datasphere-to-hana-da... .
Pass the connection test.
Data-lake analytics with Apache Spark is a huge topic with many possible architectures, but the Medallion lakehouse architecture (https://learn.microsoft.com/en-us/azure/databricks/lakehouse/medallion) is well suited to S/4 HANA data, since it is transactional data with deltas. It can be implemented directly with the delta.io libraries (https://delta.io/blog/delta-lake-medallion-architecture/), with Databricks (https://www.databricks.com/glossary/medallion-architecture), or with other frameworks and tools. This part is skipped here.
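Although the Spark implementation is out of scope, the medallion flow for CDC data can be illustrated in plain Python. This is only a conceptual sketch with hypothetical records and field names; a real implementation would use Spark with delta.io tables and an incremental MERGE instead of in-memory lists.

```python
# Bronze: raw CDC records landed from Data Lake Files, in arrival order.
bronze = [
    {"COMPANYCODE": "1000", "CURRENCY": "EUR", "__seq": 1},
    {"COMPANYCODE": "1000", "CURRENCY": "USD", "__seq": 2},  # later update wins
    {"COMPANYCODE": "2000", "CURRENCY": "EUR", "__seq": 3},
]

# Silver: deduplicate to the latest version per business key
# (a delta-lake MERGE would do this incrementally).
latest: dict = {}
for rec in sorted(bronze, key=lambda r: r["__seq"]):
    latest[rec["COMPANYCODE"]] = rec
silver = list(latest.values())

# Gold: a query-ready aggregate for the reporting layer,
# here a count of company codes per currency.
gold: dict = {}
for rec in silver:
    gold[rec["CURRENCY"]] = gold.get(rec["CURRENCY"], 0) + 1

print(gold)  # {'USD': 1, 'EUR': 1}
```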
The reporting layer in this scenario consists of HANA Cloud, Data Lake Files (containing the batch layer results as Gold-layer tables) and the HANA Cloud database (containing the speed layer results).
HANA Cloud native modeling can be used to develop Calculation Views that union the data from the speed and batch layers and finally enrich it using Calculation Views with star-join nodes.
Alternatively, Datasphere’s Views and Analytical Models in Data Builder can also be used.
The Analytical Models of Datasphere or the Calculation Views of HANA Cloud developed above can be used to create dashboards and analyses to support reporting.
Datasphere's Connections and Replication Flows, together with the HANA Cloud database and HANA Cloud, Data Lake Files, can be used to implement the lambda architecture for high-volume S/4 HANA systems with real-time reporting needs.