Motivation:
I'm quite certain that these terms - Data Mesh, Data Products, Self-Service, Governance, Catalog and Data Fabric - aren't completely new to you. You've likely encountered or read about them before.
They come up in every conversation I have with colleagues and customers who are either modernizing their data and analytics landscape or looking to stay ahead by continuously adopting the latest innovations and data architectures, so that their data becomes more purposeful for their business, users, and employees.
With the launch of SAP Datasphere (formerly known as SAP Data Warehouse Cloud), I was wondering how and where these terms fit in if we focus on Data Mesh with SAP Datasphere.
So, in this blog I share some scenarios for implementing Data Mesh with SAP Datasphere.
But before that, a little background on Data Mesh, its principles and SAP Datasphere (not detailed; for details, refer to wolfgang.albert.epting). I am not going to discuss Data Mesh and SAP Datasphere in depth in this blog, as many other blogs and sources already cover them in detail. However, I will share some links in this blog.
Before we start, I would also like to thank my dear colleague florian91 for the valuable inputs.
Intro:
What is Data Mesh? - Data Mesh is a decentralized data management architecture in which ownership of data lies with the domain teams (e.g., Sales, Finance etc.) instead of one central data team.
The most important thing to note is that Data Mesh is not a product or a tool; it's a concept. So how you implement it, and with which technology, will vary.
It is also important to first understand the four principles of Data Mesh and see which SAP Datasphere features meet the criteria of these principles.
1) Domain Ownership - This principle states that domain teams should take responsibility for their data and own it (analytically and operationally). After all, no one understands the data better than the domain teams themselves, as they are the ones who create the data and have a deep understanding of the business needs.
Mapping with SAP Datasphere --> SPACES are the central objects that meet the criteria of this principle. SAP Datasphere SPACES serve as independent work environments for individual departments, LOBs, data domains, project teams and other user groups or individuals. With the help of SPACES, domain teams can control access to their data products and make them available for consumption outside their domain. More on SPACES
2) Data as a Product - According to this principle, if a domain team needs to consume data owned by another domain team (e.g., Finance needs data from Sales), then it is the responsibility of the domain team owning the data to make the required data available to other domain teams in an agreed consumable format.
Mapping with SAP Datasphere --> The Data Marketplace feature provides a platform to exchange (consume and share) data via Data Products, both internally and externally. One can also define the visibility of the exchanged data via Contexts (Private, Internal or Public) and control access (with license keys, for a fee or free). More on Data Marketplace
3) Self-Service Data Platform - This is the place where different domain teams create, maintain and consume data products according to the guidelines set by the central governing team.
Mapping with SAP Datasphere --> This is the Datasphere tenant itself, running on SAP BTP on SAP HANA Cloud, where one can integrate data (federate or replicate) from any source (SAP or non-SAP), create data models and expose them as data products.
4) Federated Governance - This means there should be centrally defined standards for all cross-domain sharing and consumption of data products, so that it happens in a standardized and consistent way across the enterprise. These are the minimum standards that all data products should meet, e.g., naming conventions, sharing formats, structure of data products, documentation, and federation vs. replication decision criteria.
Mapping with SAP Datasphere --> These are the general best practices or guidelines, specific to the enterprise, for building and managing data products and objects. In terms of Datasphere features, however, the Data Catalog is the central place where one can discover, curate, and organize metadata about Assets. More on Catalog
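To make the idea of minimum standards a bit more concrete, here is a small sketch of how a central team could express such rules as an automated check. This is purely illustrative: the class `DataProductMetadata`, the function `check_minimum_standards`, the naming pattern `DP_<DOMAIN>_<NAME>` and the 20-character description rule are all assumptions made up for this example. It is not an SAP Datasphere feature or API.

```python
import re
from dataclasses import dataclass

# Hypothetical enterprise convention: DP_<DOMAIN>_<NAME> in upper case,
# e.g. DP_SALES_ORDERS. Replace with your own agreed standard.
NAME_PATTERN = re.compile(r"^DP_[A-Z]+_[A-Z0-9_]+$")
ALLOWED_ACCESS_MODES = {"federated", "replicated"}


@dataclass
class DataProductMetadata:
    """Illustrative metadata record; not a Datasphere object."""
    name: str
    owner_space: str
    description: str
    access_mode: str


def check_minimum_standards(product):
    """Return a list of violations; an empty list means the product passes."""
    violations = []
    if not NAME_PATTERN.match(product.name):
        violations.append("name does not follow DP_<DOMAIN>_<NAME>")
    if not product.owner_space.strip():
        violations.append("owner space is missing")
    if len(product.description.strip()) < 20:
        violations.append("description is shorter than 20 characters")
    if product.access_mode not in ALLOWED_ACCESS_MODES:
        violations.append("access mode must be 'federated' or 'replicated'")
    return violations
```

A domain team could run such a check before sharing a data product, while the rules themselves stay under the control of the central governing team.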
Since I am going to refer to the Data Marketplace in the scenarios discussed in this blog, it is important to understand the concepts of Context and Data provider visibility in the Data Marketplace. You can find more details in Using Contexts to Realize Public, Private, and Internal Data Marketplaces | SAP Help Portal
Types of Data Marketplace
- Public Data Marketplace
Your data products are visible to all consumers of SAP Datasphere without restrictions. Every consumer can find and acquire your data products.
- Private Data Marketplaces
You can create and run your own private data marketplace instead of publishing in the Public Data Marketplace. Only invited users can search and acquire your data products. It's also possible to invite other data providers to publish in your private data marketplace.
- Internal Data Marketplaces
You can create, or contribute to, internal data marketplaces. The visibility of internal data marketplaces is restricted to the members of specific tenants, or to individual users.
- Data Marketplaces owned by other data providers
You can become a contributor in data marketplaces that are run by other data providers (using activation keys).
Contexts in Data Marketplace
Context Type for Private Data Provider | Description |
Data Shop | You are the only data provider. Invited users can find and acquire your data products using activation keys, but there cannot be other data providers. |
Private Data Products | Invited consumers with data provider activation keys can become data providers and share their data with you (the context owner) for the creation of data products. |
Private Data Exchange | Build up a data exchange with multiple data providers and consumers (also from different companies). |
Context Type for Internal Data Provider | Description |
Internal Data Marketplace | All users of a tenant can acquire data products internally. Users from other tenants can only see these data products if they have been granted access individually with an activation key, or if their tenant was added explicitly. You can also control which users of the internal tenant are allowed to access data products. |
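The visibility rules above can be summarized in a small sketch. It is a simplified mental model only, not how the Data Marketplace is implemented: the classes `Context` and `User` and the function `can_acquire` are hypothetical names, and the model ignores the finer controls mentioned above (for example, restricting individual users inside the internal tenant).

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """Simplified, hypothetical model of a Data Marketplace context."""
    kind: str                 # "public", "private" or "internal"
    owner_tenant: str
    added_tenants: set = field(default_factory=set)   # tenants added explicitly
    invited_users: set = field(default_factory=set)   # users holding an activation key


@dataclass
class User:
    name: str
    tenant: str


def can_acquire(context, user):
    """Return True if the user can find and acquire products in this context."""
    if context.kind == "public":
        # Public: visible to every consumer without restrictions.
        return True
    if context.kind == "private":
        # Private: only invited users (activation key) can acquire.
        return user.name in context.invited_users
    if context.kind == "internal":
        # Internal: own tenant, explicitly added tenants, or invited users.
        return (
            user.tenant == context.owner_tenant
            or user.tenant in context.added_tenants
            or user.name in context.invited_users
        )
    raise ValueError(f"unknown context kind: {context.kind!r}")
```

For example, with an internal context owned by tenant "T1" and tenant "T2" added explicitly, users of both tenants can acquire the data products, while a user of a third tenant needs an individual activation key.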
Below is the architecture of SAP Datasphere with all the key capabilities:
Scenarios:
Scenario 1 - Single tenant of SAP Datasphere
Each domain team has its own dedicated SPACE (decentralized) to create and deploy artifacts (data models/views) that meet its individual analytical needs and also the needs of other domain teams (via data products). There are two options to share data products in this scenario.
1. Using SPACES - Each domain team can share data products directly with other domain teams via SPACES (blue highlighted lines), e.g., Finance can share data products with, and consume data products from, other domain teams directly (using SPACES).
OR
2. Publishing to Marketplace - A domain team (SPACE) can publish data products on the Data Marketplace, from where they can be consumed by other domain teams (SPACES) in a self-service way. This kind of marketplace is called an Internal Data Marketplace, e.g., Finance publishes data products on the Data Marketplace (yellow highlighted line) and they are consumed by the Central, Sales and R&D domain teams (SPACES) (green highlighted lines).
PS: Before publishing data products, a Data Provider Profile must be created or activated using the My Data Provider Profile app in the Data Sharing Cockpit.
The Data Provider team can also control who can access the data products within the internal tenant by using license keys and inviting selected users, and can hide the data products from the Data Marketplace section using the My Context app.
3. The Data Catalog can provide the necessary metadata about the deployed artifacts, along with self-service capabilities for data discovery and consumption within the same tenant. Data Catalog users can find and access artifacts within Datasphere by searching and filtering.
Scenario 2 - Multiple tenants of SAP Datasphere
Now, building on top of Scenario 1, say there are multiple SAP Datasphere tenants in the landscape. Some possibilities, for example:
- Tenant 1 for Region - North America and Tenant 2 for Region - APAC*
OR
- Tenant 2 for a particular Company Code (say 1000) and Tenant 1 for all the other Company Codes
In this scenario, if Tenant 1 needs to access data from Tenant 2 (say, to consolidate):
1) Tenant 2 will act as Data Provider and can publish its data products on the Data Marketplace, from where they can be consumed by the Central SPACE of Tenant 1 (green highlighted line). This kind of Data Marketplace scenario can be an External/Private Data Marketplace scenario.
PS: Data can be consumed by every user who is part of the context or has a license. So, in this case, the Central SPACE users should be added to the context or provided with licenses.
2) Within each tenant, domain teams can share data products using SPACES or Data Marketplace as discussed in Scenario 1 (i.e., using Internal Marketplace)
3) Again, Data Catalog can provide the necessary metadata about the deployed artifacts and self-service capabilities for data discovery and consumption by Tenant 1 and Tenant 2 Datasphere.
*PS: The SAP Datasphere tenants should be in the same landscape. You can find the landscape in the SAP Datasphere tenant URL (e.g., eu10, us10 etc.)
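As a quick way to compare two tenants, here is a minimal sketch that reads the landscape from a tenant URL. It assumes the URL has the form https://<tenant>.<landscape>.hcs.cloud.sap, which is an assumption of this example, so please verify it against your own tenant URLs. The function names `landscape_of` and `same_landscape` and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def landscape_of(tenant_url):
    """Extract the landscape (e.g. 'eu10') from a tenant URL.

    Assumed host layout: <tenant>.<landscape>.hcs.cloud.sap
    """
    host = urlparse(tenant_url).hostname or ""
    labels = host.split(".")
    if len(labels) < 3:
        raise ValueError(f"unexpected tenant URL: {tenant_url!r}")
    # The second label of the host name is taken as the landscape.
    return labels[1]


def same_landscape(url_a, url_b):
    """Return True if both tenant URLs point to the same landscape."""
    return landscape_of(url_a) == landscape_of(url_b)


# Hypothetical tenant URLs, for illustration only.
print(same_landscape(
    "https://tenant-na.eu10.hcs.cloud.sap",
    "https://tenant-apac.eu10.hcs.cloud.sap",
))
```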
Scenario 3 - Multiple tenants of SAP Datasphere
This is similar to Scenario 2 above; however, in this scenario a domain team (Sales) has its own dedicated SAP Datasphere tenant (e.g., because of special performance/compute requirements, data volume, data privacy etc.).
Data product sharing will be similar to Scenario 2, i.e., Tenant 2 publishes its data products on the Data Marketplace and they are consumed by Tenant 1 (by invited users), or vice versa. The Data Catalog can provide the necessary metadata about the deployed artifacts and self-service capabilities for data discovery and consumption of content by Tenant 1 and Tenant 2 Datasphere.
Closing:
So, if we now put all the above scenarios against the principles of Data Mesh, these scenarios meet the criteria of the Data Mesh principles:
- Decentralized
- Data as Product
- Self-Service
- Governance*
*Assuming there is a central platform team to manage the overall SAP Datasphere platform/landscape and set the ground rules/guidelines.
I have shared just three scenarios with two SAP Datasphere tenants; however, there could be many more possibilities with more than two SAP Datasphere tenants and BI tools.
Learning content and useful links on Datasphere:
Find out how to unleash the power of your business data with SAP's free learning content on SAP Datasphere. It's designed to help you enrich your data projects, simplify the data landscape, and make the most out of your investment. Check out even more role-based learning resources and opportunities to get certified in one place on the SAP Learning site.
What's New in SAP Datasphere | SAP Help Portal
SAP Datasphere | SAP Help Portal
Getting Started with SAP Datasphere | SAP Help Portal
SAP Road Map Explorer
Governing and Publishing Catalog Assets | SAP Help Portal
Data Marketplace - Data Provider's Guide | SAP Help Portal
Thank You!
Harji
------------------------
Harjinder Singh