10 Best Practice Recommendations for working with SAP Datasphere
SAP Datasphere, SAP’s cloud-based data warehouse and data fabric solution, has become a cornerstone of modern data architectures and has gained visibility in recent months, especially as the core of SAP Business Data Cloud. SAP Datasphere enables organizations to integrate, model, and consume data across hybrid landscapes. To unlock its full potential, companies should follow a set of proven best practices.
Here is a “top 10” list:
1. Establish a clear data architecture. Define layers, domains, ownership, and naming conventions upfront to avoid inconsistencies later. Naming conventions should indicate stable properties of your objects.
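One lightweight way to enforce such conventions is a validation script in your CI process. The sketch below checks object names against a hypothetical pattern; the layer prefixes and the structure are illustrative assumptions, not an SAP standard:

```python
import re

# Hypothetical convention (an example, not an SAP standard):
#   <layer>_<domain>_<name>, e.g. "RL_SALES_Revenue", where
#   layer is IN (inbound), HL (harmonization), or RL (reporting).
PATTERN = re.compile(r"^(IN|HL|RL)_[A-Z]+_[A-Za-z0-9]+$")

def check_name(name: str) -> bool:
    """Return True if the object name follows the convention."""
    return PATTERN.fullmatch(name) is not None

print(check_name("RL_SALES_Revenue"))  # True
print(check_name("myView2"))           # False
```

A check like this can run automatically whenever modeling objects are exported, flagging names that would create inconsistencies later.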
2. Use Spaces. Create as many spaces as needed to separate business units, projects, or environments effectively, ensuring proper governance and isolation. It’s a good idea to have dedicated spaces, for example, for specific consumption areas. But don’t create too many: each space must be maintained in scoped roles, more objects need to be shared across spaces, and resources must be allocated to every space you use.
3. Adopt a data product mindset. Build reusable, well-documented datasets with clear business meaning. Use the built-in options for documenting the artifact’s business purpose.
4. Indicate the Semantic Usage. Use Relational Dataset only for low-level objects. At higher levels, use Fact for data that contains aggregatable measures, and Dimension for pure language-independent attributes based on a specific key value. Use Text for language-dependent descriptions. Note that changing the semantic usage later is possible but may require additional specifications, e.g. a language field.
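The rule of thumb above can be expressed as a small decision helper. This is only an illustration of the guideline, not a Datasphere API:

```python
def suggest_semantic_usage(is_low_level: bool,
                           has_measures: bool,
                           is_language_dependent: bool) -> str:
    """Suggest a Semantic Usage following the rule of thumb above."""
    if is_low_level:
        return "Relational Dataset"  # low-level building blocks
    if has_measures:
        return "Fact"                # aggregatable measures
    if is_language_dependent:
        return "Text"                # language-dependent descriptions
    return "Dimension"               # key-based, language-independent attributes

print(suggest_semantic_usage(False, True, False))   # Fact
print(suggest_semantic_usage(False, False, False))  # Dimension
```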
5. Leverage the Analytic Model for business-specific KPIs and field names, but ensure consistent definitions for base measures and dimensions in the underlying views.
6. Minimize data replication. Whenever possible, use virtualization to reduce redundancy. Of course, there are cases where repeatedly accessing mass data remotely for several queries is too slow or puts too much stress on the source system. In such cases, implement a replication flow and a transformation flow with delta capture for a performance-optimized ELT process.
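The core idea behind a delta-capture-based ELT process is to load only records changed since the last run. Here is a minimal, tool-agnostic sketch of that watermark technique; the field names are illustrative, not Datasphere APIs:

```python
# Minimal sketch of the watermark technique behind a delta load.
# Timestamps are ISO-8601 strings (lexicographically comparable);
# field names are illustrative, not Datasphere APIs.

def extract_delta(source_rows, last_watermark):
    """Return rows changed since last_watermark, plus the new watermark."""
    delta = [r for r in source_rows if r["changed_at"] > last_watermark]
    new_watermark = max((r["changed_at"] for r in delta), default=last_watermark)
    return delta, new_watermark

# Example: only the second record is newer than the stored watermark.
orders = [
    {"id": 1, "changed_at": "2025-01-05T10:00:00"},
    {"id": 2, "changed_at": "2025-03-01T08:30:00"},
]
delta, watermark = extract_delta(orders, "2025-02-01T00:00:00")
print([r["id"] for r in delta])  # [2]
```

The stored watermark would be persisted between runs, so each execution replicates only the changed slice instead of the full dataset.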
7. Design models with performance in mind. Reduce the data set early: apply filters directly on original fields and push filters down to the source system. Prefer associations over joins, optimize joins, avoid unnecessary calculations, and monitor query execution. A key best practice is to adopt a tiered data architecture: define separate tables or partitions for different time slices. Frequently accessed, business-critical data (hot data) should remain in Datasphere’s “hot” core storage (in memory) for optimal performance. Less frequently requested data can be kept in “warm” storage (on disk), or even in a “cold”, file-based storage space to reduce costs (capacity units).
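The hot/warm/cold decision is typically driven by the age of the time slice. Here is a minimal sketch of such a routing rule, with retention thresholds chosen purely for illustration:

```python
from datetime import date

# Illustrative thresholds: 90 days hot, two years warm, older data is cold.
def storage_tier(slice_date: date, today: date,
                 hot_days: int = 90, warm_days: int = 730) -> str:
    """Route a time slice to hot (in-memory), warm (disk), or cold (file) storage."""
    age = (today - slice_date).days
    if age <= hot_days:
        return "hot"
    if age <= warm_days:
        return "warm"
    return "cold"

today = date(2025, 3, 1)
print(storage_tier(date(2025, 1, 1), today))  # hot
print(storage_tier(date(2024, 1, 1), today))  # warm
print(storage_tier(date(2020, 1, 1), today))  # cold
```

In practice the thresholds would follow your access patterns and capacity-unit budget rather than fixed day counts.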
8. Implement strong security practices. Define roles for space-level and activity-based authorization control with as little overlap as possible. Use data access controls (DACs) to control data visibility, ideally on a single level, for example on the highest view before the Analytic Model. Implement auditing and export audit log entries before they are deleted. Enable a password policy, and use encrypted communication for all network channels.
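Conceptually, a data access control works like a permissions entity joined against the protected view: each user sees only the rows whose criterion values they are permitted to see. A simplified sketch of that row-level filtering idea (not the actual DAC implementation):

```python
# Simplified permissions entity: pairs of (user, permitted country).
permissions = {
    ("alice", "DE"),
    ("alice", "FR"),
    ("bob", "US"),
}

def apply_dac(rows, user):
    """Keep only rows whose criterion value the user is permitted to see."""
    allowed = {country for (u, country) in permissions if u == user}
    return [r for r in rows if r["country"] in allowed]

sales = [
    {"country": "DE", "revenue": 100},
    {"country": "US", "revenue": 200},
]
print(apply_dac(sales, "alice"))  # [{'country': 'DE', 'revenue': 100}]
```

Applying the control on one well-chosen level keeps the security model auditable and avoids filters stacking up unpredictably across the view hierarchy.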
9. Establish lifecycle management. Define CI/CD processes to handle changes reliably across environments (tenants or spaces). Pre-define topic-related packages to group objects that must be transported together, but keep packages small enough that imports remain manageable. If you redefine the target space during import, develop and follow a clear guideline for space mapping.
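A space-mapping guideline can be as simple as a versioned lookup table that every import follows. A minimal sketch, with space names invented for illustration:

```python
# Versioned lookup: source space -> target space (names invented for illustration).
SPACE_MAPPING = {
    "DEV_SALES": "PRD_SALES",
    "DEV_FINANCE": "PRD_FINANCE",
}

def map_space(source_space: str) -> str:
    """Map a source space to its import target; unmapped spaces keep their name."""
    return SPACE_MAPPING.get(source_space, source_space)

print(map_space("DEV_SALES"))  # PRD_SALES
print(map_space("SANDBOX"))    # SANDBOX
```

Keeping the mapping in version control alongside the package definitions makes every import reproducible and reviewable.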
10. Finally, integrate SAP Datasphere within SAP Business Data Cloud. Enrich the possibilities with data science and analytics tools like SAP Databricks or Snowflake. Use SAP Analytics Cloud to deliver business value through intelligent applications, and use seamless planning to combine the strengths of both tools.
By combining governance, performance optimization, and user-centric design, organizations can build a scalable and future-proof data foundation with SAP Datasphere.
Final remark
Many aspects of best practice are specific to your needs and skills. For example, if you are familiar with SQL, you may be able to optimize data access in a complex case with a smart SQL script in an SQL view. However, the system sticks to your sequence of steps, so if your program design does not match your data patterns, especially if those patterns vary, a graphical view may be the better choice. (The system applies general performance optimization strategies to graphical views, which can make them faster even if they are not optimally designed.)
Moreover, check out our product home page, SAP Datasphere | SAP Help Portal, especially the SAP Datasphere Security Recommendations.
Feel free to ask your questions and share your own recommendations here in the community. To discuss your specific questions directly with a subject matter expert, visit our instructor-led training DSP01 – Introduction to SAP Datasphere.