updated date: 31.Oct.2023
This blog is part of the Business Continuity with RISE and BTP blog series:
part 1 – Concept Explained 👈
part 2 – Technical Building Blocks in RISE
part 3 – Technical Building Blocks in BTP
Nowadays more and more enterprises have digitalized their business with enterprise solutions (like SAP business applications and platforms). Business continuity with enterprise solutions is the level of readiness of a business to maintain critical functions after an emergency or disruption.
When companies move their enterprise solutions to the cloud (where the dominant infrastructure providers are Azure, AWS, and GCP), they gain more than a lower TCO and less operational overhead. The move is also an opportunity to redesign business continuity for their business-critical workloads, so that data, applications, and platforms stay available through potential waves of disruption.
Therefore, in this blog series, we go back to the fundamentals and focus on answering two questions: 'What is it?' and 'What makes it possible?'
1. Service Model Explained
On-Premises Model
- provider provides: software license
- customer manages: Networking, Storage, Servers, Virtualisation, O/S, Middleware, Runtime, Data, Applications
- this used to be the service model under which SAP provided enterprise software
Infrastructure as a Service (IaaS)
- customer manages: O/S, Middleware, Runtime, Data, Applications
- the dominant IaaS providers in the market are: AWS, Azure, GCP
Platform as a Service (PaaS)
- customer manages: Data, Applications
- examples: SAP BTP (including SAP Analytics Cloud, Datasphere)
Software as a Service (SaaS)
- fully managed service
- examples: SAP Concur, SAP Ariba, SAP SuccessFactors, SAP Fieldglass
- RISE with SAP, Private Cloud Edition is a SaaS-like managed service
2. Business Continuity Explained
Business continuity is mainly ensured by 4 key parts: High Availability, Disaster Recovery, Data Management, Change Management.
High Availability (HA) is about recovering from the failure of a single entity, typically a broken server or network switch. HA is measured against a Service Level Agreement (SLA). The higher the SLA promised for a system, the less downtime it is allowed to suffer.
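To make the relationship between SLA and downtime concrete, here is a minimal sketch that converts an availability percentage into the maximum downtime per year. The function name and the SLA values are illustrative assumptions, not commitments of any SAP or hyperscaler offering.

```python
def allowed_downtime_hours(sla_percent: float, period_hours: float = 365 * 24) -> float:
    """Return the maximum downtime (in hours) that still meets the given SLA.

    sla_percent is the promised availability, e.g. 99.9.
    period_hours defaults to one non-leap year.
    """
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_hours * (1 - sla_percent / 100)


# Illustrative SLA levels only (assumed values, not taken from any contract).
for sla in (99.5, 99.7, 99.9, 99.95):
    print(f"{sla}% availability allows {allowed_downtime_hours(sla):.2f} hours of downtime per year")
```

Under these assumptions, each additional "nine" cuts the permitted downtime by roughly a factor of ten, which is why higher SLAs usually require more redundancy.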
Disaster Recovery (DR) represents the resilience of the application / system foundation after an entire system fails, for example in catastrophic events like earthquakes or floods. DR is measured by Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Disaster Recovery can be built as In-Region (2 Availability Zones within 1 Region) or Cross-Region (2 Regions).
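As a rough illustration of how RTO and RPO are evaluated, the sketch below compares the outcome of a recovery (or a DR drill) against its targets. RTO is read here as the maximum acceptable time until service is restored, and RPO as the maximum acceptable window of lost data. The class, the field names, and the timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryRecord:
    """Hypothetical record of one disaster (or DR drill)."""
    last_recoverable_point: datetime  # newest data that survived the event
    outage_start: datetime            # moment the primary site was lost
    service_restored: datetime        # moment the service was usable again

    @property
    def achieved_rpo(self) -> timedelta:
        # Data written in this window is lost.
        return self.outage_start - self.last_recoverable_point

    @property
    def achieved_rto(self) -> timedelta:
        # Time the business had to operate without the system.
        return self.service_restored - self.outage_start


def meets_objectives(record: RecoveryRecord, rto_target: timedelta, rpo_target: timedelta) -> bool:
    """True only if both the time and the data-loss objective were met."""
    return record.achieved_rto <= rto_target and record.achieved_rpo <= rpo_target


# Assumed example values: 10 minutes of data lost, 3 hours until service was restored.
drill = RecoveryRecord(
    last_recoverable_point=datetime(2023, 10, 31, 9, 50),
    outage_start=datetime(2023, 10, 31, 10, 0),
    service_restored=datetime(2023, 10, 31, 13, 0),
)
print(meets_objectives(drill, rto_target=timedelta(hours=4), rpo_target=timedelta(minutes=30)))
```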
Data Management includes Data Backup, Data Replication, Snapshot, and Data Restore. See section 3.4 for details.
Change Management prevents application / system impairments caused by changes (a code change or a system upgrade) to a previously working state. With a standardised and automated pipeline, changes can be better governed and regulated. Section 3.5 covers automation in change management.
3. Key Technical Components
3.1. Trustworthy infrastructure
RISE with SAP Private Cloud Edition is built on top of trustworthy infrastructure provided by Azure, AWS, and GCP, which are the leaders in Cloud Infrastructure and Platform Services (according to Gartner Magic Quadrant).
Hyperscalers (Azure, AWS, GCP) group Data Centers with Cloud Computing resources into Availability Zones. Several Availability Zones within a certain close physical distance then compose a Region.
3.2. Monitoring, Failover, and Load Balancing
Monitoring refers to the practice of continuously observing the various components and systems involved in a high availability (HA) architecture. Effective monitoring in a high availability environment serves several purposes: early detection of failures, proactive maintenance, performance optimization, and SLA compliance.
Failover in high availability refers to the process of automatically switching from a failed or degraded primary system to a secondary system in order to maintain uninterrupted service. It is a key component of high availability architecture, which aims to minimize downtime and ensure continuous operation of critical systems.
Load Balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing more efficient. A Load Balancer is a component that implements load balancing and applies both monitoring and failover. With load balancers, the failure of a primary component can be detected, and the traffic can then be redirected to the redundant component.
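The interplay of monitoring, failover, and load balancing can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Backend` and `LoadBalancer` classes are hypothetical, the health state is a plain flag that a real monitoring probe would set, and round robin is just one possible distribution strategy.

```python
class Backend:
    """Hypothetical backend; 'healthy' stands in for the result of a monitoring probe."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True


class LoadBalancer:
    """Round-robin load balancer that skips unhealthy backends (failover)."""

    def __init__(self, backends: list) -> None:
        self.backends = list(backends)
        self._next = 0  # index of the backend to try first on the next request

    def route(self) -> str:
        """Return the name of the backend that should serve the next request."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend.healthy:
                return backend.name
        raise RuntimeError("no healthy backend available")


lb = LoadBalancer([Backend("primary"), Backend("secondary")])
print([lb.route() for _ in range(4)])  # traffic is spread over both backends

lb.backends[0].healthy = False         # monitoring detects that 'primary' failed
print([lb.route() for _ in range(3)])  # all traffic is redirected to 'secondary'
```

In this sketch, failover needs no separate mechanism: once monitoring marks a component as unhealthy, the routing decision simply stops selecting it.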
3.3. Compute Redundancy and Clustering
Redundancy is the intentional duplication of critical components or functions of a system with the goal of increasing reliability of the system.
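A short calculation shows why duplication increases reliability. The sketch below assumes that the copies fail independently and that a single working copy is enough to keep the function available. Both are simplifying assumptions, and the availability figures are illustrative.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of 'copies' redundant components, each with availability 'single'.

    Assumes independent failures: the function is unavailable only when
    every copy is down at the same time.
    """
    if not 0 <= single <= 1:
        raise ValueError("single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - single) ** copies


# Illustrative value: one component that is available 99% of the time.
for n in (1, 2, 3):
    print(f"{n} copies -> {combined_availability(0.99, n):.6f}")
```

In practice failures are often correlated (shared power, shared network, shared software bugs), which is one reason redundant copies are placed in different Availability Zones or Regions.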
Clustering (also known as high-availability clusters, or fail-over clusters) is a mechanism that groups several resources serving a similar purpose as nodes in one cluster. The cluster can then be treated as a single unit when those resources are accessed. The nodes within one cluster can be used for load balancing or for failover from one node to another, whenever an availability or scalability scenario occurs.
3.4. Data Redundancy (Backup, Snapshot, Replication, Recovery)
Data Backup creates a redundant copy of data, so that the copy can be used to restore the original after a data loss event.
Snapshot is a method of creating a backup, and is mostly taken at the volume level.
Data Replication has 2 categories: Streaming Replication (which can be synchronous or asynchronous) and Backup Replication. Streaming Replication is transaction-based and can operate at disk level, OS level, database level, or application level. Backup Replication can be snapshot-based (more efficient) or file-based, where backups are treated as files.
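The practical difference between synchronous and asynchronous streaming replication is how much data can be lost on failover. The following toy model illustrates this under simplifying assumptions: the `Primary` class and its methods are hypothetical, and real replication also has to deal with latency, ordering, and acknowledgements.

```python
class Primary:
    """Toy primary system that replicates committed records to a replica list."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.log = []      # records committed on the primary
        self.replica = []  # records that have reached the secondary
        self.pending = []  # records waiting to be shipped (asynchronous mode only)

    def commit(self, record: str) -> None:
        self.log.append(record)
        if self.synchronous:
            # The commit only completes once the replica has the record.
            self.replica.append(record)
        else:
            # The commit completes immediately; shipping happens later.
            self.pending.append(record)

    def ship_pending(self) -> None:
        """Background shipping step for asynchronous replication."""
        self.replica.extend(self.pending)
        self.pending.clear()

    def records_lost_on_failover(self) -> int:
        """Records that exist on the primary but not yet on the replica."""
        return len(self.log) - len(self.replica)


sync_db = Primary(synchronous=True)
async_db = Primary(synchronous=False)
for db in (sync_db, async_db):
    db.commit("order-1")
    db.commit("order-2")
async_db.ship_pending()
for db in (sync_db, async_db):
    db.commit("order-3")  # the primary fails right after this commit

print(sync_db.records_lost_on_failover(), async_db.records_lost_on_failover())
```

In this model the synchronous setup loses nothing, at the cost of waiting for the replica on every commit, while the asynchronous setup loses whatever had not been shipped yet.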
Data Recovery is the process of retrieving deleted, inaccessible, lost, corrupted, damaged, or formatted data from data backup.
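One decision in any recovery is which backup to restore from. A minimal sketch, assuming that backups are identified only by their creation time and that the moment the data became corrupted is known (the function name and timestamps are hypothetical):

```python
from datetime import datetime


def pick_restore_point(backup_times: list, corruption_time: datetime) -> datetime:
    """Return the newest backup taken strictly before the data was corrupted."""
    candidates = [t for t in backup_times if t < corruption_time]
    if not candidates:
        raise LookupError("no backup predates the corruption")
    return max(candidates)


# Assumed nightly backups at 02:00 and a corruption detected to have started at 14:30.
backups = [datetime(2023, 10, day, 2, 0) for day in (28, 29, 30, 31)]
print(pick_restore_point(backups, corruption_time=datetime(2023, 10, 30, 14, 30)))
```

Everything written between the chosen backup and the corruption is lost unless it can be replayed from another source, which ties data recovery back to the RPO.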
3.5. Automation via IaC and CI/CD
Infrastructure as Code (IaC) is the managing and provisioning of infrastructure through code instead of through manual processes. IaC helps automate infrastructure changes. The benefits of IaC include: cost reduction, faster deployments, fewer errors, improved infrastructure consistency, and elimination of configuration drift. IaC can be integrated into a CI/CD pipeline.
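The core idea behind IaC, and behind the elimination of configuration drift, is to compare a declared desired state with the actual state and derive the changes needed. The sketch below illustrates that idea only. It does not represent the API of any specific IaC tool, and the resource names and attributes are made up.

```python
def plan_changes(desired: dict, actual: dict) -> dict:
    """Compare desired and actual infrastructure and return the changes needed."""
    return {
        # Declared in code but missing in the environment.
        "create": sorted(name for name in desired if name not in actual),
        # Present in both, but the environment has drifted from the declaration.
        "update": sorted(
            name for name in desired
            if name in actual and actual[name] != desired[name]
        ),
        # Present in the environment but no longer declared in code.
        "delete": sorted(name for name in actual if name not in desired),
    }


# Hypothetical desired state (kept in version control) and actual state.
desired_state = {
    "app-server": {"size": "large", "zone": "az-1"},
    "standby-server": {"size": "large", "zone": "az-2"},
}
actual_state = {
    "app-server": {"size": "medium", "zone": "az-1"},  # drifted: resized manually
    "test-server": {"size": "small", "zone": "az-1"},  # created manually, undeclared
}
print(plan_changes(desired_state, actual_state))
```

Because the desired state lives in code, the same comparison can run on every change in a pipeline, which is how IaC fits into CI/CD.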
CI/CD is a key part of DevOps practice, and greatly automates processes in software development. Although frequent changes may increase the possibility of impairments, accumulating many changes into one large change can be more detrimental (due to the complexity of the change). Therefore, small and frequent changes with governance are recommended: Continuous Integration/Continuous Delivery (CI/CD).
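The governance part of CI/CD can be pictured as a sequence of gates that each small change has to pass before it reaches production. A minimal sketch, with hypothetical stage names and checks standing in for real build, test, and deployment steps:

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing one.

    Returns the names of the stages that passed, so a failed gate
    prevents every later stage (such as deployment) from running.
    """
    passed = []
    for name, check in stages:
        if not check():
            break
        passed.append(name)
    return passed


# Hypothetical checks: here the automated tests fail for this change.
stages = [
    ("build", lambda: True),
    ("automated tests", lambda: False),
    ("deploy to production", lambda: True),
]
print(run_pipeline(stages))
```

With small changes, a failed gate points to a small set of possible causes, which is the argument for frequent, governed changes over rare, large ones.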
Disclaimer
- The blog content does not necessarily represent the official opinion of SAP, Microsoft, Amazon Web Services, or Google Cloud. The opinions appearing in this blog are backed by SAP, Azure, AWS, and GCP documentation, which can be found through the corresponding reference links.
- The blog content focuses only on technical discussion. It cannot be used as a commercial basis, nor should it be used as official SAP offering documentation.
Acknowledgment to contributors/reviewers/advisors:
Ke Ma (a.k.a. Mark), author, Senior Cloud Architect, RISE Cloud Advisory RA group
Special THANK YOU to RISE with SAP community members, who contributed to this blog:
Ferry Mulyadi, Partner Solution Architect, Amazon Web Services
Micah Waldman, Product Management Lead, Google Cloud Business Continuity
Thorsten Staerk, Customer Engineer, Google Cloud
Frank Gong, Digital Customer Engagement Manager, SAP ECS
Marc Koderer, Chief Architect, SAP ECS
Boris Maeck, Head of Technology and Architecture, SAP ECS
Chun Yuan, DevOps Engineer, SAP BTP Cloud Foundry Platform
Zabala Silvestre, Product Owner, SAP BTP Cloud Foundry Platform
Aaron Smyth, Principal Service Architect, SAP
Sven Bedorf, Head of Cloud Architecture & Advisory, RISE Cloud Advisory, MEE
Kevin Flanagan, Head of Cloud Architecture & Advisory, RISE Cloud Advisory, EMEA North
Luc DUCOIN, Cloud Architect & Advisor Expert, RISE Cloud Advisory, EMEA North
Richard Traut, Head of Cloud Architecture & Advisory, RISE Cloud Advisory, EMEA North
Peter van den Berg, Cloud Architect & Advisor Expert, RISE Cloud Advisory, MEE
Extended Reading:
Reliability Pillar, from AWS Well-Architected Framework
AWS Prescriptive Guidance - Resilience lifecycle framework, by AWS
Disaster Recovery of Workloads on AWS, from AWS Well-Architected Framework