Enterprises increasingly rely on dedicated platforms to manage and make use of their data, and SAP Data Intelligence is a comprehensive data management solution in this domain. It pairs well with the flexibility of Kubernetes-based platforms like OpenShift and with provisioning technologies like HyperShift. In this blog post, we walk through creating a nested OpenShift environment using HyperShift and show how SAP Data Intelligence can be installed on that setup for testing and validation purposes.
Understanding Nested OpenShift and HyperShift
Nested OpenShift: Nested OpenShift involves deploying an OpenShift cluster (a container orchestration platform) within another OpenShift cluster. This architecture enables developers and administrators to experiment with multi-cluster setups and test various configurations in a controlled environment.
HyperShift: HyperShift is a provisioning technology that facilitates the deployment of nested OpenShift clusters. It is designed to simplify and automate the setup of complex nested architectures, so you can spin up nested OpenShift instances quickly and with little manual configuration. HyperShift provides the hosted control planes feature: the control plane components of each hosted cluster run as pods in the existing base cluster, inside a namespace dedicated to that hosted cluster. In other words, the nodes of the base cluster host the control plane.
The list below highlights the benefits of using the HyperShift KubeVirt provider:
- Enhance resource utilization by packing multiple hosted control planes and hosted clusters into the same underlying bare metal infrastructure.
- Provide strong isolation by separating hosted control planes and guest clusters.
- Reduce cluster provisioning time by eliminating the bare metal node bootstrapping process.
- Manage multiple different releases under the same base OCP cluster.
Creating a Nested OpenShift Cluster Using HyperShift
Step 1: Prerequisites
Before you begin, ensure you have the following in place:
OCP 4.12+ is running as the underlying base OCP cluster on top of bare metal nodes (HyperShift will be GA in OCP 4.14 in October 2023).
The required operators and controllers are listed as follows:
- OpenShift Data Foundation (ODF) using local storage devices
- OpenShift Virtualization
- MetalLB
- Multicluster Engine
- Cluster Manager
- HyperShift
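The version requirement above can be checked from the command line. The following is a minimal sketch, not part of the original procedure: the function names `ocp_version_ok` and `base_cluster_version` are hypothetical helpers, and the script assumes `oc` is already logged in to the base cluster.

```shell
# Succeeds when a version string such as "4.12.3" is at least 4.12.
# (Hypothetical helper, named for illustration only.)
ocp_version_ok() {
  major="${1%%.*}"      # text before the first dot
  rest="${1#*.}"        # text after the first dot
  minor="${rest%%.*}"   # text before the next dot
  [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 12 ]; }
}

# Prints the version of the cluster that oc is currently logged in to.
base_cluster_version() {
  oc get clusterversion version -o jsonpath='{.status.desired.version}'
}
```

For example, `ocp_version_ok "$(base_cluster_version)" && echo "base cluster is new enough"` prints the message only when the base cluster meets the 4.12 minimum.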
Step 2: Provisioning the Nested OpenShift Cluster
Please follow the detailed installation instructions in the Red Hat Hybrid Cloud blog post, “Effortlessly And Efficiently Provision OpenShift Clusters With OpenShift Virtualization”, to set up the nested cluster. In general, the following components need to be installed.
OpenShift Data Foundation
On bare metal nodes, OpenShift Data Foundation (ODF) with local storage devices can be used as the default software-defined storage to persist the guest cluster's etcd pods and VM workers.
OpenShift Virtualization
The OpenShift Virtualization Operator is an add-on to OCP that allows you to run and manage virtual machines alongside pods. HyperShift with the KubeVirt provider allows you to run guest cluster components using KubeVirt virtual machines.
MetalLB
MetalLB is recommended as the network load balancer for bare metal clusters.
Multicluster Engine
Multicluster engine (MCE) is one of the core components of HyperShift. Make sure to install version 2.2.0 or later in order to launch 4.12 guest clusters.
Cluster Manager
The local-cluster ManagedCluster allows the MCE components to treat the cluster it runs on as a host for guest clusters.
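A minimal sketch of how the local-cluster ManagedCluster might be created is shown below. It is an illustration rather than the exact manifest used for this post: the wrapper function `register_local_cluster` is a hypothetical name, and the label and spec values follow common Open Cluster Management examples, so compare them with the referenced blog post before applying.

```shell
# Registers the host cluster with MCE under the name "local-cluster".
# Assumption: oc is logged in to the host cluster with cluster-admin rights.
register_local_cluster() {
  oc apply -f - <<EOF
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: local-cluster
  labels:
    local-cluster: "true"
spec:
  hubAcceptsClient: true
  leaseDurationSeconds: 60
EOF
}
```

Calling `register_local_cluster` applies the manifest; `oc get managedcluster local-cluster` can then be used to see whether the cluster has been accepted.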
HyperShift
Finally, the HyperShift operator can be launched by applying the following example YAML to the local cluster:
oc apply -f - <<EOF
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: hypershift-addon
  namespace: local-cluster
spec:
  installNamespace: open-cluster-management-agent-addon
EOF
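The add-on above installs the HyperShift operator; the guest cluster itself is then created as described in the referenced blog post. The sketch below shows roughly what those two steps can look like with the `oc` and `hypershift` command-line tools. The function names are hypothetical, the cluster name `kv-00` is inferred from the `clusters-kv-00` namespace used in this post, and the default worker count, memory, and core values are placeholders rather than sizing recommendations for SAP Data Intelligence.

```shell
# Shows whether the hypershift-addon is available on the host cluster.
hypershift_addon_status() {
  oc get managedclusteraddons -n local-cluster hypershift-addon
}

# create_guest_cluster NAME PULL_SECRET_FILE [WORKERS] [MEMORY] [CORES]
# Creates a guest cluster whose workers are KubeVirt virtual machines.
# The defaults (3 workers, 8Gi, 2 cores) are illustrative assumptions.
create_guest_cluster() {
  hypershift create cluster kubevirt \
    --name "$1" \
    --pull-secret "$2" \
    --node-pool-replicas "${3:-3}" \
    --memory "${4:-8Gi}" \
    --cores "${5:-2}"
}
```

For example, `create_guest_cluster kv-00 "$HOME/pull-secret"` would request a guest cluster named kv-00 with the default sizing.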
At this point, you will have a nested OpenShift cluster running in a namespace of the host cluster, e.g., clusters-kv-00. Here is an example of all the worker nodes that were created, which are virtual machines from the host cluster's point of view:
Here is an example that demonstrates that the guest cluster's control plane runs as pods in a specific namespace of the host cluster:
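Listings like the two described above can be produced on the host cluster with standard `oc` queries. This is a small sketch under the assumption that `oc` is logged in to the host cluster; the function names are hypothetical, and the namespace is passed in as an argument (clusters-kv-00 in this post).

```shell
# list_guest_workers NAMESPACE
# Lists the running virtual machine instances that back the guest workers.
list_guest_workers() {
  oc get vmi -n "$1"
}

# list_guest_control_plane NAMESPACE
# Lists the pods in the namespace, including the hosted control plane pods.
list_guest_control_plane() {
  oc get pods -n "$1"
}
```

For example, `list_guest_workers clusters-kv-00` followed by `list_guest_control_plane clusters-kv-00` shows both views of the same guest cluster.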
Step 3: Configuring the Nested OpenShift Cluster
SAP Data Intelligence requires block and object storage services. If you want to provide them through OpenShift Data Foundation (ODF) in the guest cluster, attach one additional disk to each of three virtual machines. Once that is done, ODF can be installed on these local disks in the guest cluster.
To achieve that, you need to stop three virtual machines and attach a disk of the same size to each of them. Here is an example of a 30 GiB disk attached to one VM.
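One possible way to script the disk step is sketched below; it is not necessarily how the disks were attached for this post. It creates a PersistentVolumeClaim on the host cluster and adds it to a virtual machine's spec as an extra virtio disk. The function names, the `-odf-disk` naming scheme, the choice of a block-mode volume, and the default storage class `ocs-storagecluster-ceph-rbd` are all assumptions to adapt to your environment.

```shell
# create_odf_disk VM NAMESPACE [SIZE] [STORAGE_CLASS]
# Creates a block-mode PVC on the host cluster to serve as the extra disk.
create_odf_disk() {
  oc apply -n "$2" -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${1}-odf-disk
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  storageClassName: ${4:-ocs-storagecluster-ceph-rbd}
  resources:
    requests:
      storage: ${3:-30Gi}
EOF
}

# attach_odf_disk VM NAMESPACE
# Adds the PVC created above to the VM definition as a virtio disk.
attach_odf_disk() {
  oc patch vm "$1" -n "$2" --type=json -p "[
    {\"op\":\"add\",\"path\":\"/spec/template/spec/domain/devices/disks/-\",
     \"value\":{\"name\":\"odf-disk\",\"disk\":{\"bus\":\"virtio\"}}},
    {\"op\":\"add\",\"path\":\"/spec/template/spec/volumes/-\",
     \"value\":{\"name\":\"odf-disk\",
                \"persistentVolumeClaim\":{\"claimName\":\"${1}-odf-disk\"}}}
  ]"
}
```

A typical sequence for one VM would be to stop it with `virtctl stop`, run `create_odf_disk` and `attach_odf_disk` with the VM name and namespace, and then start it again with `virtctl start`, repeating this for each of the three virtual machines.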
Installing SAP Data Intelligence
With your nested OpenShift cluster up and running thanks to HyperShift, it's time to bring SAP Data Intelligence into the picture.
Step 1: Deploying SAP Data Intelligence
Once the nested OpenShift cluster is successfully provisioned, HyperShift provides you with access credentials that allow you to interact with and manage the guest cluster. With these credentials, you can install SAP Data Intelligence by following the instructions in the Red Hat Knowledge Base article, “Installing SAP Data Intelligence 3 on OpenShift Container Platform 4 supported by SDI Observer Operator”.
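As a sketch of how the credentials mentioned above might be retrieved, the `hypershift` CLI can write a kubeconfig for a guest cluster, which `oc` can then use. The function names and the file name in the example are hypothetical, and the cluster name kv-00 is inferred from the namespace used earlier in this post.

```shell
# guest_kubeconfig NAME FILE
# Writes a kubeconfig for the named guest cluster to FILE.
guest_kubeconfig() {
  hypershift create kubeconfig --name "$1" > "$2"
}

# guest_nodes FILE
# Lists the guest cluster's nodes using that kubeconfig.
guest_nodes() {
  oc --kubeconfig "$1" get nodes
}
```

For example, `guest_kubeconfig kv-00 kv-00-kubeconfig` followed by `guest_nodes kv-00-kubeconfig` confirms that the worker nodes have joined the guest cluster before the installation starts.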
Here is an example of the SAP Data Intelligence StatefulSets running on a guest cluster:
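A listing of this kind can be obtained from the guest cluster with a query like the one sketched below. The function name is hypothetical, and the default namespace `sdi` is an assumption; use whichever namespace you chose during the installation.

```shell
# sdi_statefulsets KUBECONFIG [NAMESPACE]
# Lists StatefulSets in the SAP Data Intelligence namespace of the guest
# cluster. The default namespace "sdi" is an assumption.
sdi_statefulsets() {
  oc --kubeconfig "$1" get statefulsets -n "${2:-sdi}"
}
```

For example, `sdi_statefulsets kv-00-kubeconfig` queries the guest cluster through a previously saved kubeconfig file.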
Step 2: Configuration and Testing
Configuration: Configure SAP Data Intelligence to connect with your data sources, repositories, and other external services. The detailed configuration instructions can be found in section 5.3 of the Red Hat Knowledge Base article “Installing SAP Data Intelligence 3 on OpenShift Container Platform 4 supported by SDI Observer Operator”.
Testing: Validate the integration by running data workflows, pipelines, and other tasks within the SAP Data Intelligence environment.
Here is an example of SAP Data Intelligence Modeler:
And another example of SAP Data Intelligence TensorFlow Serving Pipeline:
Conclusion
Together, SAP Data Intelligence, nested OpenShift, and HyperShift give organizations a practical way to build test environments for their data management and analysis processes. By following the procedures outlined above, you can use nested OpenShift clusters provisioned by HyperShift to create a versatile test environment for SAP Data Intelligence, in which your teams can test and validate the installation procedure more effectively. If you encounter any challenges during the setup process, don't hesitate to reach out to manjun_jiao.