Technology Blogs by SAP
MajidB
Product and Topic Expert

INTRODUCTION


 

SAP Data Hub provides an integration layer for data-driven processes across enterprise services to process and orchestrate data in the overall landscape. It offers an open, big-data-centric architecture with open-source integration, cloud deployments, and third-party interfaces. It also leverages massively distributed processing and serverless computing capabilities.

In this blog, I will describe how to install SAP Data Hub on Microsoft Azure. The following versions were used:

Azure Kubernetes Services (AKS): 1.10.9
SAP Data Hub: 2.3.174

 

CREATE THE KUBERNETES CLUSTER


 

1) Create a new AKS resource.



 

2) Fill in all the needed information, such as the cluster name, the Kubernetes version, etc.



Pick four Standard D8s v3 nodes (8 vCPUs and 32 GB of memory each). This sizing is the minimum requirement for the Data Hub installation (please refer to the Data Hub installation guide).

 

3) Authentication

Enable RBAC and provide an existing Service Principal Name (SPN).



 

4) Networking

Make sure that HTTP application routing is disabled, and select the Virtual Network (VNET) and the Subnet (assuming that the VNET and the subnet were created beforehand).



 

5) Monitoring

Leave monitoring set to ON.



 

6) Validation

Once everything is validated, click on Download a template for automation.



 

Then Deploy.



 

 

After re-entering the necessary inputs such as the resource name, you will need your SPN client ID and client secret.

Disable HTTP application routing, change the network plugin to kubenet, and increase the maximum number of pods per node to at least 50.

Finally, the last step is to Purchase.
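As an alternative to filling in the template in the portal, the same cluster can be created with the Azure CLI. The sketch below only composes and prints the command for review; all resource names, the subnet ID, and the SPN credentials are hypothetical placeholders, and the exact set of flags may vary with your az version.

```shell
# Sketch: compose an 'az aks create' call matching the settings above
# (kubenet plugin, max 50 pods per node, RBAC, 4x Standard_D8s_v3).
# All names and IDs below are placeholders -- replace them with your own.
RG="YourResourceGroup"
AKS_NAME="YourAKSCluster"
SUBNET_ID="/subscriptions/<subId>/resourceGroups/YourResourceGroup/providers/Microsoft.Network/virtualNetworks/YourVNET/subnets/YourSubnet"

CMD="az aks create --resource-group $RG --name $AKS_NAME \
  --kubernetes-version 1.10.9 \
  --node-count 4 --node-vm-size Standard_D8s_v3 \
  --enable-rbac \
  --network-plugin kubenet --vnet-subnet-id $SUBNET_ID \
  --max-pods 50 \
  --service-principal <appId> --client-secret <secret>"

# Print the command so it can be reviewed before executing it.
echo "$CMD"
```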



 

During the AKS deployment, Azure creates a new separate resource group (under the name MC_<name of the initial resource group>_<name of the AKS cluster>_<location>) with all the resources needed for the Kubernetes cluster.
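For illustration, the generated node resource group name follows exactly the pattern above; with hypothetical names it can be derived like this:

```shell
# Illustrate the MC_<resource group>_<cluster>_<location> naming pattern.
RG="DataHubRG"          # hypothetical initial resource group
AKS_NAME="DataHubAKS"   # hypothetical AKS cluster name
LOCATION="westeurope"   # hypothetical location
NODE_RG="MC_${RG}_${AKS_NAME}_${LOCATION}"
echo "$NODE_RG"   # MC_DataHubRG_DataHubAKS_westeurope
```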

In this step, we need to associate the subnet with the routing table.



 

Select the subnet that you've entered during the AKS deployment.
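The same association can be scripted. The hedged sketch below only prints the command; the route table created by AKS must be looked up in the MC_ resource group first, and every name shown is a placeholder.

```shell
# Sketch: attach the AKS-generated route table to the cluster subnet.
# Placeholders throughout -- find the real route table in the MC_ group,
# e.g. with: az network route-table list -g <MC_resource_group> -o table
ROUTE_TABLE_ID="/subscriptions/<subId>/resourceGroups/<MC_resource_group>/providers/Microsoft.Network/routeTables/<aksRouteTable>"
CMD="az network vnet subnet update \
  --resource-group YourResourceGroup --vnet-name YourVNET --name YourSubnet \
  --route-table $ROUTE_TABLE_ID"
echo "$CMD"   # review, then run
```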



 

 

CREATE THE AZURE CONTAINER REGISTRY


 

The SAP Data Hub installation requires a Docker registry. Azure provides a service for this named Azure Container Registry (ACR).



 

Be sure that the admin user is disabled.
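Creating the registry can also be done from the CLI. A minimal sketch with placeholder names, again composing the command for review (note --admin-enabled false, matching the requirement above):

```shell
# Sketch: create an Azure Container Registry with the admin user disabled.
# Registry and resource group names are placeholders.
CMD="az acr create --resource-group YourResourceGroup --name YourACR \
  --sku Standard --admin-enabled false"
echo "$CMD"   # review, then run
```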



 

STORAGE FOR THE VORA CHECKPOINT STORE


 

To enable SAP Vora database streaming tables, a checkpoint store needs to be enabled. The store is an object store; you can choose either ADLS or WASB storage.
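For the WASB option, a storage account and container can be prepared up front. A sketch with placeholder names that prints the commands for review; the account key listed by the last command is what the installer later asks for:

```shell
# Sketch: prepare a WASB checkpoint store (placeholder names throughout).
ACCOUNT="yoursdhstore"   # storage account name (must be globally unique)
CMD1="az storage account create --name $ACCOUNT \
  --resource-group YourResourceGroup --sku Standard_LRS"
CMD2="az storage container create --name sdh --account-name $ACCOUNT"
CMD3="az storage account keys list --account-name $ACCOUNT \
  --resource-group YourResourceGroup"
printf '%s\n' "$CMD1" "$CMD2" "$CMD3"   # review, then run
```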

 

JUMP HOST SETUP FOR SAP DATA HUB DEPLOYMENT AND INSTALLATION


 

It is recommended to install SAP Data Hub from an external jump host, from which we will run the installation. The hardware requirements for the jump host can be:

  • OS: Red Hat Enterprise Linux 7.5

  • CPU: 2 cores

  • Memory: 8 GB

  • Disk space: 100 GB (HDD)


You will need to provide an SSH key during the deployment in order to be able to log in using an SSH client tool like PuTTY.

Be sure to deploy the jump host in the same subnet as the AKS cluster, so that the AKS nodes are directly reachable.
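Deploying such a jump host can also be scripted. The sketch below prints an az vm create command with placeholder names; the RHEL 7.5 image URN and the VM size (2 vCPUs, 8 GB, matching the requirements above) are assumptions you should verify with az vm image list and az vm list-sizes.

```shell
# Sketch: create the jump host in the same VNET/subnet as the AKS cluster.
# Names, image URN, and size below are placeholders/assumptions.
CMD="az vm create --resource-group YourResourceGroup --name sdh-jumphost \
  --image RedHat:RHEL:7.5:latest --size Standard_D2s_v3 \
  --vnet-name YourVNET --subnet YourSubnet \
  --admin-username azureuser --ssh-key-value ~/.ssh/id_rsa.pub"
echo "$CMD"   # review, then run
```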

Once the jump host is set up, follow the instructions below.

 

1) Install the Azure command-line interface (Azure CLI).
rpm --import https://packages.microsoft.com/keys/microsoft.asc
sh -c 'echo -e "[azure-cli]\nname=Azure CLI\nbaseurl=https://packages.microsoft.com/yumrepos/azure-cli\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/azure-cli.repo'
yum install azure-cli
az login

 

2) Install docker
yum install docker
cd /usr/libexec/docker/
cp docker-runc-current /usr/bin/docker-runc
systemctl enable docker.service
systemctl start docker

 

3) Install the appropriate version of the kubectl
az aks install-cli --client-version 1.10.9

Get the AKS credentials
az aks get-credentials --resource-group YourResourceGroup --name AKSName

Check the nodes
kubectl get nodes -o wide

 

4) Enable the Kubernetes dashboard usage

Create a yaml file rbac-dashboard.yaml with the following
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

And install it with the following command
kubectl create -f rbac-dashboard.yaml

 

5) Helm and Tiller

Create the namespace (sdh), where the SAP Data Hub will be installed
kubectl create namespace sdh

Create the yaml file helm-sdh.yaml for the service account:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: sdh

And install it with the following command
kubectl create -f helm-sdh.yaml

Create the cluster role bindings for the service accounts tiller and default in the namespace sdh
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=sdh:tiller
kubectl create clusterrolebinding vora-cluster-rule --clusterrole=cluster-admin --serviceaccount=sdh:default

Download and unpack helm version 2.9.1
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.9.1-linux-amd64.tar.gz
tar -xvf helm-v2.9.1-linux-amd64.tar.gz
cp linux-amd64/helm /usr/bin/

Set the environment variables
export TILLER_NAMESPACE=sdh
export NAMESPACE=sdh

Helm initialization
helm init --service-account=tiller

Check if the tiller pod in the sdh namespace is running
kubectl get pods --namespace sdh | grep tiller

Check the helm readiness
helm ls

 

6) Login to Azure Container Registry (ACR)
az acr login -n YourACR

Set the environment variable
export DOCKER_REGISTRY=YourACR.azurecr.io

 

SAP DATA HUB INSTALLATION


 

The SAP Data Hub version used in this article is 2.3.174. Once you have downloaded it from the SAP Marketplace, upload it to the jump host previously created and set up, then unzip the file (it should be SAPDataHub-2.3.174-Foundation.zip).

Finally run the installer
./install.sh

The following inputs were given during the installation:
Please enter the SAN (Subject Alternative Name) for the certificate, which must match the fully qualified domain name (FQDN) of the Kubernetes node to be accessed externally: yourFQDNForSDHAccess
Please enter a username: YourUser
Do you want to use same system user password for YourUser user? (yes/no) yes
Do you want to configure security contexts for Hadoop/Kerberized Hadoop? (yes/no) no
Enable Vora checkpoint store? (yes/no) yes
Please provide the following parameters for Vora's checkpoint store
Please enter type of shared storage (s3/adl/wasb/gcs/webhdfs): wasb
Please enter WASB account name: ****************
Please enter WASB account key: ****************
Please enter WASB endpoint suffix (empty for default 'blob.core.windows.net'):
Please enter WASB endpoints protocol (empty for default 'https'):
Please enter connection timeout in seconds (empty for default 180):
Please enter WASB container and directory (in the form my-container/directory): sdh/
Do you want to validate the checkpoint store? (yes/no) yes

After a successful installation, you should get the following:
2018-11-09T18:37:21+0000 [INFO] Validating...
2018-11-09T18:37:21+0000 [INFO] Running validation for vora-cluster...OK!
2018-11-09T18:37:53+0000 [INFO] Running validation for vora-sparkonk8s...OK!
2018-11-09T18:38:51+0000 [INFO] Running validation for vora-vsystem...OK!
2018-11-09T18:38:57+0000 [INFO] Running validation for datahub-app-base-db...OK!
############ Ports for external connectivity ############
# vora-tx-coordinator-ext/tc port: 30852
# vora-tx-coordinator-ext/hana-wire port: 32564
# vora-textanalysis/textanalysis port: 31994
# vsystem/vsystem port: 32299
#########################################################
# You can find the generated X.509 keys/certificates under /mnt/resource/SAPDataHub-2.3.174-Foundation/logs/20181109_183430 for later use!
#########################################################
# Tenant created: "default"
# User: "YourUser"
# User for tx-coordinator: "default\YourUser"
#########################################################

Please note that the ports above are specific to my installation; SAP Data Hub assigns random ports.
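You can look the assigned ports up again at any time from the jump host. A small sketch that prints the command, assuming the sdh namespace and the service names used in this article:

```shell
# Sketch: list the externally exposed services and their random NodePorts.
CMD="kubectl -n sdh get services vsystem vora-tx-coordinator-ext -o wide"
echo "$CMD"   # review, then run on the jump host
```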

 

ACCESSING THE SAP DATA HUB APPLICATION


 

The easiest way to access SAP Data Hub is to assign a public IP from the Azure Portal to one of the AKS nodes. The node to which you assign the IP should be the same node specified during the installation.
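If you prefer the CLI over the portal, the public IP can be created in the MC_ resource group and attached to the node's network interface. A hedged sketch that prints the commands; the node resource group, NIC name, and IP-configuration name (ipconfig1) are placeholders/assumptions you must look up first (e.g. with az network nic list).

```shell
# Sketch: create a static public IP and attach it to one AKS node's NIC.
# NODE_RG, the NIC name, and the ipconfig name are placeholders/assumptions.
NODE_RG="MC_DataHubRG_DataHubAKS_westeurope"
CMD1="az network public-ip create -g $NODE_RG -n sdh-access-ip \
  --allocation-method Static"
CMD2="az network nic ip-config update -g $NODE_RG \
  --nic-name <nodeNicName> --name ipconfig1 \
  --public-ip-address sdh-access-ip"
printf '%s\n' "$CMD1" "$CMD2"   # review, then run
```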

Once this is done, you should be able to connect to your instance at yourFQDNForSDHAccess:port. In my case, as seen above, the port is 32299.



 

ENABLE THE SAP HANA WIRE FOR SAP HANA SMART DATA ACCESS (SDA)


 

1) SAP Data Hub setup

SAP Data Hub provides a hana-wire functionality that allows you to expose Vora tables via an SDA connection from a HANA database.

To expose the service in the network where the Kubernetes cluster runs, create a Kubernetes service of type LoadBalancer.

From the jump host, create a service of type LoadBalancer with the name vora-tx-coordinator-ext
kubectl -n $NAMESPACE expose service vora-tx-coordinator-ext --type LoadBalancer --name=vora-tx-coordinator-ext

Then, patch the service with the internal load balancer annotation
kubectl -n $NAMESPACE patch service vora-tx-coordinator-ext -p '{"metadata":{"annotations": {"service.beta.kubernetes.io/azure-load-balancer-internal":"true"}}}'

Run the following command to check the service
kubectl -n $NAMESPACE get service vora-tx-coordinator-ext -w

The hana-wire port is usually 3<instance number>15, so for the default SDH installation it's 30115.
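The pattern above is easy to verify: with the default instance number 01, the hana-wire port works out as follows (a trivial sketch):

```shell
# The hana-wire port follows the 3<instance number>15 pattern.
INSTANCE_NO="01"                  # default SAP Data Hub instance number
HANA_WIRE_PORT="3${INSTANCE_NO}15"
echo "$HANA_WIRE_PORT"   # 30115
```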

 

2) HANA SDA setup

In the HANA Studio, under Provisioning → Remote Sources, create a new remote source with the VORA (ODBC) adapter. Fill in the usual SDA information.



 

The following Extra Adapter Properties need to be added
IGNORETOPOLOGY=0;encrypt=true;sslValidateCertificate=false;sslCryptoProvider=commoncrypto;sslKeyStore=WhereYourPSEFileIsStored;sslTrustStore=norelevant;

 

3) Virtual table creation

The virtual tables can be created via the SQL command line or via the UI: go to Provisioning → Remote Sources → <source> → <user>, right-click the table, and choose Add as Virtual Table.



 

You can now access the SAP Data Hub Vora tables from your HANA studio.



 

Thanks for reading, hope it was useful.