Technology Blogs by SAP
Product and Topic Expert
One way to speed up the installation of SAP Data Hub is to have a reusable installation host on AWS to use for the installation and for maintenance tasks. I have assembled the various installation steps into this blog post as a guide for building the installation host.  Using the steps below, you should have an environment that passes all the preflight checks done by the SAP Data Hub installer.

The installation guide lists the Installation Host Prerequisites, but it spreads the individual steps across various links in its hierarchy, so I put them all together in one place.  Note that this is confirmed for 2.3.x and 2.4.x and should work for future releases of Data Hub, but always confirm with the current product documentation.

Checklist of software to install


  • Docker (minimum version 1.12.6) is installed and able to push to the internal registry.

  • Python 2.7 is installed

  • The Python YAML package (PyYAML) is installed.

  • The Kubernetes command-line tool, kubectl, is required, using one of the following versions:

    • 1.9.x

    • 1.10.x (greater or equal to 1.10.1)

    • 1.11.x

    kubectl must have access to the Kubernetes cluster

  • The Kubernetes package manager, Helm, is installed and properly configured, using one of the following versions:

    • 2.9.x - is the required version for AWS EKS

For AWS-specific installation, you'll also need

  • aws-iam-authenticator

  • aws-cli

Optional components (my personal recommendation and best practice)

  • unzip

  • screen for Linux

I'll explain the steps and provide commands for each of these items.  This will save you from going to each of the various websites and deciphering each installation individually.
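
Once you have worked through the sections below, a quick way to see what is still missing is to loop over the checklist and ask the shell whether each tool is on the PATH. This is only a sketch of my own: has_tool and check_tools are hypothetical helper names, not part of the SAP Data Hub installer, and the check only proves that a binary exists, not that its version is supported.

```shell
#!/bin/sh
# has_tool NAME: succeed if NAME resolves to a command on the PATH.
# (hypothetical helper, not part of any SAP tooling)
has_tool() {
    command -v "$1" >/dev/null 2>&1
}

# check_tools NAME...: print one status line per tool and
# return non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if has_tool "$tool"; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The tools from the checklist above, by their command names.
check_tools docker python kubectl helm aws-iam-authenticator aws unzip screen \
    || echo "Install the missing tools before starting the installer."
```

PyYAML is a Python package rather than a command, so it is not covered by this loop.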

First, provision an EC2 instance as your installation host/jumpbox

For this I will assume you have an AWS account and have appropriate permissions to create instances.

Log in to the Amazon Console and navigate to EC2.   Make sure to create the EC2 instance in the same AWS region where you will install SAP Data Hub (to limit the cross-region networking costs).

Check that the correct region is selected, then click the "Launch Instance" button to start a new EC2 instance.

The next step is to select the Amazon Machine Image (AMI) to Launch.  I chose Ubuntu server 18.04 64-bit because the software and package installers are easy to use.

Next, select the EC2 instance type for your jumpbox.  I chose t3.xlarge because it has 4 vCPUs, 16 GB of RAM, and up to 5 Gigabit networking, which helps in the Docker image mirroring phase of the Data Hub installation. You can choose a smaller instance type if you want to reduce costs.

Next, configure your instance details.  The key change to make is the storage size of the jumpbox volume. The default storage on my instance was 8 GB, which is not enough space to download the installation files and Docker images to the local machine.

The installation guide recommends "It has at least 10GB free disk space for SAP Data Hub installation folders and files, and has at least 20GB free disk space for used container images."

My suggestion is to provision 100-200 GB to give yourself plenty of room for multiple installations and future upgrades.
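
After the instance is up, you can confirm that the volume really has the room the guide asks for. The sketch below adds the two figures from the quote (10 GB plus 20 GB) and compares the total against the free space on the root filesystem. It assumes both the installation files and Docker's image storage live on /, which is the default layout on a single-volume instance; free_gb is a hypothetical helper name.

```shell
#!/bin/sh
# free_gb PATH: print the free space, in whole gigabytes, of the
# filesystem that holds PATH.
free_gb() {
    # -P forces one output line per filesystem, -k reports 1024-byte blocks;
    # column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# 10 GB for installation folders and files + 20 GB for container images.
required_gb=30
available_gb=$(free_gb /)

if [ "$available_gb" -lt "$required_gb" ]; then
    echo "Only ${available_gb} GB free on /; the guide asks for at least ${required_gb} GB."
else
    echo "${available_gb} GB free on /."
fi
```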

Optional - It is a good idea to add tags to the instance for reference later.

Configure the security groups and make sure the keypair associated with the instance is available on your local machine so you can log in.

Once the instance is available, note its IP address and log in using SSH from your terminal program with your keypair file.  The IP address is located in the lower panel of the instances view of the AWS console.


AWS machine images typically use ec2-user as the system user, but Ubuntu images use "ubuntu".   Log in from the terminal with the following:

ssh -i <keyname.pem> ubuntu@<ip address>

Note: I am on a MacBook, so I can use the pem key format.  If you're on Windows, you'll use PuTTY and have to convert the .pem keyfile to a .ppk.   There are lots of tutorials on AWS and StackOverflow that cover this topic.


Now we can install the software on the installation host

The first thing to do is assume root by typing:
sudo su

Install Docker


Run the following commands to install Docker (docker.io is the name of the Docker package in the Ubuntu repositories):
sudo apt-get update

sudo apt install docker.io

To confirm docker has installed, run:
docker --version

Install Python 2.7

Run the following command to install Python 2.7
sudo apt install python2.7 python-pip

To confirm Python 2.7 has installed check the version:
python --version

Install PyYAML

Run the following command to install PyYAML (Python has to be installed first)
sudo pip2 install pyyaml

Install kubectl


The documentation shows the following commands to install kubectl:
sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -s | sudo apt-key add -
echo "deb kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl

Confirm the installed version with
kubectl version

Unfortunately, this installs the latest kubectl (1.14.1), which is not supported.  We have to roll it back to a supported version. After some research I found that 1.11.5 works. Here's how to downgrade it:
sudo apt-get install -qy kubelet=1.11.5-00 kubectl=1.11.5-00 kubernetes-cni=0.6.0-00 --allow-downgrades

Verify the downgraded version
kubectl version
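
If you rebuild the host often, it can help to script the version check against the supported list from the checklist (1.9.x, 1.10.x from 1.10.1, and 1.11.x). The following is a minimal sketch: kubectl_supported is a hypothetical helper of mine, and it assumes the client prints a line such as "Client Version: v1.11.5" when called with --client --short, which is what the 1.11 client does.

```shell
#!/bin/sh
# kubectl_supported VERSION: succeed if VERSION (with or without a
# leading "v") falls in one of the ranges from the checklist.
kubectl_supported() {
    case "${1#v}" in
        1.9.*)  return 0 ;;
        1.10.0) return 1 ;;   # 1.10.x must be 1.10.1 or later
        1.10.*) return 0 ;;
        1.11.*) return 0 ;;
        *)      return 1 ;;
    esac
}

if command -v kubectl >/dev/null 2>&1; then
    # --client avoids contacting the cluster, so this works before
    # the kubeconfig for the cluster is in place.
    client=$(kubectl version --client --short 2>/dev/null \
        | awk '/^Client Version/ { print $3 }')
    if kubectl_supported "$client"; then
        echo "kubectl $client is supported"
    else
        echo "kubectl '$client' is not in the supported range"
    fi
else
    echo "kubectl not found on PATH"
fi

# To keep a later "apt-get upgrade" from undoing the downgrade:
#   sudo apt-mark hold kubelet kubectl kubernetes-cni
```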

Install helm

For installation on AWS EKS, only Helm 2.9.x is supported.

Helm Source:

I create a downloads folder on my machine and dump all my software downloads in it.  Just change directories into that folder and run the following commands:

Unpack the tar.gz
tar -zxvf helm-v2.9.1-linux-amd64.tar.gz

Find the helm binary in the unpacked directory and copy it to a directory on your PATH
cd linux-amd64
cp helm /usr/local/bin/helm

Verify the helm version
helm version
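
One thing to be aware of: with Helm 2, a plain "helm version" also tries to reach Tiller in the cluster, so it can complain or wait before the cluster is configured. Adding --client limits the output to the local binary. The sketch below pulls the SemVer out of that output and checks it against 2.9.x; helm_client_semver is a hypothetical helper, and the parsing assumes the Helm 2 output format (a line containing SemVer:"v2.9.1").

```shell
#!/bin/sh
# helm_client_semver TEXT: extract the version number from the text
# printed by "helm version --client" (Helm 2 output format).
helm_client_semver() {
    printf '%s\n' "$1" \
        | sed -n 's/.*SemVer:"v\{0,1\}\([0-9][0-9.]*\)".*/\1/p'
}

if command -v helm >/dev/null 2>&1; then
    # --client skips the Tiller lookup, which needs a reachable cluster.
    semver=$(helm_client_semver "$(helm version --client 2>/dev/null)")
    case "$semver" in
        2.9.*) echo "helm $semver is a 2.9.x release" ;;
        *)     echo "helm '$semver' is not a 2.9.x release" ;;
    esac
else
    echo "helm not found on PATH"
fi
```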

AWS specific installations:

Install aws-iam-authenticator


The aws-iam-authenticator allows your installation host to talk to the EKS cluster through kubectl (which we installed earlier)

I'm still in the downloads folder.  Run the following commands to download the binary and make it executable.
curl -o aws-iam-authenticator
chmod +x ./aws-iam-authenticator

Copy the aws-iam-authenticator executable to your root user's path:
cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$HOME/bin:$PATH

Export the path to your profile so it loads every time you login
echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc

If the root user doesn't have a bin folder under the $HOME directory, create it with mkdir -p $HOME/bin and run the commands again.
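
If you run these steps more than once on the same host, the echo above appends the same export line to ~/.bashrc each time. A small guard avoids the duplicates. This is a sketch under my own naming (add_line_once is a hypothetical helper); it appends the line only when an identical line is not already in the file.

```shell
#!/bin/sh
# add_line_once FILE LINE: append LINE to FILE unless FILE already
# contains an identical line.
add_line_once() {
    # -q: no output, -x: match the whole line, -F: fixed string, no regex
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage on the installation host:
#   mkdir -p "$HOME/bin"
#   add_line_once "$HOME/.bashrc" 'export PATH=$HOME/bin:$PATH'
```

The single quotes in the usage lines matter: they keep $HOME and $PATH unexpanded so that the shell evaluates them at each login.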


Install AWS Command Line Interface (awscli)

You'll need the AWS CLI to connect to your AWS instances and the kubernetes cluster.  The installation is a simple command:
sudo apt install awscli

Check the version:
aws --version

Unfortunately, the installed version (1.14.x) is too old for us.  The minimum required version is 1.16.73.

To upgrade the awscli, we need to install pip3, which works with Python 3.  The Ubuntu image we chose already has Python 3 installed.  To confirm, type:
python3 --version

To install pip3, just run:
sudo apt install python3-pip

Now we can update the awscli
pip3 install awscli --upgrade --user

Confirm the version again:
aws --version

We now have version 1.16.147 which is higher than the minimum version 1.16.73.
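
Comparing version numbers by eye gets error-prone (1.16.147 is newer than 1.16.73, even though a plain string comparison says otherwise). Here is a minimal sketch that does the comparison with sort -V, the version sort in GNU coreutils that ships with Ubuntu. version_ge is a hypothetical helper name, and the parsing assumes the "aws-cli/<version> ..." banner that aws --version prints.

```shell
#!/bin/sh
# version_ge A B: succeed if version A is greater than or equal to B.
version_ge() {
    # sort -V orders version strings component by component, so the
    # first line of the sorted pair is the smaller version.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

minimum=1.16.73

if command -v aws >/dev/null 2>&1; then
    # The banner looks like "aws-cli/1.16.147 Python/3.x ...";
    # some builds print it on stderr, hence 2>&1.
    installed=$(aws --version 2>&1 | sed -n 's|^aws-cli/\([^ ]*\).*|\1|p')
    if version_ge "$installed" "$minimum"; then
        echo "awscli $installed meets the minimum $minimum"
    else
        echo "awscli '$installed' is older than $minimum"
    fi
else
    echo "aws not found on PATH"
fi
```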


Optional: additional tools for installing Data Hub 2.x

Now that we have the basics, you'll need to install unzip to unzip the Data Hub installation files.

Install unzip

Installing unzip is easy, just run:
sudo apt install unzip


Install screen

I also install screen for the Data Hub installation because it lets you detach from a terminal session and let the Docker image mirroring stage of the installation continue without worrying about losing your connection. For more information, reference this handy guide:

To install it run:
sudo apt install screen
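
Before kicking off a long-running step, it is worth checking that you really are inside a screen session. GNU screen exports the STY variable (the session name) to everything it starts, so a one-line test is enough. This is just a sketch: in_screen is a hypothetical helper, and the session name datahub in the hint is an arbitrary example.

```shell
#!/bin/sh
# in_screen: succeed when the current shell runs inside a screen session.
# GNU screen sets STY to the session name for the processes it starts.
in_screen() {
    [ -n "${STY:-}" ]
}

if in_screen; then
    echo "Inside screen session $STY."
    echo "Detach with Ctrl-a d and reattach later with: screen -r"
else
    echo "Not inside screen. Start a named session with: screen -S datahub"
fi
```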



Now you are prepared to run your Data Hub installation from your installation host.  With the tooling we've installed, you should be ready to pass all the preflight checks with ease.   I've used these steps on multiple implementations, and I hope that this blog will save you time as you prepare for your own Data Hub installation.