With the advent of the SAP Data Hub, developer edition, I was wondering whether I could connect it to my SAP Data Hub Cockpit to execute and monitor my first Data Pipeline from there.
To start with, I log into my SAP Data Hub, developer edition to determine the port numbers of its VORA Transaction Coordinator and Vora Tools installation:
tail /var/log/vora/vflow.log
Then I forward these two ports to my docker container:
docker run -ti --publish 2202:2202 --publish 2204:2204 --publish 8090:8090 --publish 9099:9099 --publish 9225:9225 --publish 50070:50070 --name datahub --hostname datahub --network dev-net datahub run --agree-to-sap-license --hdfs --zeppelin
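Since the list of published ports is getting long, it can help to keep the ports in one variable and generate the flags from it. The following is only a sketch: it assumes that every port maps to the same number on the host and that the last option is meant to be --zeppelin, and it prints the command rather than running it. The variable names are my own.

```shell
#!/bin/sh
# Ports copied from the docker run command above.
ports="2202 2204 8090 9099 9225 50070"

# Build one --publish flag per port, mapping each port to itself on the host.
publish_flags=""
for p in $ports; do
  publish_flags="$publish_flags --publish $p:$p"
done
publish_flags="${publish_flags# }"   # drop the leading space

# Print the resulting command instead of executing it.
docker_cmd="docker run -ti $publish_flags --name datahub --hostname datahub --network dev-net datahub run --agree-to-sap-license --hdfs --zeppelin"
echo "$docker_cmd"
```

Once the container is running, docker port datahub lists the mappings that are actually in place.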
With this, I reconfigure my SAP Data Hub Adapter to point to my SAP Data Hub, developer edition and create a VORA Data Pipeline Connection in my SAP Data Hub Cockpit. With the current version of the SAP Data Hub, I do not seem to be able to use the DEFAULT Connection Configuration. I therefore choose MANUAL and enter the IP address and port number of my VORA vFlow API:
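To double-check the port number before typing it into the MANUAL configuration, docker port can report the host address a container port is published on. A minimal sketch, assuming the container is named datahub as above: the helper name host_port_from_mapping is my own, and 8090 is only an example value, since the port that actually belongs to the vFlow API has to be taken from the log.

```shell
#!/bin/sh
# Hypothetical helper: extract the host port from one line of `docker port`
# output such as "0.0.0.0:8090" (everything after the last colon).
host_port_from_mapping() {
  mapping="$1"
  echo "${mapping##*:}"
}

# Example with a literal mapping string; against the running container this would be:
#   host_port_from_mapping "$(docker port datahub 8090 | head -n 1)"
example_port="$(host_port_from_mapping "0.0.0.0:8090")"
echo "$example_port"
```

Taking everything after the last colon keeps the helper working for IPv6-style mappings as well, where the address part itself contains colons.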
Based on this connection, I create a new Project in my SAP Data Hub Modelling tool:
Next, I add my Data Pipeline from the SAP Data Hub, developer edition tutorial to this Project as a Task:
To verify that I got the right Data Pipeline, I check its Graph. The visualisation is different from the SAP Data Hub – Data Pipelines modeller, but I still recognise my two Operators:
Subsequently, I integrate my Task into a new Task Workflow:
Finally, I execute my Task Workflow:
In my SAP Data Hub Dashboard, I can see that this triggered both my Task Workflow and the embedded Data Pipeline. The Task Workflow that already shows as successfully Finished is from my previous blog, Create your first SAP Data Hub Task Workflow:
To verify that my Data Pipeline is in fact running as expected, I Open the UI of my Terminal Operator:
Since my Data Pipeline would run forever without intervention, I Stop it in the SAP Data Hub – Data Pipelines modeller:
As a result, I see that both my Task Workflow and my Data Pipeline have finished. My Data Pipeline is shown as successfully Completed, whereas my Task Workflow is considered Finished with error because I manually stopped the embedded Data Pipeline:
The details of this Error can be displayed in my SAP Data Hub Modelling tool. In this case, they show that my embedded Data Pipeline was aborted:
I hope this simple example provides you with a glimpse of the power that the integration between the different SAP Data Hub engines and tools provides.