architectSAP, Active Contributor
Based on my last SAP Data Hub blogs, I have a landscape in place to schedule a more complex Data Pipeline:

My Data Pipeline looks as follows:



In this Data Pipeline, I copy the test.csv file from my previous blog Create your first SAP Data Hub Task Workflow from the Hadoop cluster where my SAP Data Hub Adapter resides to the Hadoop cluster of the SAP Data Hub, developer edition, and load it in parallel into the Vora table that I created in Validate the SAP Vora Installation. You can find the JSON file at the end of this blog.
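Once the pipeline has run, a quick way to double-check that the copy actually arrived on the target Hadoop cluster is to query WebHDFS directly. Here is a minimal Python sketch; the WebHDFS port 50070 is the Hadoop default and, like the target path /tmp/copy.csv, an assumption based on the pipeline configuration below, so adjust both to your landscape:

import requests

# WebHDFS endpoint of the SAP Data Hub, developer edition Hadoop cluster.
# The host comes from the HDFS Producer configuration; port 50070 is the
# Hadoop default for WebHDFS and an assumption about this landscape.
NAMENODE = "http://linux-u7wu:50070"
PATH = "/tmp/copy.csv"

def file_status(path):
    # Return the WebHDFS FileStatus for a path, or None if it does not exist.
    response = requests.get(f"{NAMENODE}/webhdfs/v1{path}", params={"op": "GETFILESTATUS"})
    if response.status_code == 404:
        return None
    response.raise_for_status()
    return response.json()["FileStatus"]

status = file_status(PATH)
if status is None:
    print(f"{PATH} not found - the Data Pipeline has probably not run yet")
else:
    print(f"{PATH}: {status['length']} bytes, modification time {status['modificationTime']}")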

Subsequently, I add this Data Pipeline as a Task to my Backup Task Workflow:



Finally, I add my Backup Task Workflow into a Schedule to run Every Hour:



As a result, I get 1 Active Schedule, and eventually my scheduled Task Workflow starts running with its embedded Data Pipeline:



The Data Pipeline completes first:



With the Data Pipeline completed, the Task Workflow finishes as well:



To check one of the results, I Preview my Vora table:
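Besides the Preview in the modelling tools, you could also query the table programmatically, for example via ODBC. The following is a minimal Python sketch using pyodbc; the data source name VORA is purely illustrative and assumes the SAP Vora ODBC driver is installed and configured to point at the same transaction coordinator as the dsn in the pipeline below:

import pyodbc

# Connect through an ODBC data source configured for SAP Vora.
# The DSN name "VORA" is an assumption for this example.
connection = pyodbc.connect("DSN=VORA")
cursor = connection.cursor()

# TABLE001 is the table the SAP Vora HdfsLoader operator loads into.
cursor.execute("SELECT * FROM TABLE001")
for row in cursor.fetchmany(10):
    print(row)

connection.close()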



Of course, I would usually not watch this but rather let my data processes run and monitor them from my Monitoring Dashboard:



I hope this gives you an idea of how powerful the SAP Data Hub is at managing big data flows.

By the way, this is my Data Pipeline in JSON format:
{
  "description": "",
  "processes": {
    "hdfsconsumer1": {
      "component": "com.sap.storage.hdfs.consumer",
      "metadata": {
        "label": "HDFS Consumer",
        "x": 16,
        "y": 72,
        "height": 80,
        "width": 120,
        "config": {
          "hadoopNamenode": "linux-p2i7:8020",
          "path": "/tmp/test.csv"
        }
      }
    },
    "hdfsproducer1": {
      "component": "com.sap.storage.hdfs.producer",
      "metadata": {
        "label": "HDFS Producer",
        "x": 199.99999904632568,
        "y": 132,
        "height": 80,
        "width": 120,
        "config": {
          "hadoopNamenode": "linux-u7wu:9000",
          "path": "/tmp/copy.csv",
          "append": false
        }
      }
    },
    "sapvorahdfsloader1": {
      "component": "com.sap.vora.hdfsLoader",
      "metadata": {
        "label": "SAP Vora HdfsLoader",
        "x": 199.99999904632568,
        "y": 12,
        "height": 80,
        "width": 120,
        "config": {
          "dsn": "v2://linux-u7wu:2202/?binary=true",
          "hadoopNamenode": "linux-p2i7:8020",
          "initStatements": "",
          "tableName": "TABLE001"
        }
      }
    },
    "synchronizer1": {
      "component": "experimental.util.synchronizer",
      "metadata": {
        "label": "Synchronizer",
        "x": 383.99999809265137,
        "y": 72,
        "height": 80,
        "width": 120,
        "config": {}
      }
    },
    "graphterminator1": {
      "component": "com.sap.util.graphTerminator",
      "metadata": {
        "label": "Graph Terminator",
        "x": 567.999997138977,
        "y": 72,
        "height": 80,
        "width": 120,
        "config": {}
      }
    }
  },
  "groups": [],
  "connections": [
    {
      "metadata": {
        "points": "140,121 167.99999952316284,121 167.99999952316284,172 195.99999904632568,172"
      },
      "src": {
        "port": "outFile",
        "process": "hdfsconsumer1"
      },
      "tgt": {
        "port": "inFile",
        "process": "hdfsproducer1"
      }
    },
    {
      "metadata": {
        "points": "140,103 167.99999952316284,103 167.99999952316284,52 195.99999904632568,52"
      },
      "src": {
        "port": "outFilename",
        "process": "hdfsconsumer1"
      },
      "tgt": {
        "port": "inhdfsfilename",
        "process": "sapvorahdfsloader1"
      }
    },
    {
      "metadata": {
        "points": "323.9999990463257,52 351.9999985694885,52 351.9999985694885,103 379.99999809265137,103"
      },
      "src": {
        "port": "outresult",
        "process": "sapvorahdfsloader1"
      },
      "tgt": {
        "port": "in1",
        "process": "synchronizer1"
      }
    },
    {
      "metadata": {
        "points": "323.9999990463257,172 351.9999985694885,172 351.9999985694885,121 379.99999809265137,121"
      },
      "src": {
        "port": "outFilename",
        "process": "hdfsproducer1"
      },
      "tgt": {
        "port": "in2",
        "process": "synchronizer1"
      }
    },
    {
      "metadata": {
        "points": "507.99999809265137,103 535.9999976158142,103 535.9999976158142,112 563.999997138977,112"
      },
      "src": {
        "port": "out1",
        "process": "synchronizer1"
      },
      "tgt": {
        "port": "stop",
        "process": "graphterminator1"
      }
    }
  ],
  "inports": {},
  "outports": {},
  "properties": {}
}
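If you save this JSON as, for example, pipeline.json (the file name is just an illustration), a few lines of Python are enough to list the operators and how they are wired, which is handy when comparing pipeline versions:

import json

# Load the Data Pipeline definition exported above.
with open("pipeline.json") as file:
    graph = json.load(file)

# List the operators (processes) and their component types.
for name, process in graph["processes"].items():
    print(f"{name}: {process['component']}")

# List how the operators are connected, port to port.
for connection in graph["connections"]:
    src, tgt = connection["src"], connection["tgt"]
    print(f"{src['process']}.{src['port']} -> {tgt['process']}.{tgt['port']}")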