Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
The SAP Data Intelligence Modeler uses a flow-based programming paradigm to create data processing pipelines (also known as graphs).

These pipelines are created through a series of operators, connected in sequence. However, the way your pipeline is modeled affects its performance. In this blog post, we're going to cover some very basic modeling principles you may want to keep in mind when creating your pipelines.


For the purposes of this example, we're working with a very simplified pipeline.


Our basic pipeline


Our Data Generator here simulates output from an IoT device, sending values for temperature, CO2, humidity, and so on. These values are then run through a Multiplexer, which sends them to two custom Go (Golang) operators.
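To make the data concrete, here is a minimal sketch of the kind of comma-separated message such a generator might emit. The field names and their order are assumptions for illustration (chosen so that temperature sits at index 2 and CO2 at index 4, matching the operator code later in this post); the real Data Generator's format may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// buildReading assembles one comma-separated sensor reading.
// Assumed field order: sensor name, counter, temperature, humidity, CO2.
func buildReading(id int, sensor string, temperature, humidity, co2 float64) string {
	return sensor + "," +
		strconv.Itoa(id) + "," +
		strconv.FormatFloat(temperature, 'f', 1, 64) + "," +
		strconv.FormatFloat(humidity, 'f', 1, 64) + "," +
		strconv.FormatFloat(co2, 'f', 1, 64)
}

func main() {
	// One simulated reading, as a single string message.
	fmt.Println(buildReading(1, "sensor0", 21.5, 40.0, 415.0))
}
```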


The Wiretap operator lets us view the values as they're passed through.


Our Generated data


From there, the operators pull out the value they're concerned with (in this case, temperature and CO2) and pass it to a Terminal Output. In a real-life scenario, some action would be taken based on the values; for simplicity in this example, however, we're using Terminals to monitor the values.


If you would like to follow along with this blog post, you can find both Before and After pipelines in this repository. When you create a new pipeline in the Data Intelligence Modeler, you can switch between Diagram and JSON view (as shown below) and copy the contents of the pipeline JSON from the repository.


Switch to JSON view to maintain pipeline with code


Switching back to the Diagram editor, the first thing we're going to change is to simplify our pipeline by removing the Multiplexer. As the code inside our two Go Operators is mostly identical, we're going to use Add Port to create a second Output Port on one of them. This means we no longer need the Multiplexer, and since the data only has to be processed once, we can also get rid of our second Go Operator.

Right-click on our Go Operator, then select Add Port.


From here, we have to define our new port. Enter a name (in this case, CO2), then make sure you select Output Port. Next, we have to define the Port Type. If we were sending just the numeric values, we might choose float64. However, in this case the values are accompanied by text, so we're using the string type.
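The trade-off behind that choice can be sketched in a few lines: a float64-typed port would require the raw field to parse cleanly as a number, whereas our operator sends whole sentences. This is an illustrative sketch (the `toFloat` helper is not part of the pipeline), not the Modeler's own type-checking logic.

```go
package main

import (
	"fmt"
	"strconv"
)

// toFloat shows what sending on a float64-typed port would demand:
// the payload must be a pure number, with no surrounding text.
func toFloat(field string) (float64, error) {
	return strconv.ParseFloat(field, 64)
}

func main() {
	// A bare numeric field parses fine...
	v, err := toFloat("415.0")
	fmt.Println(v, err)

	// ...but a sentence like the one our operator emits does not,
	// which is why a string port is the simpler fit here.
	_, err = toFloat("The CO2 level is 415.0")
	fmt.Println(err)
}
```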


Add CO2 Output Port


Next, we want to delete the Multiplexer and our extra Go Operator. Then connect the Output Port of our Wiretap directly to the remaining Go Operator, and connect the new CO2 Output Port to our second Terminal. Finally, press the auto-layout button to clean up the layout.


A simplified pipeline


Next, we'll need to make the code changes to our Go Operator (renamed here for clarity). First, select it, then click on the Script button to access the underlying code.

Click on the Script button to edit


You'll want to add the two lines that deal with our CO2 Output Port, marked below with "ADD".

package main

import "strings"

var Temperature func(interface{})
var CO2 func(interface{}) //ADD

func main() {}

func Input(val interface{}) {
	values := strings.Split(val.(string), ",")
	Temperature("The temperature is " + values[2]) //Sends only Temperature
	CO2("The CO2 level is " + values[4]) //Sends only CO2 | ADD
}


Now we can check that both values are output. Save and run your pipeline, then use the Open UI button to check the output on each Terminal.
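If you'd like to sanity-check the operator logic outside Data Intelligence first, you can stub out the port senders locally. The sample message and the stand-in stubs below are assumptions for illustration; in the real pipeline, the `Temperature` and `CO2` function variables are wired up by the Modeler to the operator's Output Ports.

```go
package main

import (
	"fmt"
	"strings"
)

// Stand-ins for the port senders that Data Intelligence would inject.
var Temperature func(interface{})
var CO2 func(interface{})

// Input mirrors the operator script from the blog post.
func Input(val interface{}) {
	values := strings.Split(val.(string), ",")
	Temperature("The temperature is " + values[2])
	CO2("The CO2 level is " + values[4])
}

func main() {
	// Capture what each port would receive instead of sending it on.
	Temperature = func(v interface{}) { fmt.Println("Temperature port:", v) }
	CO2 = func(v interface{}) { fmt.Println("CO2 port:", v) }

	// Feed one assumed comma-separated reading through the operator.
	Input("sensor0,1,21.5,40.0,415.0")
}
```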



Check Terminal Output


The readings are coming through


We've now verified that our simplified pipeline is working as expected, so we can get rid of our Wiretap and connect the Data Generator directly to the Go Operator.


Our final simplified pipeline


By removing the Multiplexer and instead adding an Output Port to our Go Operator, we've managed to reduce the complexity of our pipeline and avoid code duplication. Again, if you would like to follow this blog yourself, Before and After pipelines are available in this repository. Special thanks go to my colleagues bengt and wei.han7 for their assistance with Data Intelligence Cloud.


Of course, there are many more things to keep in mind when optimizing your pipelines, and I plan to share more with you in the future. I hope this blog post has been useful, and I welcome any comments or questions below.