Technology Blogs by Members
Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!
Showing results for 
Search instead for 
Did you mean: 
Active Contributor

Oil always works. Data doesn’t unless it is refined.

Usually, when we are on wheels, we keep an eye on the rear-view mirror and analyzing the vehicle behind is maintaining a distance or no cop car passing or following arbitrarily! Here these rear-view and side-view mirrors provide us information gathering in real-time for what happens in our back and sides. With this, we can adjust how we do things moving forward like changing lanes or keeping distance. Right!

Today, this form of business intelligence data gathering is simply too slow for any business house. What we need is the ability to find those patterns in our historical information and use it to forecast what will happen and we need to be able to do that in a faster way! Hence we need to pick up an 'Autonomous driving features' with Machine Learning capabilities for our real-time data analysis using highly accurate machine learning models. Perhaps this is one of the biggest efforts for Data and ML professionals who need to train these ML models to walk, run, and analyze with accuracy.

Let's see how we can achieve this in the fastest and easiest way...

Image source: Gifer

A few weeks ago, I described the ADF and Snowflake dataflow process. Today I would like to present a sample use case on "SAP sales and demand data forecasting using AutoML on Microsoft Power BI to visualize". A quick introduction is required on this topic before we jump to configuration straight!


Reiterating my opening statement - This blog is to demonstrate SAP BW Sales & Demand data insights or S/4HANA Line of Business[LoB] leveraging Azure code-free automated ML-powered by Azure Machine Learning on Power BI platform. Also, I am using a similar SAP table/dataset from my previous blog on Azure Data Factory and Snowflake.

The idea is to provide the workspace and platform where Data professionals and ML professionals can collaborate together to achieve the results. Data Scientists, data analysts, and Business analysts can independently work on AutoML that enables them to build machine learning models with clicks, not code, using just their Power BI skills. I got super excited after reading about Azure AutoML during the last few months and had a discussion with my data science evangelists, Data Engineers/Pros, and aspiring ML Engineers, this platform definitely gives them a free hand to play around with any dataset of choice coming from 'any' source whether its SAP, salesforce, snowflake, Oracle, etc., simply have to establish the connection link using "Thor's Bifrost" (Transportation Bridge). By the way, we are not inviting Thor or any demigods except data here to run predictions.

We know that Power BI offers a simple and powerful ETL tool that enables analysts to prepare data for further analytics. You invest significant effort in data cleansing and preparation, creating datasets that can be used across your organization. Now bring AutoML features is a game-changer from my perspective since it enables you to leverage the dataset for building machine learning models directly in Power BI without jumping or connecting on other ML platforms. The full lifecycle for creation, hosting, and deployment of the ML models is managed by Power BI, without any additional dependencies.

> One-stop-shop for all Data professionals!

A big question came to my mind while reading AutoML documentation, earlier -
"Will AutoML replace data scientists or ML Engineers?"
The answer was straight No!
While AutoMLs are good at building models, they are still not capable of doing most of a data scientist's job but offload some of the manual work and speed up the overall process and this is what today's business required to adopt.

Let's see the motivation, advantages, uses, and comparison of various tools on AutoML further.

Motivation for AutoML:

  • AutoML is filling the gap between “supply” and “demand” in the Data Science market. More companies nowadays either start collecting data, or they want to realize the potential of collected data: they want to get a value from it. On the other hand, there are no too many Data Scientists with a propper background to fulfill the demand, so the gap arises. AutoML could potentially fill this gap. AutoML will save the time of the Data Science team and AutoML outperforms the average Data Scientist.

Advantages of AutoML:

  • Implement ML solutions without extensive programming and platform knowledge.

  • Definitely saves a lot of time and resources

  • Leverage data science best practices available esp. choosing the ML models

  • Provides agile problem-solving

  • Power of ML accessible to everybody


When to use AutoML:
Classify, Regression, Forecasting, Clustering, Recommendation, Anomaly detection, Ranking

  • AutoML is a productivity-enhancing capability aimed at automating the ML learning code. This frees up time to focus on the broader aspects of the end-end ML application. That makes it a productivity tool for Data Scientists – not their replacement

  • Data scientists need an end-end platform to aid in the development of real-world ML solutions. This platform needs the capability to manage the lifecycle of ML models and should allow easy integration of AutoML productivity-enhancing tooling as well.

Machine learning as a service (MLaaS)
is an umbrella definition of various cloud-based platforms that cover most infrastructure issues such as data pre-processing, model training, and model evaluation, with further prediction. Prediction results can be bridged with your internal IT infrastructure through REST APIs.


MLaaS Comparsion:
This usually helps Data professionals and ML Engineers before they adopt any tool and gives them a detailed awareness of the tool's capabilities while picking up the right ML model and solution for their business needs.






ML Models

Azure ML Studio

Amazon AutoML

Cloud AutoML

Watson ML

  • Classification

Yes Yes Yes Yes

  • Regression

Yes Yes Yes Yes

  • Clustering

Yes Yes

  • Recommendation


  • Ranking

Yes Yes

  • Anomaly detection


  • Supported Framework

TensorFlow, Sci-kit-learn, MS cognitive toolkit, Spark ML TensorFlow, MXNet, Keras, Gluon, Pytorch, Caffe2, Chainer, Torch TensorFlow, Sci-kit-learn, XGBoost, Keras TensorFlow, Spark, Sci-kit-learn, XGBoost, IBM SPSS, PMML


source: Forbes whitepaper


Real-World Applications Providers of AutoML:

These tools are essential but strategy and the business requirement are vital.


Pros and Cons of AutoML Providers:

So far we understood the definition and types of AutoML providers market, let's see the pros and cons as well!


  • Low development and maintenance cost for running

  • No-code at all

  • Ease and accessibility of use

  • Cloud-enabled, low resource consumption

  • Implement ML solutions without extensive programming knowledge

  • Leverage data science best practices

  • Provide agile problem-solving

  • Featurization is available by default in AutoML


  • Non-optimal performance with flaws in datasets

  • Not suitable for very large & complex datasets or issues in datasets

  • Not a preferred solution for too small datasets.

  • Not so flexible for ML Experts

10-Steps to turn SAP Data into AutoML Model

This configuration runs with the following steps on Automated machine learning using dataflows that are hosted on Power BI Premium and Embedded capacities only.

  1. DATA Selection: choose SAP Sales Data on SAP BW or HANA for ML Analysis

  2. Create the workspace on the Power BI dashboard

  3. Create the Dataflow on the workspace

  4. Create the SAP Connection setting and validate

  5. Push tables into a dataset to feed the dataflow

  6. Create a dataflow with the input data

  7. Create and train a machine learning model

  8. Select the data fields for the ML Model

  9. Apply the model to a dataflow entity

  10. Using the scored output from the model in a Power BI Dashboard

Note, SAP Sales forecasting is typically split into two categories: quantitative and qualitative.
We'll use quantitative based on online and in-store sales forecasting in this ML analysis.
AutoML in Power BI integrates Automated ML from Azure Machine Learning to create your ML models. However, you don't need an Azure subscription to use AutoML in Power BI. The process of training and hosting the ML models is managed entirely by the Power BI service.

Source: Microsoft documentation


1. DATA Selection: choose SAP Sales Data on SAP BW or HANA for ML Model building

  • Direct Table Access - Select SAP Custom Table ZSAPSALESDATA with 100K Records on SAP BW 7.5 Application used by multiple BW infocubes for querying SAP Sales & Demand data.

  • Infocube Access - We can leverage SAP BW InfoCube since it's primarily used to store transaction data. They cannot store master data. For creating/defining summarized reports, you predominantly use SAP BW InfoCubes.

  • The other option, you can pick and choose VBS* Sales Tables from ERP or S/4HANA as well for dataset exaggerations here depending on SAP system availability.

  • On the current BW system - Run Transaction: SE16 to validate/view the header and data structure of the selected custom table.

  • You can leverage the Sales dataset provided on the Github link in the sources section.

  • Custom infoCube can be leveraged as well. In my case I used table directly but this option works well.


2. Create the workspace on the Power BI dashboard

  • Click on Left Panel "My workspace" > Create a workspace
    (as shown on the below screen)

  • Provide the name of a workspace - "SAP-Demo-Sales-Forecaster" and Description(Optional) And press the Save button


  • Ensure under the new workspace setting, enable premium capacity, and select small data storage format for demo purpose.


3. Create the Dataflow on the workspace 

  • Click on +New > Select "Dataflow"


  • Click on "Add new entities" under Define new entities (as shown below)



4. Create the SAP Connection setting and validate

  • Select "SAP BW Application Server" under Data Sources as shown below


  • Download and install On-premise data gateway software on your system before moving to the next step. (Link to download gateway software)

  • Fill SAP BW Application Server IP Details, SAP system number, SAP client ID along with Connection Credentials for on-premise gateway name (ensure on-prem gateway is up and running), fill up your SAP username and password, click on Next to proceed


5. Push tables into a dataset to feed the dataflow

  • Once the connection is established, a list of SAP BW Objects is visible.

  • Select the Infocube or Table ZSAPSALESDATA to fetch into a dataset.

  • Pick up Table ZSAPSALESDATA and click "Transform data" to proceed further with the analysis 


6. Create a dataflow with the input data

  • Save Query under the name: SAPSalesForecasterQuery

  • Save your dataflow name: sap sales forecaster intent

7. Create and train a machine learning model

  • The first step for creating our machine learning model is to identify the historical data including the outcome field that you want to predict. The model will be created by learning from this data.

  • Click on "Add a Machine learning Model"


  • Select the Entity and Outcome field - SalesChannel - Online | Offline

  • Select SalesChannel as the 'Outcome field' value and then select Next.


  • Choose a model - Binary Prediction, this model will provide the future insights for online sales prediction. In this case since we're predicting a binary outcome of whether a user will make a purchase online or offline(in-store), Binary Prediction is recommended.

  • Select "Online" in target outcome

  • Click on Next


8. Select the data fields for AutoML Model

  • Select the data fields for ML Model to study, by default some of the fields are selected based on data review by Azure AutoML alogrithm

  • Power BI does a preliminary scan of a sample of your data and suggests the inputs that may produce more accurate predictions. If Power BI doesn't recommend a field, an explanation would be provided next to it.


  • In the final step, we must provide a name for our model. Training Time is default for now. The longer you train your model, the more accurate the results.

  • You can choose to reduce the training time to see quick results or increase the amount of time spent in training to get the best model.

  • Click on the "Save and train" Button as shown below screenshot

  • The training process will begin by sampling and normalizing your historical data and splitting your dataset.


9. Apply the model to a Dataflow entity

  • Depending on the size of the dataset, the training process can take anywhere from a few minutes to the training time selected at the previous screen.

  • You can confirm that the model is being trained and validated through the status of the dataflow. This appears as a data refresh in progress in the Dataflows tab of the workspace.

  • Click on the "View Training Report" link as shown below.

  • Once the model training is completed, the dataflow displays an updated refresh time. You can confirm that the model is trained, by navigating to the Machine learning models tab in the dataflow. The model you created should show status as Trained and the Last Trained time should now be updated.


10. Using the scored output from the model in a Power BI report

  • To review the model validation report, in the Machine learning models tab, select the View training report button in the Actions column for the model. This report describes how your machine learning model is likely to perform.

  • You can use the Probability Threshold slicer on the Model Performance page to examine its influence on the Precision and Recall for the model.

  • Select the Apply model button at the top of the report to invoke this model. In the Apply dialog, you can specify the target entity that has the source data to which the model should be applied.

  • Click on Save and apply


  • To use the scored output from your machine learning model you can connect to your dataflow from the Power BI desktop, using the Dataflows connector.

  • The Prediction report for sap sales forecaster entity can now be used to incorporate the predictions from your model in Power BI reports for creating BI Apps for Data analysis.

  • You can save the report and use it again after the data refresh from SAP

You can share this AutoML generated report using the "share report" option.
So things finally looks good on dashboard



Thank you for reading!


Reference/Sources/further learning:

These are my personal opinions and thoughts. This does not represent any formal opinions, POVs, inputs, product road-maps, etc. from my current or past employers or partners or/and any Clients.

Labels in this area