Technology Blogs by SAP
dvankempen
Product and Topic Expert








tahirhussain.babar from the SAP HANA Academy just released a new series of hands-on tutorial videos about the SAP Data Intelligence Machine Learning Scenario Manager.

In this blog post you will find the videos embedded with references and some additional information.

For more partner-focused hands-on developer series, see

To be informed when new posts are published about partner-related content in the video tutorials series, follow tag

Questions? Please post as comment.

Useful? Give us a like and share on social media.

Thanks!

 






This scenario was originally published by andreas.forster





Hands-On Video Tutorials


The SAP Data Intelligence ML Scenario Manager helps you to organize your data science artifacts and to manage your tasks in a central location. As a multi-faceted data science application, it is built around the key concept of the machine learning (ML) scenario, which may contain datasets, pipelines, and Jupyter notebooks.

In this article, you will find the videos embedded with some additional information and resources. Following along in the patented zero-to-hero format, you will be ready to use SAP Data Intelligence ML Scenario Manager and work with notebooks and pipelines with minimal effort and no time wasted.

The series covers both the data scientist persona, performing data exploration and free-style data science activities using Jupyter, and the data engineer persona, working with pipelines for training and inference. Postman is used to call the API to retrieve the predictions.



What You Will Learn


You can watch the 9 video tutorials in a little over an hour. They cover how to use SAP Data Intelligence ML Scenario Manager to create, deploy, and consume data science projects.

You will learn about:

  • Working with the Connection Manager and Semantic Data Lake connections

  • Data profiling with Metadata Explorer

  • Working with ML Scenario Manager and Jupyter Notebooks

  • Data exploration using Pandas

  • Free-style data science (basic)

  • How to create and execute a pipeline

  • How to deploy pipelines

  • How to call the API using Postman



YouTube Playlist


To bookmark or directly access the playlist, go to



Introduction


Video Tutorial


In the first video, Tahir Hussain Babar (Bob) gives an overview of the series with references to the documentation.

https://youtu.be/M1AD9VOzo_o?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

0:30 - Scenario

2:40 - Data set

3:00 - ML Scenario Manager

3:30 - Notebook (data scientist)

4:30 - Modeler (data engineer)

5:40 - Documentation

References


For the documentation on the SAP Help Portal, see


Connections and Metadata Explorer


Video Tutorial


In this video, we connect to the SAP Data Intelligence Launchpad, view connections with Connection Management and profile a data source using the Metadata Explorer. 

https://youtu.be/_MClxKmhT-A?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

0:40 - SAP Data Intelligence Launchpad

1:15 - Connection Management

2:45 - Metadata Explorer

3:20 - Profile data set

5:45 - Monitoring

6:15 - Result


Creating Scenarios and Notebooks


Video Tutorial


The video provides a short overview of ML Scenario Manager and its main functionality, including working with Jupyter Notebooks.

https://youtu.be/5TFoA-keNOU?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:00 - ML Scenario Manager

2:25 - Create scenario

3:50 - Create Notebook

4:15 - Working with Jupyter Notebook


Data Exploration Using Notebooks


Video Tutorial


The video shows how to connect from a Jupyter Notebook to a dataset in the semantic data lake and how to perform basic data exploration.

https://youtu.be/U5YntgETnnw?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:00 - Create Notebook

3:30 - Install Python libraries

4:00 - Data exploration using Pandas and matplotlib

Commands


The following commands were executed:
# install the hdfs client library
!pip install hdfs

# read data from the SDL CSV file
from hdfs import InsecureClient
import pandas as pd
client = InsecureClient('http://datalake:50070')
with client.read('/shared/scenarios/marathon/RunningTimes.csv') as reader:
    df_data = pd.read_csv(reader, delimiter=';')

# print first 5 rows of data set
df_data.head(5)

# define axis
x = df_data[["HALFMARATHON_MINUTES"]]
y_true = df_data["MARATHON_MINUTES"]

# print graph
%matplotlib inline
import matplotlib.pyplot as plot
plot.scatter(x, y_true, color = 'darkblue');
plot.xlabel("Minutes Half-Marathon");
plot.ylabel("Minutes Marathon");
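
The data lake is only reachable from inside SAP Data Intelligence, so the exploration step can also be tried locally on a small stand-in data set. The following is a minimal sketch, assuming a semicolon-delimited CSV with the same two column names as above. The sample values and the helper names `load_running_times` and `explore` are made up for illustration and are not taken from the video.

```python
import io

import pandas as pd

# Hypothetical sample rows; only the delimiter and column names match the real file
SAMPLE_CSV = """HALFMARATHON_MINUTES;MARATHON_MINUTES
70;150
80;170
90;195
100;210
110;240
"""


def load_running_times(text):
    """Parse semicolon-delimited CSV text into a DataFrame."""
    return pd.read_csv(io.StringIO(text), delimiter=";")


def explore(df):
    """Return row count, summary statistics, and the Pearson correlation."""
    return {
        "rows": len(df),
        "summary": df[["HALFMARATHON_MINUTES", "MARATHON_MINUTES"]].describe(),
        "correlation": df["HALFMARATHON_MINUTES"].corr(df["MARATHON_MINUTES"]),
    }


df_sample = load_running_times(SAMPLE_CSV)
report = explore(df_sample)
print(report["rows"])
print(report["summary"])
print(round(report["correlation"], 3))
```

A correlation close to 1 suggests that a straight line is a reasonable first model for this kind of data.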


Training Linear Regression Models


Video Tutorial


Scikit-learn is one of the most widely used Python packages for data science and machine learning (ML). The video shows how to perform linear regression within a Jupyter Notebook using Scikit-learn.

https://youtu.be/mfdvtGNLHpI?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:00 - Scikit-learn

2:00 - Linear regression

4:00 - RMSE

5:00 - Serialise into binary format using pickle

7:30 - Create Notebooks

8:00 - Load dat file using pickle

8:15 - Make prediction

Commands


The following commands were executed in Notebook 10:
# install Scikit-learn (the PyPI package name is scikit-learn)
!pip install scikit-learn

# train regression model
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(x, y_true)

# create graph with prediction
plot.scatter(x, y_true, color = 'darkblue');
plot.plot(x, lm.predict(x), color = 'red');
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Actual Minutes Marathon");

# calculate RMSE
import numpy as np
y_pred = lm.predict(x)
mse = np.mean((y_pred - y_true)**2)
rmse = np.sqrt(mse)
rmse = round(rmse, 2)
print("RMSE: ", str(rmse))
print("n: ", str(len(x)))

# serialize python object into binary format
import pickle
with open("marathon_lm.pickle.dat", "wb") as f:
    pickle.dump(lm, f)

The following commands were executed in Notebook 20:
# deserialize binary
import pickle
with open("marathon_lm.pickle.dat", "rb") as f:
    lm_loaded = pickle.load(f)

# make prediction (predict expects 2D input: one row, one feature)
x_new = 120
predictions = lm_loaded.predict([[x_new]])
round(predictions[0], 2)
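
The serialization step can be checked without writing a file by round-tripping the model through pickle in memory and confirming that the restored model predicts the same value. This is a minimal sketch on synthetic data: the generated running times are an assumption for illustration only and are not the data set from the video.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (assumption): marathon time is roughly
# 2.1 x half-marathon time plus some random noise
rng = np.random.default_rng(42)
half = rng.uniform(70, 150, size=200)
full = 2.1 * half + rng.normal(0, 5, size=200)
df = pd.DataFrame({"HALFMARATHON_MINUTES": half, "MARATHON_MINUTES": full})

x = df[["HALFMARATHON_MINUTES"]]
y_true = df["MARATHON_MINUTES"]

# train the model and compute RMSE on the training rows
lm = LinearRegression().fit(x, y_true)
rmse = float(np.sqrt(np.mean((lm.predict(x) - y_true) ** 2)))
print("RMSE:", round(rmse, 2))

# round trip through pickle in memory instead of a .dat file
blob = pickle.dumps(lm)
lm_restored = pickle.loads(blob)

# predict expects a 2D input, here one row with one named feature
x_new = pd.DataFrame({"HALFMARATHON_MINUTES": [120]})
prediction = float(lm_restored.predict(x_new)[0])
print("Predicted marathon minutes:", round(prediction, 2))
```

Passing a one-row DataFrame with the training column name keeps the input two-dimensional and avoids a feature-name warning from scikit-learn.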


Model Training Pipeline Key Operators I


Video Tutorial


The video shows how to create graphical pipelines within the ML Scenario Manager. The initial pipeline will be used to train a model, and this video explains the key operators used in the pipeline.

https://youtu.be/ql3zftNU-FE?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

2:20 - Create Python Producer pipeline

4:15 - Read File operator

5:25 - Python script
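
The Python script in a training pipeline has to turn the file content it receives into a trained model plus some quality metrics. The sketch below shows only that core logic as a plain function. The function name `train_from_csv`, the metric keys, and the inline sample data are hypothetical, and the wiring of inputs and outputs to the operator ports in SAP Data Intelligence is deliberately left out.

```python
import io
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def train_from_csv(csv_text):
    """Train a linear regression from CSV text; return (metrics, model_blob)."""
    df = pd.read_csv(io.StringIO(csv_text), delimiter=";")
    x = df[["HALFMARATHON_MINUTES"]]
    y = df["MARATHON_MINUTES"]
    lm = LinearRegression().fit(x, y)
    # RMSE on the training rows, as in the notebook
    rmse = float(np.sqrt(np.mean((lm.predict(x) - y) ** 2)))
    metrics = {"RMSE": str(round(rmse, 2)), "n": str(len(df))}
    return metrics, pickle.dumps(lm)


# Hypothetical input, standing in for what a file-reading step would pass on
csv_text = (
    "HALFMARATHON_MINUTES;MARATHON_MINUTES\n"
    "70;150\n80;170\n90;195\n100;210\n110;240\n"
)
metrics, model_blob = train_from_csv(csv_text)
print(metrics)
```

Keeping the training logic in a function that takes text and returns plain Python objects makes it easy to test in a notebook before it is pasted into a pipeline operator.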


Model Training Pipeline Key Operators II


Video Tutorial


The video shows how to configure graphical pipelines within the ML Scenario Manager that use Python libraries provided through a Dockerfile. We'll then look at executing the pipeline.

https://youtu.be/fCipyZHZjX8?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:30 - Create Docker File

2:50 - Tags

3:30 - Build

4:00 - Configure Python operator

5:00 - Pipeline overview (continued)

7:40 - Execute pipeline

8:55 - Result

Commands


The following commands were included in the Dockerfile:
FROM $com.sap.sles.base
RUN pip3.6 install --user numpy==1.16.4
RUN pip3.6 install --user pandas==0.24.0
RUN pip3.6 install --user sklearn
RUN pip3.6 install --user scikit-learn-intelex


REST API Inference


Video Tutorial


The video shows how to configure graphical pipelines, built from templates, to serve REST APIs.

https://youtu.be/XpnhsEbOyl0?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:00 - Create Python Consumer pipeline

1:30 - Pipeline review

4:00 - Configure operator

5:50 - Tag group

6:40 - OpenAPI Server
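
At its core, an inference pipeline behind a REST endpoint loads the trained model, reads the input value from the request body, and returns the prediction as JSON. The following is a minimal sketch of that request handling only. The JSON keys `half_marathon_minutes` and `marathon_minutes_prediction` and the function names are hypothetical and not taken from the video, and the tiny model is trained inline just to keep the example self-contained.

```python
import json
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression


def make_model_blob():
    """Train a tiny stand-in model and return it pickled (illustration only)."""
    x = np.array([[70], [80], [90], [100], [110]])
    y = np.array([150, 170, 195, 210, 240])
    return pickle.dumps(LinearRegression().fit(x, y))


def handle_request(body, model):
    """Turn a JSON request body into a JSON response with the prediction."""
    try:
        payload = json.loads(body)
        minutes = float(payload["half_marathon_minutes"])
    except (ValueError, KeyError, TypeError):
        return json.dumps(
            {"error": "expected JSON with numeric half_marathon_minutes"}
        )
    # predict expects a 2D input: one row, one feature
    prediction = model.predict([[minutes]])[0]
    return json.dumps({"marathon_minutes_prediction": round(float(prediction), 2)})


model = pickle.loads(make_model_blob())
response = handle_request('{"half_marathon_minutes": 120}', model)
print(response)
```

Returning an error object for malformed input, rather than raising, keeps the endpoint responsive when a client sends an unexpected body.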


Retrieving Predictions With Postman


Video Tutorial


The video shows how to retrieve predictions from an ML model in SAP Data Intelligence using Postman.

https://youtu.be/cJ0r5tEKnEA?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers


0:00 - Introduction

1:30 - Deploy pipelines

3:00 - Create API request using Postman

4:50 - Authorization

5:30 - Headers

6:00 - Execute

6:20 - Recap
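
The request that Postman sends can also be put together in code. The sketch below uses only the Python standard library and assumes basic authentication and a JSON body. The URL, the credentials, the body key, and the function name `build_prediction_request` are placeholders, not values from the video. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_prediction_request(url, user, password, half_marathon_minutes):
    """Build (but do not send) a POST request with basic auth and a JSON body."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"half_marathon_minutes": half_marathon_minutes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder endpoint and credentials (assumptions, replace with your own)
req = build_prediction_request(
    "https://example.invalid/v1/predict", "demo-user", "demo-password", 120
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```

Separating request construction from sending makes it possible to inspect the headers and body first, much like reviewing the Authorization and Headers tabs in Postman before pressing Send.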


Learn More


SAP Community


To stay up to date, visit the topic area on the SAP Community. Follow the tag for notifications.

SAP Help Portal


For the documentation on the SAP Help Portal, see

Jupyter


For more information about Jupyter, visit



Share and Connect



If you would like to receive updates, connect with me on

For the author page of SAP PRESS, visit






