tahirhussain.babar from the SAP HANA Academy just released a new series of hands-on tutorial videos about the SAP Data Intelligence Machine Learning Scenario Manager.
In this blog post you will find the videos embedded with references and some additional information.
For more partner-focused hands-on developer series, see
To be informed when new posts are published about partner-related content in the video tutorials series, follow tag
Questions? Please post as comment.
Useful? Give us a like and share on social media.
Thanks!
This scenario was originally published by andreas.forster
Hands-On Video Tutorials
The SAP Data Intelligence ML Scenario Manager helps you to organize your data science artifacts and to manage your tasks in a central location. As a multi-faceted data science application, it is built around the key concept of the machine learning (ML) scenario, which may contain datasets, pipelines, and Jupyter notebooks.
In this article, you will find the videos embedded with some additional information and resources. Following along in the patented zero-to-hero format, you will be ready to use SAP Data Intelligence ML Scenario Manager and work with notebooks and pipelines with minimal effort and no time wasted.
The series covers both the data scientist persona, performing data exploration and free-style data science activities using Jupyter, and the data engineer, working with pipelines for training and inference. Postman is used to call the API to retrieve the predictions.
What You Will Learn
The nine video tutorials take a little over an hour to watch and cover how to use SAP Data Intelligence ML Scenario Manager to create, deploy, and consume data science projects.
You will learn about the following:
- Working with Connection Management and Semantic Data Lake connections
- Data profiling with Metadata Explorer
- Working with ML Scenario Manager and Jupyter Notebooks
- Data exploration using Pandas
- Free-style data science (basic)
- How to create and execute a pipeline
- How to deploy pipelines
- How to call the API using Postman
YouTube Playlist
To bookmark or directly access the playlist, go to
Introduction
Video Tutorial
In the first video, Tahir Hussain Babar (Bob) gives an overview of the series with references to the documentation.
https://youtu.be/M1AD9VOzo_o?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
0:30 - Scenario
2:40 - Data set
3:00 - ML Scenario Manager
3:30 - Notebook (data scientist)
4:30 - Modeler (data engineer)
5:40 - Documentation
References
For the documentation on the SAP Help Portal, see
Connections and Metadata Explorer
Video Tutorial
In this video, we connect to the SAP Data Intelligence Launchpad, view connections with Connection Management, and profile a data source using the Metadata Explorer.
https://youtu.be/_MClxKmhT-A?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
0:40 - SAP Data Intelligence Launchpad
1:15 - Connection Management
2:45 - Metadata Explorer
3:20 - Profile data set
5:45 - Monitoring
6:15 - Result
Creating Scenarios and Notebooks
Video Tutorial
The video provides a short overview of ML Scenario Manager and its main functionality, including working with Jupyter Notebooks.
https://youtu.be/5TFoA-keNOU?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:00 - ML Scenario Manager
2:25 - Create scenario
3:50 - Create Notebook
4:15 - Working with Jupyter Notebook
Data Exploration Using Notebooks
Video Tutorial
The video shows how to connect from a Jupyter Notebook to a dataset in our semantic data lake, and then how to perform basic data exploration.
https://youtu.be/U5YntgETnnw?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:00 - Create Notebook
3:30 - Install Python libraries
4:00 - Data exploration using Pandas and matplotlib
Commands
The following commands were executed
# install library
!pip install hdfs
# read data from SDL CSV file
from hdfs import InsecureClient
import pandas as pd
client = InsecureClient('http://datalake:50070')
with client.read('/shared/scenarios/marathon/RunningTimes.csv') as reader:
    df_data = pd.read_csv(reader, delimiter=';')
# print first 5 rows of data set
df_data.head(5)
# define axis
x = df_data[["HALFMARATHON_MINUTES"]]
y_true = df_data["MARATHON_MINUTES"]
# print graph
%matplotlib inline
import matplotlib.pyplot as plot
plot.scatter(x, y_true, color = 'darkblue');
plot.xlabel("Minutes Half-Marathon");
plot.ylabel("Minutes Marathon");
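Before plotting, a few Pandas calls give a quick feel for the data. The following is a minimal sketch, not taken from the video: it uses a handful of made-up rows in the same semicolon-delimited layout and with the same two column names as above, so that it runs without a data lake connection. In the notebook, you would apply the same calls to df_data instead of df_sample.

```python
import io

import pandas as pd

# Made-up sample rows in the same layout as the commands above;
# the values are illustrative only and are not taken from the real data set.
SAMPLE_CSV = """HALFMARATHON_MINUTES;MARATHON_MINUTES
90;190
100;212
110;231
120;255
130;272
"""

df_sample = pd.read_csv(io.StringIO(SAMPLE_CSV), delimiter=";")

# Summary statistics (count, mean, min, max, quartiles) per column
summary = df_sample.describe()
print(summary)

# Number of missing values per column
missing = df_sample.isna().sum()
print(missing)

# Pearson correlation between half-marathon and marathon times
corr = df_sample["HALFMARATHON_MINUTES"].corr(df_sample["MARATHON_MINUTES"])
print(round(corr, 3))
```

A correlation close to 1 suggests that a linear model is a reasonable first attempt for this kind of data.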
Training Linear Regression Models
Video Tutorial
Scikit-learn is one of the most widely used Python packages for data science and machine learning (ML). The video shows how to perform linear regressions within a Jupyter Notebook using Scikit-learn.
https://youtu.be/mfdvtGNLHpI?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:00 - Scikit-learn
2:00 - Linear regression
4:00 - RMSE
5:00 - Serialise into binary format using pickle
7:30 - Create Notebooks
8:00 - Load dat file using pickle
8:15 - Make prediction
Commands
The following commands were executed in Notebook 10
# install Scikit-learn
!pip install scikit-learn
# train regression model
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(x, y_true)
# create graph with prediction
plot.scatter(x, y_true, color = 'darkblue');
plot.plot(x, lm.predict(x), color = 'red');
plot.xlabel("Actual Minutes Half-Marathon");
plot.ylabel("Actual Minutes Marathon");
# calculate RMSE
import numpy as np
y_pred = lm.predict(x)
mse = np.mean((y_pred - y_true)**2)
rmse = np.sqrt(mse)
rmse = round(rmse, 2)
print("RMSE: ", str(rmse))
print("n: ", str(len(x)))
# serialize python object into binary format
import pickle
pickle.dump(lm, open("marathon_lm.pickle.dat", "wb"))
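To make the RMSE step in Notebook 10 concrete, here is a small worked example with made-up numbers (three actual and three predicted values, not from the marathon data set). It follows the same steps as above: take the differences, square them, average them, and take the square root.

```python
import numpy as np

# Made-up values, chosen so the arithmetic is easy to follow by hand
y_true = np.array([200.0, 210.0, 250.0])
y_pred = np.array([203.0, 206.0, 250.0])

errors = y_pred - y_true               # residuals: 3, -4, 0
mse = np.mean(errors ** 2)             # (9 + 16 + 0) / 3
rmse = round(float(np.sqrt(mse)), 2)   # square root of the mean squared error
print("RMSE:", rmse)
```

Because the errors are squared before averaging, a single large miss weighs more heavily than several small ones, and the result is expressed in the same unit as the target (here, minutes).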
The following commands were executed in Notebook 20
# deserialize binary
import pickle
lm_loaded = pickle.load(open("marathon_lm.pickle.dat", "rb"))
# make prediction
x_new = 120
predictions = lm_loaded.predict([[x_new]])
round(predictions[0], 2)
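The two notebooks together amount to a train, serialize, load, and predict round trip. The sketch below reproduces that flow in one place, under a few assumptions: the training data is made up (it lies exactly on a straight line), and the model is pickled in memory with pickle.dumps and pickle.loads rather than written to a .dat file. It also illustrates why a single value is wrapped in two sets of brackets: predict expects a two-dimensional input of samples by features.

```python
import pickle

from sklearn.linear_model import LinearRegression

# Made-up training data lying exactly on marathon = 2 * half_marathon + 10
x_train = [[90], [100], [110], [120], [130]]  # shape (n_samples, n_features)
y_train = [190, 210, 230, 250, 270]

model = LinearRegression().fit(x_train, y_train)

# Round trip through pickle in memory instead of a file on disk
model_bytes = pickle.dumps(model)
model_loaded = pickle.loads(model_bytes)

# predict() expects a 2D input, so a single value is wrapped twice
x_new = 140
prediction = model_loaded.predict([[x_new]])[0]
print(round(float(prediction), 2))
```

Only unpickle files that you created yourself or that come from a trusted source, since loading a pickle can execute arbitrary code.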
Model Training Pipeline Key Operators I
Video Tutorial
The video shows how to create graphical pipelines within the ML Scenario Manager. The initial pipeline will be used to train a model, and this video explains the key operators used in the pipeline.
https://youtu.be/ql3zftNU-FE?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
2:20 - Create Python Producer pipeline
4:15 - Read File operator
5:25 - Python script
Model Training Pipeline Key Operators II
Video Tutorial
The video shows how to configure graphical pipelines that use Python libraries, provided through Dockerfiles, within the ML Scenario Manager. We'll then look at executing the pipeline.
https://youtu.be/fCipyZHZjX8?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:30 - Create Docker File
2:50 - Tags
3:30 - Build
4:00 - Configure Python operator
5:00 - Pipeline overview (continued)
7:40 - Execute pipeline
8:55 - Result
Commands
The following commands were included in the Dockerfile
FROM $com.sap.sles.base
RUN pip3.6 install --user numpy==1.16.4
RUN pip3.6 install --user pandas==0.24.0
RUN pip3.6 install --user scikit-learn
RUN pip3.6 install --user scikit-learn-intelex
REST API Inference
Video Tutorial
The video shows how to configure graphical pipelines, built from templates, to serve REST APIs.
https://youtu.be/XpnhsEbOyl0?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:00 - Create Python Consumer pipeline
1:30 - Pipeline review
4:00 - Configure operator
5:50 - Tag group
6:40 - OpenAPI Server
Retrieving Predictions With Postman
Video Tutorial
The video shows how to retrieve predictions from an ML model in SAP Data Intelligence using Postman.
https://youtu.be/cJ0r5tEKnEA?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR
Markers
0:00 - Introduction
1:30 - Deploy pipelines
3:00 - Create API request using Postman
4:50 - Authorization
5:30 - Headers
6:00 - Execute
6:20 - Recap
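The same kind of request can also be built in Python instead of Postman. The following is a rough sketch only, based on the steps listed in the markers (authorization, headers, execute). Everything specific in it is an assumption: the URL, user, and password are placeholders, the JSON field name half_marathon_minutes is hypothetical, and the X-Requested-With header is included on the assumption that the deployment expects it. Take the real values from your own deployment and from the video. The code only builds the request; the lines that would send it are left as comments.

```python
import base64
import json
import urllib.request

# Placeholders only: not a real endpoint and not real credentials
DEPLOYMENT_URL = "https://example.invalid/path/to/deployment/"
USERNAME = "user"
PASSWORD = "secret"


def build_prediction_request(url, username, password, half_marathon_minutes):
    """Build (but do not send) a JSON POST request with Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Hypothetical payload field name; use whatever your pipeline expects
    body = json.dumps({"half_marathon_minutes": half_marathon_minutes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            # Assumption: the deployment expects this header
            "X-Requested-With": "XMLHttpRequest",
        },
    )


request = build_prediction_request(DEPLOYMENT_URL, USERNAME, PASSWORD, 120)
print(request.get_method(), request.full_url)

# To call a deployed pipeline, replace the placeholders above and run:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```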
Learn More
SAP Community
To stay up to date, visit the topic area on the SAP Community. Follow the tag for notifications.
SAP Help Portal
For the documentation on the SAP Help Portal, see
Jupyter
For more information about Jupyter, visit
Share and Connect
If you would like to receive updates, connect with me on
For the author page of SAP PRESS, visit