SAP Data Intelligence ML Scenario Manager | Hands-...

dvankempen · 2022 Feb 16

tahirhussain.babar from the SAP HANA Academy just released a new series of hands-on tutorial videos about the SAP Data Intelligence Machine Learning Scenario Manager.

In this blog post you will find the videos embedded with references and some additional information.

For more partner-focused hands-on developer series, see

SAP BTP Developer Onboarding | Hands-on Video Tutorials

Getting Started with SAP HANA Cloud | Video Tutorials

Getting Started with Multitenant Business Applications on SAP BTP | Hands-on Video Tutorials

SAP Business Technology Platform Extension Generators | Hands-on Video Tutorials

Multitenant Business Applications with CAP | Hands-on Video Tutorials

DevOps with SAP Business Technology Platform | Hands-on Video Tutorials

SAP BTP Kyma Runtime Getting Started | Video Tutorial Series

SAP Graph | Hands-on Video Tutorials

SAP Data Intelligence ML Scenario Manager << this article

To be informed when new posts about partner-related content in the video tutorial series are published, follow the tag

Partner Innovation Lab

Questions? Please post as comment.

Useful? Give us a like and share on social media.

Thanks!

This scenario was originally published by andreas.forster

SAP Data Intelligence: Create your first ML Scenario

Hands-On Video Tutorials

The SAP Data Intelligence ML Scenario Manager helps you to organize your data science artifacts and to manage your tasks in a central location. As a multi-faceted data science application, it is built around the key concept of the machine learning (ML) scenario, which may contain datasets, pipelines, and Jupyter notebooks.

In this article, you will find the videos embedded with some additional information and resources. Following along in the patented zero-to-hero format, you will be ready to use SAP Data Intelligence ML Scenario Manager and work with notebooks and pipelines with minimal effort and no time wasted.

The series covers both the data scientist persona, performing data exploration and free-style data science activities using Jupyter, and the data engineer persona, working with pipelines for training and inference. Postman is used to call the API to retrieve the predictions.

What You Will Learn

You can watch the 9 video tutorials in a little over an hour. They cover how to use SAP Data Intelligence ML Scenario Manager to create, deploy, and consume data science projects.

What you will learn is

Working with the Connection Manager and Semantic Data Lake connections

Data profiling with Metadata Explorer

Working with ML Scenario Manager and Jupyter Notebooks

Data exploration using Pandas

Free-style data science (basic)

How to create and execute a pipeline

How to deploy pipelines

How to call the API using Postman

YouTube Playlist

To bookmark or directly access the playlist, go to

SAP Data Intelligence Machine Learning Scenario Manager

Introduction

Video Tutorial

In the first video, Tahir Hussain Babar (Bob) gives an overview of the series with references to the documentation.

https://youtu.be/M1AD9VOzo_o?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

0:30 - Scenario

2:40 - Data set

3:00 - ML Scenario Manager

3:30 - Notebook (data scientist)

4:30 - Modeler (data engineer)

5:40 - Documentation

References

For the documentation on the SAP Help Portal, see

Machine Learning Guide

Connections and Metadata Explorer

Video Tutorial

In this video, we connect to the SAP Data Intelligence Launchpad, view connections with Connection Management and profile a data source using the Metadata Explorer.

https://youtu.be/_MClxKmhT-A?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

0:40 - SAP Data Intelligence Launchpad

1:15 - Connection Management

2:45 - Metadata Explorer

3:20 - Profile data set

5:45 - Monitoring

6:15 - Result

Creating Scenarios and Notebooks

Video Tutorial

The video provides a short overview of ML Scenario Manager and its main functionality, including working with Jupyter Notebooks.

https://youtu.be/5TFoA-keNOU?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:00 - ML Scenario Manager

2:25 - Create scenario

3:50 - Create Notebook

4:15 - Working with Jupyter Notebook

Data Exploration Using Notebooks

Video Tutorial

The video shows how to connect to a dataset in our semantic data lake from within a Jupyter Notebook and how to perform basic data exploration.

https://youtu.be/U5YntgETnnw?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:00 - Create Notebook

3:30 - Install Python libraries

4:00 - Data exploration using Pandas and matplotlib

Commands

The following commands were executed

# install library 

!pip install hdfs



# read data from SDL CSV file

from hdfs import InsecureClient

import pandas as pd

client = InsecureClient('http://datalake:50070')

with client.read('/shared/scenarios/marathon/RunningTimes.csv') as reader:

  df_data = pd.read_csv(reader, delimiter=';')



# print first 5 rows of data set

df_data.head(5)



# define axis

x = df_data[["HALFMARATHON_MINUTES"]]

y_true = df_data["MARATHON_MINUTES"]



# print graph

%matplotlib inline

import matplotlib.pyplot as plot

plot.scatter(x, y_true, color = 'darkblue');

plot.xlabel("Minutes Half-Marathon");

plot.ylabel("Minutes Marathon");
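
Before plotting, a quick numeric summary can help confirm the relationship that the scatter plot suggests. The following is a minimal sketch, assuming a small made-up sample in place of RunningTimes.csv (the values are illustrative only and not taken from the actual data set). It reuses the column names and delimiter from the commands above and needs no data lake connection.

```python
import io

import pandas as pd

# Hypothetical sample standing in for RunningTimes.csv (illustrative values only)
csv_text = """HALFMARATHON_MINUTES;MARATHON_MINUTES
70;150
80;170
90;185
100;210
110;230
"""

# Same delimiter as the file read from the data lake above
df_data = pd.read_csv(io.StringIO(csv_text), delimiter=";")

# Basic shape and missing-value check
n_rows, n_cols = df_data.shape
missing = int(df_data.isna().sum().sum())

# Summary statistics and linear correlation between the two columns
summary = df_data.describe()
corr = df_data["HALFMARATHON_MINUTES"].corr(df_data["MARATHON_MINUTES"])

print(f"rows={n_rows}, columns={n_cols}, missing values={missing}")
print(summary)
print(f"Pearson correlation: {corr:.2f}")
```

A correlation close to 1 is a reasonable hint that a simple linear model is worth trying.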

Training Linear Regression Models

Video Tutorial

Scikit-learn is one of the most widely used Python packages for data science and machine learning (ML). The video shows how to perform linear regressions within a Jupyter Notebook using Scikit-learn.

https://youtu.be/mfdvtGNLHpI?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:00 - Scikit-learn

2:00 - Linear regression

4:00 - RMSE

5:00 - Serialise into binary format using pickle

7:30 - Create Notebooks

8:00 - Load dat file using pickle

8:15 - Make prediction

Commands

The following commands were executed in Notebook 10

# install Scikit-learn

!pip install scikit-learn



# train regression model

from sklearn.linear_model import LinearRegression

lm = LinearRegression()

lm.fit(x, y_true)



# create graph with prediction

plot.scatter(x, y_true, color = 'darkblue');

plot.plot(x, lm.predict(x), color = 'red');

plot.xlabel("Actual Minutes Half-Marathon");

plot.ylabel("Actual Minutes Marathon");



# calculate RMSE

import numpy as np

y_pred = lm.predict(x)

mse = np.mean((y_pred - y_true)**2)

rmse = np.sqrt(mse)

rmse = round(rmse, 2)

print("RMSE: ", str(rmse))

print("n: ", str(len(x)))



# serialize python object into binary format

import pickle

with open("marathon_lm.pickle.dat", "wb") as f:

  pickle.dump(lm, f)

The following commands were executed in Notebook 20

# deserialize binary 

import pickle

with open("marathon_lm.pickle.dat", "rb") as f:

  lm_loaded = pickle.load(f)



# make prediction 

x_new = 120

predictions = lm_loaded.predict([[x_new]])

round(predictions[0], 2)
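
The two notebooks can be condensed into one self-contained run to check the whole flow outside SAP Data Intelligence. This is only a sketch under stated assumptions: the training data is made up (marathon time set to exactly 2.1 times the half-marathon time, so the fit is perfect), and the model is pickled in memory rather than to a .dat file.

```python
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical training data (illustrative values only): marathon = 2.1 * half-marathon
df_data = pd.DataFrame({"HALFMARATHON_MINUTES": [80.0, 90.0, 100.0, 110.0, 120.0]})
df_data["MARATHON_MINUTES"] = 2.1 * df_data["HALFMARATHON_MINUTES"]

x = df_data[["HALFMARATHON_MINUTES"]]  # 2D: one row per runner, one feature column
y_true = df_data["MARATHON_MINUTES"]

lm = LinearRegression()
lm.fit(x, y_true)

# Root mean squared error on the training data
rmse = float(np.sqrt(np.mean((lm.predict(x) - y_true) ** 2)))

# Round trip through pickle in memory instead of a .dat file
lm_loaded = pickle.loads(pickle.dumps(lm))

# predict() expects the same 2D layout that was used for fitting
x_new = pd.DataFrame({"HALFMARATHON_MINUTES": [120.0]})
prediction = float(lm_loaded.predict(x_new)[0])

print("RMSE:", round(rmse, 2))
print("Predicted marathon minutes:", round(prediction, 2))
```

Note that a scalar such as 120 has to be wrapped into a two-dimensional structure (one row, one column) before it is passed to predict().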

Model Training Pipeline Key Operators I

Video Tutorial

The video shows how to create graphical pipelines within the ML Scenario Manager. The initial pipeline will be used to train a model, and this video explains the key operators used in the pipeline.

https://youtu.be/ql3zftNU-FE?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

2:20 - Create Python Producer pipeline

4:15 - Read File operator

5:25 - Python script
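
The Python script in the training pipeline essentially moves the notebook logic into an operator. As a rough sketch of that logic outside the pipeline, the hypothetical function train_from_csv below takes CSV text (as a file-reading step might deliver it) and returns a metrics dictionary plus the pickled model. The function name, the metric keys, and the sample data are assumptions for illustration; the wiring to the operator's input and output ports is specific to SAP Data Intelligence and is not shown.

```python
import io
import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def train_from_csv(csv_text):
    """Train a linear regression from semicolon-separated CSV text.

    Returns (metrics, model_blob): a dict of string metrics and the
    pickled model as bytes.
    """
    df_data = pd.read_csv(io.StringIO(csv_text), sep=";")
    x = df_data[["HALFMARATHON_MINUTES"]]
    y_true = df_data["MARATHON_MINUTES"]

    lm = LinearRegression()
    lm.fit(x, y_true)

    # Model quality on the training data
    rmse = float(np.sqrt(np.mean((lm.predict(x) - y_true) ** 2)))
    metrics = {"RMSE": str(round(rmse, 2)), "n": str(len(df_data))}

    # Serialize the trained model so it can be stored as an artifact
    model_blob = pickle.dumps(lm)
    return metrics, model_blob


# Hypothetical input (illustrative values only)
sample_csv = "HALFMARATHON_MINUTES;MARATHON_MINUTES\n80;168\n90;189\n100;210\n110;231\n"
metrics, model_blob = train_from_csv(sample_csv)
print(metrics)
```

Keeping the training step a plain function like this makes it easy to test in a notebook before pasting it into an operator.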

Model Training Pipeline Key Operators II

Video Tutorial

The video shows how to configure graphical pipelines that use Python libraries with Dockerfiles within the ML Scenario Manager. We then execute the pipeline.

https://youtu.be/fCipyZHZjX8?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:30 - Create Docker File

2:50 - Tags

3:30 - Build

4:00 - Configure Python operator

5:00 - Pipeline overview (continued)

7:40 - Execute pipeline

8:55 - Result

Commands

The following commands were included in the Dockerfile

FROM $com.sap.sles.base

RUN pip3.6 install --user numpy==1.16.4

RUN pip3.6 install --user pandas==0.24.0

RUN pip3.6 install --user scikit-learn

RUN pip3.6 install --user scikit-learn-intelex

REST API Inference

Video Tutorial

The video shows how to configure graphical pipelines, built from templates, to serve REST APIs.

https://youtu.be/XpnhsEbOyl0?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:00 - Create Python Consumer pipeline

1:30 - Pipeline review

4:00 - Configure operator

5:50 - Tag group

6:40 - OpenAPI Server
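
At its core, an inference pipeline parses the request body, applies the stored model, and returns the prediction as JSON. The sketch below illustrates that step as a plain function, independent of the pipeline operators. The function name predict_from_request and the JSON field names half_marathon_minutes and marathon_minutes_prediction are assumptions for illustration, and the model is trained on made-up values in place of the artifact produced by the training pipeline.

```python
import json
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression


def predict_from_request(body, model):
    """Turn a JSON request body into a JSON response with a prediction."""
    try:
        payload = json.loads(body)
        x_new = float(payload["half_marathon_minutes"])
    except (ValueError, KeyError, TypeError) as err:
        # Malformed JSON, missing field, or non-numeric value
        return json.dumps({"error": f"invalid request: {err}"})

    # The model expects a 2D array: one row, one feature
    prediction = model.predict(np.array([[x_new]]))[0]
    return json.dumps({"marathon_minutes_prediction": round(float(prediction), 2)})


# Hypothetical model standing in for the artifact from the training pipeline
model = LinearRegression().fit(
    np.array([[80.0], [100.0], [120.0]]), np.array([168.0, 210.0, 252.0])
)
model = pickle.loads(pickle.dumps(model))  # simulate loading the stored blob

ok_response = predict_from_request('{"half_marathon_minutes": 110}', model)
bad_response = predict_from_request("not json", model)
print(ok_response)
print(bad_response)
```

Returning an error document for malformed input, rather than raising, keeps the service responsive when a client sends an unexpected body.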

Retrieving Predictions With Postman

Video Tutorial

The video shows how to retrieve predictions from an ML model in SAP Data Intelligence using Postman.

https://youtu.be/cJ0r5tEKnEA?list=PLkzo92owKnVzpG_ev3FflzojoX0-4C8rR

Markers

0:00 - Introduction

1:30 - Deploy pipelines

3:00 - Create API request using Postman

4:50 - Authorization

5:30 - Headers

6:00 - Execute

6:20 - Recap
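
The same kind of request can also be prepared in Python instead of Postman. The following sketch only builds the request and does not send it. The endpoint URL, the credentials, and the JSON field name half_marathon_minutes are hypothetical placeholders, and any additional headers shown in the video would need to be added to the headers dictionary.

```python
import base64
import json
import urllib.request


def build_prediction_request(url, user, password, half_marathon_minutes):
    """Build (but do not send) a POST request with Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"half_marathon_minutes": half_marathon_minutes}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical endpoint and credentials; replace with the URL of your deployment
request = build_prediction_request(
    "https://example.invalid/v1/predict", "user", "secret", 120
)
print(request.get_method(), request.full_url)

# To send it against a real deployment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the headers and body first, much like reviewing the Authorization and Headers tabs in Postman before executing.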

Learn More

SAP Community

To stay up to date, visit the topic area on the SAP Community. Follow the tag for notifications.

SAP Data Intelligence Cloud

SAP Help Portal

For the documentation on the SAP Help Portal, see

help.sap.com/viewer/product/SAP_DATA_INTELLIGENCE/Cloud/en-US

Jupyter

For more information about Jupyter, visit

jupyter.org

Share and Connect

If you would like to receive updates, connect with me on

LinkedIn > linkedin.com/in/dvankempen

Twitter > @dvankempen

For the author page of SAP PRESS, visit

Denys van Kempen

Over the years, for the SAP HANA Academy, SAP’s Partner Innovation Lab, and à titre personnel, I have written a little over 300 posts here for the SAP Community. Some articles only reached a few readers. Others attracted quite a few more. For your reading pleasure and convenience, here is a curated list of posts which somehow managed to pass the 10k-view milestone and, as a sign of current interest, still tickle the counters each month.

Good Reads (my two cents)