
SAP HANA python API - machine learning use cases

Former Member

Hi,

What are the typical use cases where I can develop my models in a Jupyter notebook with the data in SAP HANA?

I am asking because most of the machine learning use cases I know have data in formats like wav, txt or csv, or stored in a data lake (e.g. Hadoop), or streamed from IoT sensors. HANA memory is very expensive, so it would make no sense to load this data into HANA.

Do you know any use cases or similar scenarios? Any links?

BR

Robert

AbdelhalimDadouche
Active Contributor

Hi mount_bertl

The SAP HANA Python API brings two major components: one is the SAP HANA DataFrame, and the other is access to the APL & PAL algorithm wrappers.

The SAP HANA DataFrame gives you access to your SAP HANA data and lets you apply transformations, aggregations and other functions at the database level instead of locally.
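To illustrate what "at the database level" means, here is a minimal sketch of the idea. It is not the hana_ml API: the LazyFrame class is a toy I made up for this example, and sqlite3 from the Python standard library stands in for SAP HANA so that it runs anywhere. The table and column names are hypothetical too. The point is that each operation only builds a SQL statement, and rows leave the database only when collect() is called.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a database-backed DataFrame: it holds SQL, not rows."""

    def __init__(self, conn, select_statement):
        self.conn = conn
        self.select_statement = select_statement

    def filter(self, condition):
        # Wrap the current statement; nothing is executed yet.
        sql = f"SELECT * FROM ({self.select_statement}) WHERE {condition}"
        return LazyFrame(self.conn, sql)

    def agg(self, func, column, alias, group_by):
        # Again only SQL text is built; the database will do the work later.
        sql = (
            f"SELECT {group_by}, {func}({column}) AS {alias} "
            f"FROM ({self.select_statement}) GROUP BY {group_by}"
        )
        return LazyFrame(self.conn, sql)

    def collect(self):
        # Only here is the statement sent to the database and the result fetched.
        return self.conn.execute(self.select_statement).fetchall()


# sqlite3 in-memory database used as a stand-in for SAP HANA (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 100.0), ("EMEA", 300.0), ("APJ", 50.0), ("APJ", -5.0)],
)

sales = LazyFrame(conn, "SELECT * FROM sales")
summary = sales.filter("amount > 0").agg("AVG", "amount", "avg_amount", "region")

print(summary.select_statement)  # the SQL that the database will run
rows = sorted(summary.collect())  # only the small aggregated result is transferred
print(rows)
```

Only the two aggregated rows travel to the Python process; the filtering and averaging happen inside the database engine.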

You can also collect the data and, in the end, use it like any Pandas data frame with your preferred visualization or ML libraries.

With the second component, you get access to the SAP HANA libraries for Machine Learning. SAP HANA provides access to 90+ "industry" standard algorithms (PAL) like Linear Regression, K-means, Apriori etc., but also to the automated algorithms from KXEN (APL).
Not all algorithms have been wrapped in Python yet, but that's the ambition!
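With the hana_ml package, that workflow could look roughly like the sketch below. Please treat it as an illustration under assumptions rather than a reference: it relies on ConnectionContext.table(), DataFrame.filter(), DataFrame.agg() and DataFrame.collect() as I understand them from the hana_ml documentation, and the schema, table, column names, host, port and credentials are all hypothetical placeholders.

```python
def average_amount_by_region(conn, table, schema=None):
    """Let SAP HANA filter and aggregate, then fetch only the small result.

    `conn` is expected to be a hana_ml.dataframe.ConnectionContext.
    The column names AMOUNT and REGION are hypothetical.
    """
    hdf = conn.table(table, schema=schema)   # SAP HANA DataFrame, no rows fetched yet
    positive = hdf.filter('"AMOUNT" > 0')    # still only a SQL statement
    summary = positive.agg(
        [("avg", "AMOUNT", "AVG_AMOUNT")],   # (aggregate, column, new column name)
        group_by="REGION",
    )
    return summary.collect()                 # executes in HANA, returns a Pandas data frame


# Usage (requires the hana_ml package and a reachable SAP HANA system;
# every value below is a placeholder):
#
#   from hana_ml import dataframe
#   conn = dataframe.ConnectionContext("<host>", 39015, "<user>", "<password>")
#   pdf = average_amount_by_region(conn, "SALES", schema="DEMO")
#   pdf.plot.bar(x="REGION", y="AVG_AMOUNT")   # from here on it is plain Pandas
```

Once collect() has returned, the result is an ordinary Pandas data frame, so matplotlib, scikit-learn or any other local library can take over.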

For the list of algorithms available from PAL please check: https://help.sap.com/doc/0172e3957b5946da85d3fde85ee8f33d/2.0.03/en-US/html/hana_ml.algorithms.pal.h...

For the list of algorithms available from APL please check: https://help.sap.com/doc/0172e3957b5946da85d3fde85ee8f33d/2.0.03/en-US/html/hana_ml.algorithms.apl.h...

You can also check arun.godwin.patel's blog series about the SAP HANA Python library:

- https://blogs.sap.com/2018/12/17/diving-into-the-hana-dataframe-python-integration-part-1/

- https://blogs.sap.com/2019/01/28/diving-into-the-hana-dataframe-python-integration-part-2/

You can also consider using SAP HANA, express edition, which comes with a free developer license for up to 32 GB of RAM. I personally ran some tests loading CSV files, and it turned out that my 4 GB of data files took up only a couple of hundred MB once loaded.

From what I remember, SAP HANA, express edition also allows you to use the SAP HANA streaming capabilities (to be confirmed, however).

And last but not least, with SAP HANA, express edition you can get the binary and install it wherever you want, download a pre-built VM (assuming your host meets the minimum system requirements in both cases), or spin up a new instance on AWS, Google Cloud or Microsoft Azure (the order here is just alphabetical, no preference is implied ;-)).

Hope this helps you see the benefits better.

And of course, this is definitely open to discussion.

@bdel