Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
Let's assume you have to prepare a machine learning model for a classification or regression task.
All your data is already in HANA or in a flat (CSV) file.
Everything you need - (This library is an open-source research project and is not part of any official SAP product.)

This is a joke, but hana_automl goes through all (well, not all yet) AutoML steps and makes data science work easier.

This library is written in Python and built on top of other awesome libraries:

  • hana_ml

  • Optuna

  • BayesianOptimization

  • Streamlit

To install it, you just need:
pip3 install Cython
pip3 install hana_automl

After installation, it is quite easy to get started:
from hana_automl.utils.scripts import setup_user
from hana_ml.dataframe import ConnectionContext

cc = ConnectionContext(address='address', user='user', password='password', port=39015)

# replace with credentials of user that will be created or granted a role to run PAL.
setup_user(connection_context=cc, username='user_new', password="password_new")

setup_user is an optional helper for when you need to create a new user for experiments.
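The connection example above hardcodes placeholder credentials. As a minimal sketch, the same keyword arguments could be collected from environment variables instead. The function name `hana_settings_from_env` and the variable names `HANA_ADDRESS`, `HANA_USER`, `HANA_PASSWORD` and `HANA_PORT` are assumptions made for this illustration and are not part of hana_ml or hana_automl. The default port is simply the one used in the example above.

```python
import os


def hana_settings_from_env(env=None):
    """Collect connection keyword arguments from environment variables.

    The variable names are hypothetical and chosen only for this sketch.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        # Fall back to the port used in the example above.
        "port": int(env.get("HANA_PORT", "39015")),
    }


# Demo with an explicit mapping so the sketch runs without a real environment.
settings = hana_settings_from_env(
    {"HANA_ADDRESS": "address", "HANA_USER": "user", "HANA_PASSWORD": "password"}
)
print({key: value for key, value in settings.items() if key != "password"})
```

With hana_ml installed, the resulting dictionary could then be unpacked into the constructor shown above, as in ConnectionContext(**settings).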

After that, you just need to call fit and predict, and wait...
from hana_automl.automl import AutoML

model = AutoML(cc)
model.fit(
    file_path='path to training dataset',  # it may be HANA table/view, or pandas DataFrame
    steps=10,  # number of iterations
    target='target',  # column to predict
    time_limit=120,  # time limit in seconds
)

model.predict(file_path='path to test dataset', id_column='ID', verbose=1)
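To give a feeling for what the steps and time_limit arguments control, here is a minimal sketch of a search loop with two budgets. It is not hana_automl's implementation: it uses plain random search instead of Optuna or BayesianOptimization, it assumes the search stops at whichever budget runs out first (the post does not state this), and `budgeted_search`, `toy_objective` and `toy_sampler` are hypothetical names invented for this example.

```python
import random
import time


def budgeted_search(objective, sample_params, steps=10, time_limit=120.0, seed=0):
    """Try up to `steps` candidates, stopping early once `time_limit` seconds pass."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_limit
    best_params, best_score = None, float("-inf")
    trials_run = 0
    for _ in range(steps):
        if time.monotonic() >= deadline:
            break  # time budget exhausted before all steps were used
        params = sample_params(rng)
        score = objective(params)
        trials_run += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, trials_run


def toy_objective(params):
    # Hypothetical stand-in for "train a model and return its validation score".
    return -(params["x"] - 3.0) ** 2


def toy_sampler(rng):
    # Hypothetical search space with a single numeric hyperparameter.
    return {"x": rng.uniform(0.0, 6.0)}


best_params, best_score, trials_run = budgeted_search(
    toy_objective, toy_sampler, steps=10, time_limit=120.0
)
print(f"best x={best_params['x']:.3f}, score={best_score:.3f}, trials={trials_run}")
```

In a real run each trial trains a model, so raising steps or time_limit trades waiting time for a better chance of finding a good configuration.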

You can find all documentation here -

It is also possible to run all these steps from a UI built with Streamlit instead of from Python.

The UI looks like this: (screenshot: Streamlit client)

To start the UI, you need three steps:

  1. Clone repository: git clone

  2. Install dependencies: pip3 install -r requirements.txt

  3. Run GUI: streamlit run ./

OK, why should you try it?

Have a look at this example -

APL is awesome, but it focuses strongly on speed; for more accurate models you need more time and PAL. That is where hana_automl can help.

It is also possible to build not just a single model but a blend of several models. To enable ensembling, just pass ensemble=True to the function when creating the AutoML model.
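As a rough illustration of what blending means, the sketch below combines the predictions of several models: an average for regression and a majority vote for classification. This shows the general idea only. How hana_automl actually selects and combines models is not described in this post, and the function names here are hypothetical.

```python
from collections import Counter
from statistics import mean


def blend_regression(predictions_per_model):
    """Average the per-row predictions of several regression models."""
    return [mean(row) for row in zip(*predictions_per_model)]


def blend_classification(predictions_per_model):
    """Take a majority vote over the per-row labels of several classifiers."""
    return [Counter(row).most_common(1)[0][0] for row in zip(*predictions_per_model)]


# Each inner list holds one model's predictions for the same three rows.
reg_blend = blend_regression([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 5.0]])
clf_blend = blend_classification([["a", "b", "a"], ["a", "a", "b"], ["b", "b", "b"]])

print(reg_blend)  # [2.0, 2.0, 3.0]
print(clf_blend)  # ['a', 'b', 'b']
```

A blend like this tends to smooth out the mistakes of individual models, at the cost of training and scoring several models instead of one.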

There is big potential for improvement, and contributions are very welcome!

If you have any ideas -

P.S. This is a project by @While-true-codeanything and @dan0nchik, two very talented students...

Don't wait: try it on your own dataset and share your results...