Technology Blogs by Members
Explore a vibrant mix of technical expertise, industry insights, and tech buzz in member blogs covering SAP products, technology, and events. Get in the mix!
cancel
Showing results for 
Search instead for 
Did you mean: 
nitesh_verma
Explorer
2,470

Why Python 


 

We have known that Smart Data Access (SDA) is much popular as a Data Provisioning method with SAP HANA to connect to various popular third party sources whether its SQL server, RDBMS etc. You can access data virtually as well as you can persist based on requirements. Flowgraphs plays an important role here.

But most of the times, in Real Life projects, we faced many requirements where there is a need to integrate HANA and do Data provisioning with the sources which are very specific and particular to any process/scenarios such as third party or custom APIs, Web pages, ODATA services, others. In these scenarios, we end up struggling to find ways on how to integrate HANA  with these sources as well as do the regular updates/processing of data.

Here, Python plays an important role as it brings lot of flexibility in integrating multiple sources along with powerful and easy to understand scripts. Also, provides lot of methods/functions to process your data.

 


Pre-requisites


 

In Python, we have multiple methods and so there is different need on what all things need to install.

Such as below

  • HANA client,

  • Python any version

  • Any development environment supported by Python such as PyCharm, Spyder, Jupyter Notebooks etc.

  • Source details depends upon from where you are loading the data. For example, if we are looking from APIs, then we need complete details like API urls, keys, parameters etc depend upon API nature.

  • HANA database with proper privileges


 

Methods in Python for HANA


 

To integrate different sources of data mentioned, Python provides various inbuilt methods for integration.

As per my experience, Some of them which are popular as mentioned below.
IMPORT DBAPI

IMPORT PYHDB

FROM SQLALCHEMY IMPORT CREATE_ENGINE

Here, in this blog, we will see these methods along with few example of short script. Also, I would share my experience with each methods.

To use all the methods, you need to install specific methods, if it is not there available by default under python dictionary.

 

DBAPI Method


 

For Using this method, we need to install HANA Client along with few changes with files and folder provided details below. Sample script below.
IMPORT DBAPI

This will import required methods while executing python scripts
CONNECTION = DBAPI.CONNECT(SERVER, PORT, USER, PASSWORD)

Here you need to provide HANA system details along with credentials, Try to have some background User and password which can be used for all jobs.
CURSOR = CONNECTION.CURSOR()

QUERY = INSERT INTO TABLE VALUES('INSERT TEST SUCCESSFUL ')

CURSOR. EXECUTE(QUERY)

Under Query variables, you can provide any SQL script which you would like to execute in HANA. you can use above statements to execute any DML/DDL statements.

After above line execution, you can check, data will be inserted successfully into HANA tables.

For more details, please follow below link.

https://blogs.sap.com/2012/06/08/sap-hana-and-python-yes-sir/

 

PYHDB Method


 

This method is very much similar to DBAPI, only difference which I found using PYHDB is, it has more robust mechanism and also it provides good integration with HANA DB.

Syntax looks similar as DBAPI like below.
IMPORT REQUEST

IMPORT PYHDB

This will import required methods while executing python scripts
CONNECTION = PYHDB.CONNECT(HOST ,PORT ,USER NAME ,PASSWORD)

Here you need to provide HANA system details along with credentials. Try to have some background User and password which can be used for all jobs.
CURSOR = CONNECTION.CURSOR()

QUERY = INSERT INTO TABLE VALUES('INSERT TEST SUCCESSFUL ')

CURSOR. EXECUTE(QUERY)

Under Query variables, you can provide any SQL script which you would like to execute in HANA. you can use above statements to execute any DML/DDL statements

After above line execution, you can check, data can be inserted successfully into HANA tables.

As per my experience, both the above methods has some limitation due to which we faced couple of challenges while inserting data into HANA. Also it does not have much flexibility to provide more details about HANA like schema name etc. To achieve this, we need to go for some more custom scripts.

Also, these methods include too much SQL statements to process and insert data into HANA along with issues with bulk loads.

For more details, please follow below link.

https://github.com/SAP/PyHDB

 

FROM SQLALCHEMY IMPORT CREATE_ENGINE Method


 

As per my experience, this is the most robust method and also provides good experience while using it for multiple scenarios. It also uses the PYHDB method, which makes it more integrated. Below are few of the reasons to use this method.

  • You can make use of python inbuilt methods like Pandas, Dataframe which is very much effective and useful while dealing with structured data and also supports good volume of data loading

  • You can reduce unnecessary scripting

  • Flexibility in defining more details about HANA system like Schema name etc.

  • Flexibility in dealing with data in-discrepancies


Sample script below
IMPORT PANDAS AS PD

IMPORT REQUESTS

IMPORT JSON

FROM SQLALCHEMY IMPORT CREATE_ENGINE

IMPORT PYHDB

All of the above are required ones for this method as each supports different purpose
RESPONSE = << API Details need to provide like keys etc and capture the response >>

Above statement is very specific to source details, as per source details you need to capture the response and process accordingly and store in any variable like below 'Table'
TABLE = RESPONSE.JSON()['TEST']

Converting the response in JSON format
DF = PD.DATAFRAME(TABLE)

Using Pandas, Dataframe to make the source data in more readable format
CONNECTION = CREATE_ENGINE('HANA+PYHDB://'<<USERNAME>>:<<PASSWORD>>@<<HOSTNAME>>:<<PORT>>)

Provide parameters same as applicable for above other methods
DF.TO_SQL('<<HANATABLENAME>>',CONNECTION,IF_EXISTS = 'APPEND',)

Passing Dataframe values to hana tables, here there is no need to write SQL like others, also gives you flexibility to define schemas and other details

For more details on syntax, please follow below link.

https://github.com/SAP/sqlalchemy-hana

https://stackoverflow.com/questions/44688029/moving-data-from-mysql-to-sap-hana-with-python

After above line execution, you can check, data can be inserted successfully into HANA tables.



 

Scheduling Python Scripts


 

In all Data integration scenarios, there is always a need to make the complete load process more robust and without manual intervention.

Python provides you flexibility to schedule its scripts using Windows task scheduler itself.

You can easily schedule any scripts using scheduler by creating Basic Task along with your program details, time etc and monitor also.



 

Summary


 

Python is very popular and powerful programming language. Using Python for Data Integration with HANA brings lot of flexibility to combine and extract information from various sources.

Its further enhancing the capability of SAP HANA and also gives you an environment to perform Advanced Analytics with the unexplored sources of data.

Currently, with SAP HANA 2.0, we are seeing lot of improvements in integration with python. This will be an added advantage with existing landscape.

 

 

Thanks & Regards,

Nitesh Verma
2 Comments
Labels in this area