The Transmission Control Protocol (TCP) is a widely used protocol that provides reliable, ordered delivery of data between applications running on different hosts. It serves as the foundation for many technologies and plays a crucial role in modern IT infrastructure.
SAP Data Intelligence is a powerful platform that allows you to integrate various systems and data sources to gain insights and make informed decisions. Its custom Python operators offer even more flexibility, enabling you to connect to systems beyond what the standard shipped operators support.
In this blog post, we'll guide you through the steps to establish connectivity to an on-premises TCP-based system using Python in SAP Data Intelligence.
Access TCP-based resources:
Scenario: We want to connect to a TCP-based system using the Python operator in SAP Data Intelligence. Many databases, as well as protocols such as SFTP, are based on TCP. To achieve this, we will configure the Cloud Connector in a series of steps. For this demonstration, we use a "Hello TCP" server running on my local computer. The objective is to reach this server from within SAP Data Intelligence through the Cloud Connector.
Before proceeding with the steps in this guide, you need an installed instance of the SAP Cloud Connector. The Cloud Connector can be installed on a server, but for this demonstration we will use a Windows machine. We recommend following the instructions in this blog (https://blogs.sap.com/2021/09/05/installation-and-configuration-of-sap-cloud-connector/) to install and configure the Cloud Connector.
The second requirement is a BTP subaccount with a Data Intelligence cluster. We will connect the Cloud Connector to this subaccount.
The first step is to create a configuration in the Cloud Connector that connects to our subaccount and exposes the TCP resource. For the purpose of this demonstration, I ran a small Node.js server on my Windows machine that outputs "Hello TCP!".
telnet localhost 4444
Connecting to localhost ...
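The Node.js server itself is not shown in this post. As a stand-in, a roughly equivalent server could be sketched in Python as follows. The handler class and the `start_hello_server` helper are hypothetical names chosen for illustration; only the port 4444 and the "Hello TCP!" greeting come from the setup described above.

```python
import socket
import socketserver
import threading


class HelloTCPHandler(socketserver.BaseRequestHandler):
    """Answer every incoming connection with a fixed greeting."""

    def handle(self):
        # self.request is the connected client socket.
        self.request.sendall(b"Hello TCP!\n")


def start_hello_server(host="127.0.0.1", port=4444):
    """Start the server in a background thread and return the server object."""
    server = socketserver.ThreadingTCPServer((host, port), HelloTCPHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

To try it locally, call `start_hello_server()`, keep the Python process alive (for example with `input()`), and connect with `telnet localhost 4444` as shown above.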
To create the configuration, navigate to the admin interface for the cloud connector and create a "Cloud to On-Premise" configuration.
Cloud Connector configuration
In the screenshot above, you can see that I exposed the internal host "localhost" with port 4444 via a virtual host called "virtualhost". This virtual host is the host that we will request from the BTP side. Unlike with HTTP resources, we do not need to explicitly expose paths.
BTP Cockpit Cloud Connector Resources
On the BTP side, we can check the connected Cloud Connectors in the corresponding menu tab of the cockpit. If you cannot see this tab, you may be missing some roles. Note the Location ID "FELIXLAPTOP", an identifier that distinguishes multiple Cloud Connectors connected to the same subaccount.
2. Creating a Data Intelligence Connection:
For our purposes, we do not want to hard-code the connection details, because we need a little help from the connection management to access the Connectivity Service of BTP. In the Connection Management application from SAP Data Intelligence we can create connections of all types. We create a connection of type HTTP with host, port and SAP Cloud Connector as the gateway.
Note: Not all connection types allow you to access via the Cloud Connector. See the official product documentation for details.
3. Developing a Custom Operator:
In the operators menu of Data Intelligence we create a new custom operator based on the Python3 operator.
Creating Custom Python Operator
We then change the configSchema.json of this operator to accept an HTTP Connection as a parameter. This file can be found in the repository under the following path.
Note: This is a bit hacky. Data Intelligence internally proxies the BTP Connectivity Service. We can reach this proxy through the internal host given in the connection details ("connectivity-proxy-service") of an HTTP Connection. This works because the HTTP connection type supports the SAP Cloud Connector as a gateway; other connection types might work as well. Note that the proxy behaves exactly like the Connectivity Service itself. In other environments, such as Cloud Foundry, you can bind an instance of the Connectivity Service to your application and access the service that way. Data Intelligence also helps us with the authentication flow: the connection details contain a valid JWT to authenticate against the Connectivity Proxy.
In the Python operator we will use the sapcloudconnectorpythonsocket library, which I created for this purpose. It needs to be installed in a custom Dockerfile. We cannot use standard libraries like PySocks to connect to the Connectivity Service because of the custom authentication flow it uses.
This post will not elaborate further on how to link the Dockerfile to the Custom Operator using tags.
RUN python3 -m pip --no-cache-dir install 'sapcloudconnectorpythonsocket' --user
And finally, the actual script looks straightforward:
from sapcloudconnectorpythonsocket import CloudConnectorSocket
First we take the connection details from the api.config.http_connection object. Then we hand those credentials to the library to connect a socket via the Cloud Connector proxy to our destination server. The logs reveal the various fields that are accessible.
One thing to keep in mind here: instead of the standard HTTP proxy port of the Connectivity Service, we use the SOCKS5 proxy port. Looking at a regular service key of a Connectivity Service instance will make things clearer:
We find the classic XSUAA-style clientid, clientsecret and token_service_url used to retrieve a token in the client credentials flow (Data Intelligence does that for us). The key also lists the different ports exposed by the Connectivity Service.
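To make the port distinction concrete, here is a small sketch that reads the SOCKS5 endpoint out of a service key. The key below is a placeholder, not a real one: every value is invented, and the field names `onpremise_proxy_host`, `onpremise_proxy_port` and `onpremise_socks5_proxy_port` are my assumption of how such a key is laid out, so please check them against your own service key. The helper `socks5_endpoint` is a hypothetical name.

```python
import json

# Placeholder service key. All values are invented for illustration, and the
# "onpremise_*" field names are assumptions to verify against a real key.
SERVICE_KEY_JSON = """
{
  "clientid": "<clientid>",
  "clientsecret": "<clientsecret>",
  "token_service_url": "<token-service-url>",
  "onpremise_proxy_host": "<proxy-host>",
  "onpremise_proxy_port": "20003",
  "onpremise_socks5_proxy_port": "20004"
}
"""


def socks5_endpoint(service_key):
    """Return (host, port) of the SOCKS5 proxy, not of the HTTP proxy."""
    host = service_key["onpremise_proxy_host"]
    # Ports may be stored as strings in the key, so convert explicitly.
    port = int(service_key["onpremise_socks5_proxy_port"])
    return host, port


key = json.loads(SERVICE_KEY_JSON)
print(socks5_endpoint(key))
```

The point of the sketch is only that the HTTP proxy port and the SOCKS5 proxy port are separate entries, and that TCP traffic has to go to the latter.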
This is why we specify the SOCKS5 port specifically in the opening of the socket.
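For readers curious why a generic SOCKS5 client is not enough, the sketch below builds the two messages a client would send to the proxy. It rests on my reading of SAP's Connectivity Service documentation and is not taken from the library's source: the custom authentication method id 0x80 and the byte layout (version, 4-byte token length, token, 1-byte location length, base64-encoded Location ID) are assumptions that should be verified against the official documentation. The function names are hypothetical, and the code only constructs bytes; it does not open a connection.

```python
import base64
import struct

# Assumed id of the custom JWT-based SOCKS5 authentication method.
SAP_JWT_AUTH_METHOD = 0x80


def build_greeting():
    """SOCKS5 greeting: protocol version 5, one offered method, the custom one."""
    return bytes([0x05, 0x01, SAP_JWT_AUTH_METHOD])


def build_auth_request(jwt, location_id=""):
    """Build the assumed authentication message carrying the JWT.

    Layout (assumption): 1 byte version, 4 bytes big-endian token length,
    the token, 1 byte location length, the base64-encoded Location ID.
    """
    token = jwt.encode("ascii")
    location = base64.b64encode(location_id.encode("utf-8"))
    if len(location) > 255:
        raise ValueError("encoded Location ID does not fit into one length byte")
    return (
        b"\x01"
        + struct.pack(">I", len(token))
        + token
        + struct.pack("B", len(location))
        + location
    )
```

A standard SOCKS5 library only offers the "no authentication" and "username/password" methods, which is why a dedicated client is needed for a token-based method like this one.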
Once the socket has been opened successfully, we can send requests to the TCP server. In this example I send a request with an empty body; my local server returns "Hello TCP!" to every request.
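The request/response step can be sketched with a small helper. It assumes the connected socket behaves like a standard Python `socket.socket` (with `settimeout`, `sendall` and `recv`); whether the library's socket exposes exactly this interface is an assumption on my part. The helper name `exchange` is hypothetical.

```python
import socket


def exchange(sock, payload=b"", bufsize=1024, timeout=5.0):
    """Send an optional payload and return the first chunk of the reply."""
    sock.settimeout(timeout)  # avoid blocking forever if the server stays silent
    if payload:
        sock.sendall(payload)
    return sock.recv(bufsize)
```

With an empty payload nothing is sent, so this only works against a server that answers on its own, as the "Hello TCP" server in this demonstration does.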
4. Testing the Custom Operator:
Finally, we can put the new operator in an empty graph and fill in the http_connection parameter. In this instance, I provide the Connection ID of the connection I created previously.
Python Operator Parameters
When executing the graph we can have a look at the console and see the following entries:
That means we were successful: the operator was able to reach the local TCP server on my Windows machine.
Hope you find the content of this blog helpful. Feel free to comment for further clarifications.