
In my previous blog Bring data from SAP Datasphere to Snowflake, I introduced a method to bring data from SAP Datasphere tables and views to Snowflake.
In this blog, we will discuss how to access data in SAP Datasphere directly from Snowflake. The main benefit of this method is its very simple architecture: no components are needed between SAP Datasphere and Snowflake. However, query performance on Snowflake will depend heavily on the resources available on the source system as well as on network bandwidth. Therefore, this method should not be used to transport large volumes of data over the network at query time.
The idea is to use External Network Access to reach the SAP HANA Cloud database within SAP Datasphere directly.
Architecture of the integration of SAP Datasphere and Snowflake through External Network Access
In this blog post, we will explore how to access SAP Datasphere tables and views from Snowflake through a few simple steps:
Before we begin, make sure you have the following prerequisites in place:
These steps are already covered in my previous blog, Bring data from SAP Datasphere to Snowflake. Replicate the same steps to prepare data in your SAP Datasphere instance.
We will create a function in Snowflake which uses the SAP HANA JDBC driver to access the SAP HANA Cloud database in SAP Datasphere. To communicate with SAP HANA Cloud securely, we will:
Start by downloading the latest version of the SAP HANA JDBC driver from the official website, following this tutorial. At the time of writing, the latest version is 2.19.16.
Click on View All to show driver files
Once you have downloaded the file to your desktop, upload it to a stage in Snowflake.
ngdbc-2.19.16.jar on an internal stage
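If you prefer to script this step, here is a minimal sketch; the stage name jdbc_stage and the local path are assumptions, and the stage name must match the one referenced later in the IMPORTS clause of the function.
-- Create an internal stage and upload the driver (stage name and path are placeholders)
CREATE STAGE IF NOT EXISTS jdbc_stage;
-- Run the PUT command from SnowSQL (or another client that supports PUT) on your desktop
PUT file:///path/to/ngdbc-2.19.16.jar @jdbc_stage AUTO_COMPRESS = FALSE;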
The SAP HANA JDBC driver can now be used within a Snowflake function.
We will start by creating a network rule to authorize egress from Snowflake to the SAP HANA Cloud host.
-- Create a network rule
CREATE OR REPLACE NETWORK RULE external_database_network_rule_hana
  TYPE = HOST_PORT
  VALUE_LIST = ('xxxx-xxxx-xxxx-xxxx-xxxxx.hana.trial-us10.hanacloud.ondemand.com:443')
  MODE = EGRESS;
Create a secret in order to store and re-use credentials for external systems. In this case, I created a secret for the SAP HANA Cloud user and password.
-- Create a secret
CREATE OR REPLACE SECRET external_database_cred_hana
  TYPE = PASSWORD
  USERNAME = 'user'
  PASSWORD = 'password';
Let's then create an external access integration: it will be used by the function to obtain permission to access the external network locations and to use the credentials specified by the network rule and secret.
-- Create an external access integration
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION external_database_network_rule_ext_int_hana
  ALLOWED_NETWORK_RULES = (external_database_network_rule_hana)
  ALLOWED_AUTHENTICATION_SECRETS = (external_database_cred_hana)
  ENABLED = TRUE;
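If the function will be created under a different role than the one that owns these objects, that role typically needs privileges on them first. A minimal sketch, assuming a role named data_engineer (the role name is an assumption):
-- Grant access to the integration and secret (role name is an example)
GRANT USAGE ON INTEGRATION external_database_network_rule_ext_int_hana TO ROLE data_engineer;
GRANT READ ON SECRET external_database_cred_hana TO ROLE data_engineer;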
Finally, you can create a user-defined function that talks to SAP HANA Cloud using the external access integration.
The SECRETS parameter is set to the name of a secret included in the integration so that the handler code can access the secret's contents. Any attempt to access a network location that is not specified by an allowed network rule will be denied. In this example, I used a Java UDF:
-- Create the function to talk with SAP HANA
CREATE OR REPLACE FUNCTION SAP_HANA(OPTION OBJECT, query STRING)
  RETURNS TABLE(data OBJECT)
  LANGUAGE JAVA
  RUNTIME_VERSION = '11'
  IMPORTS = ('@stage/ngdbc-2.19.16.jar')
  EXTERNAL_ACCESS_INTEGRATIONS = (external_database_network_rule_ext_int_hana)
  SECRETS = ('cred' = external_database_cred_hana)
  HANDLER = 'JdbcDataReader'
AS $$
import java.sql.*;
import java.util.*;
import java.util.stream.Stream;
import com.snowflake.snowpark_java.types.SnowflakeSecrets;

public class JdbcDataReader {

    public static class OutputRow {
        public Map<String, String> data;

        public OutputRow(Map<String, String> data) {
            this.data = data;
        }
    }

    public static Class getOutputClass() {
        return OutputRow.class;
    }

    public Stream<OutputRow> process(Map<String, String> jdbcConfig, String query) {
        String jdbcUrl = jdbcConfig.get("url");
        String username;
        String password;

        if ("true".equals(jdbcConfig.get("use_secrets"))) {
            // Read the credentials from the secret attached to the function
            SnowflakeSecrets sfSecrets = SnowflakeSecrets.newInstance();
            var secret = sfSecrets.getUsernamePassword("cred");
            username = secret.getUsername();
            password = secret.getPassword();
        } else {
            // Fall back to credentials passed in the options object
            username = jdbcConfig.get("username");
            password = jdbcConfig.get("password");
        }

        try {
            // Load the JDBC driver
            Class.forName(jdbcConfig.get("driver"));
            // Create a connection to the database
            Connection connection = DriverManager.getConnection(jdbcUrl, username, password);
            // Create a statement for executing SQL queries
            Statement statement = connection.createStatement();
            // Execute the query
            ResultSet resultSet = statement.executeQuery(query);
            // Get metadata about the result set
            ResultSetMetaData metaData = resultSet.getMetaData();
            // Create a list of column names
            List<String> columnNames = new ArrayList<>();
            int columnCount = metaData.getColumnCount();
            for (int i = 1; i <= columnCount; i++) {
                columnNames.add(metaData.getColumnName(i));
            }
            // Convert the ResultSet to a Stream of OutputRow objects
            Stream<OutputRow> resultStream = Stream.generate(() -> {
                try {
                    if (resultSet.next()) {
                        Map<String, String> rowMap = new HashMap<>();
                        for (String columnName : columnNames) {
                            String columnValue = resultSet.getString(columnName);
                            rowMap.put(columnName, columnValue);
                        }
                        return new OutputRow(rowMap);
                    } else {
                        // Close resources once all rows have been read
                        resultSet.close();
                        statement.close();
                        connection.close();
                        return null;
                    }
                } catch (SQLException e) {
                    e.printStackTrace();
                    return null;
                }
            }).takeWhile(Objects::nonNull);
            return resultStream;
        } catch (Exception e) {
            // Surface any error as a single row with an ERROR key
            e.printStackTrace();
            Map<String, String> rowMap = new HashMap<>();
            rowMap.put("ERROR", e.toString());
            return Stream.of(new OutputRow(rowMap));
        }
    }
}
$$;
You can now call the function easily in SQL! In this example, I created a Java UDF called SAP_HANA() which takes two inputs:
- an object containing the JDBC driver class name and the SAP HANA Cloud JDBC URL (optionally with credentials, or a use_secrets flag to read them from the secret)
- a string containing a valid SQL query to run on SAP HANA Cloud
-- Read from SAP_HANA
SELECT * FROM TABLE(SAP_HANA(
    OBJECT_CONSTRUCT(
        'driver', 'com.sap.db.jdbc.Driver',
        'url', 'jdbc:sap://xxxx-xxxx-xxxx-xxxx-xxxxxxx.hana.trial-us10.hanacloud.ondemand.com:443/?encrypt=true&validateCertificate=false&user=user&password=password'
    ),
    'SELECT * FROM SCHEMA.TABLE'));
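Since the handler reads the 'cred' secret when 'use_secrets' is set to 'true' in the options object, you can also avoid embedding credentials in the JDBC URL. A variant of the query above, with the same placeholder host:
-- Read from SAP_HANA using the secret instead of credentials in the URL
SELECT * FROM TABLE(SAP_HANA(
    OBJECT_CONSTRUCT(
        'driver', 'com.sap.db.jdbc.Driver',
        'url', 'jdbc:sap://xxxx-xxxx-xxxx-xxxx-xxxxxxx.hana.trial-us10.hanacloud.ondemand.com:443/?encrypt=true&validateCertificate=false',
        'use_secrets', 'true'
    ),
    'SELECT * FROM SCHEMA.TABLE'));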
In both cases, the result of the SQL query is returned to Snowflake as JSON:
Results of a federated query to SAP HANA Cloud
Now you can easily parse this JSON and merge the result with data stored in Snowflake.
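For example, assuming the SAP HANA source exposes columns named PRODUCT and REVENUE (hypothetical names used only for illustration), each column can be extracted from the returned object with the colon notation and cast to the desired type:
-- Parse the JSON output into typed columns (column names are hypothetical)
SELECT
    data:"PRODUCT"::STRING AS product,
    data:"REVENUE"::NUMBER AS revenue
FROM TABLE(SAP_HANA(
    OBJECT_CONSTRUCT(
        'driver', 'com.sap.db.jdbc.Driver',
        'url', 'jdbc:sap://xxxx-xxxx-xxxx-xxxx-xxxxxxx.hana.trial-us10.hanacloud.ondemand.com:443/?encrypt=true&validateCertificate=false',
        'use_secrets', 'true'
    ),
    'SELECT PRODUCT, REVENUE FROM SCHEMA.TABLE'));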
In the previous blog, we saw a method to integrate SAP Datasphere views into Snowflake using a replication tool of your choice. This integration empowers you to leverage the combined capabilities of both platforms and unlock advanced analytics and machine learning opportunities.
In this blog, we explored another option: federating data. The advantage of federation is that there is no infrastructure to maintain outside of SAP Datasphere and Snowflake. This method is easy to set up and well suited to ad-hoc querying between the two systems. However, it does not replace a replication pipeline. If you need to replicate data between SAP Datasphere and Snowflake, establishing a replication pipeline with a dedicated tool will offer better performance, scalability, and monitoring capabilities.
Bringing SAP Datasphere data to Snowflake enables you to perform complex data analyses, build machine learning models, and gain valuable insights from your SAP data. Embrace the synergy of these robust tools to deliver successful data analytics.
Share any feedback or questions in the comments!
Maxime SIMON