Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
Sangeetha_K
Product and Topic Expert
This blog is part of the technical resources for SAP TechEd session DT-200: Amplify the Value of SAP Investments with Joint Reference Architectures.
One of the major challenges businesses face is figuring out the best approach to combining distributed data from internal and external sources. With a data federation-based architecture, businesses can bring together data from all of these sources, without the need for duplication, to maximize its value and draw insights from it.

In our earlier blog, we saw how to federate queries from SAP Datasphere into Amazon Athena, thereby combining SAP and AWS data to build rich analytical models for use with SAP Analytics Cloud dashboards.  

Now in this blog, we will see how this architecture applies in the other direction as well. We will explain how Amazon Athena federated queries can be used to seamlessly access SAP data by querying SAP Datasphere models (illustrated by arrow 5 in the architecture diagram below).


 Bi-directional Data Federation  - arrows 4,5


This federated SAP data can now be combined with AWS data for dashboards or reporting applications on the AWS platform. This architectural pattern is geared towards businesses that store massive amounts of data in AWS data lakes and can benefit from combining it with transactional data in SAP systems, without the overhead of ETL processes. (For additional context, please read this AWS Big Data Blog.)

Solution Overview:


When a federated query is run, Amazon Athena identifies the parts of the query that should be routed to the data source connector and runs them with AWS Lambda. The data source connector makes the connection to the source, runs the query, and returns the results to Amazon Athena. If the result set doesn't fit into the Lambda function's memory, the connector spills the data to Amazon S3, from where Amazon Athena retrieves it later.
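To make the query-splitting concrete, the sketch below shows the kind of federated statement Athena would decompose: the portion referencing the SAP catalog is routed to the Lambda-based connector, while the rest reads directly from an S3-backed table. All catalog, schema, and table names here are hypothetical placeholders.

```python
import re

# Hypothetical federated query: Athena routes the "sapdwc" portion to the
# Lambda connector and reads "awsdatacatalog" tables directly from S3.
federated_query = """
SELECT s.sales_order, s.net_amount, c.clickstream_events
FROM "sapdwc"."SALES_SPACE"."V_SALES_ORDERS" AS s
JOIN "awsdatacatalog"."weblogs"."clickstream" AS c
  ON s.customer_id = c.customer_id
"""

def catalogs_referenced(sql: str) -> set:
    """Naively extract the catalog names (the first quoted identifier of
    each three-part name) referenced in the query text."""
    return set(re.findall(r'"(\w+)"\."\w+"\."\w+"', sql))
```

Here both catalogs appear in one statement, which is exactly the case the connector architecture is designed for.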


Solution Overview Diagram


 

Steps for configuring query federation from Amazon Athena to SAP Datasphere:


Prerequisites:



  • Access to an SAP Datasphere tenant deployed on AWS infrastructure.  

  • Views in SAP Datasphere ready with data and exposed for consumption. 

  • Space admin access in SAP Datasphere to create a database user.

  • Appropriate access for your AWS IAM user to create a secret, create an Amazon S3 bucket, create and deploy a Lambda function, and full access to Amazon Athena.


Step 1: Create a secret for the SAP Datasphere instance using AWS Secrets Manager.




  • On the Secrets Manager console, choose Secrets.

  • Choose Store a new secret.

  • For Secret type, select Other types of secret.

  • Enter the credentials for the SAP Datasphere database user as key/value pairs. (See this document for help on creating the database user in SAP Datasphere.)

    • Key = username, Value = <DB Username>

    • Key = password, Value = <DB User password>





Secret creation




  • For Secret name, enter a name for your secret. (Tip: Use a name with sapdwc in it, so it’s easy to find.)

  • Leave the remaining fields at their defaults and choose Next.

  • Complete your secret creation.
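The console steps above amount to storing a two-key JSON document. As a minimal sketch, the payload can be built as below; the secret name, username, and password are placeholders, and the equivalent `aws secretsmanager create-secret` CLI call is shown only as a comment:

```python
import json

# The Athena SAP HANA connector expects the secret to contain exactly
# these two keys (placeholder credentials shown, not real values).
secret_payload = json.dumps({
    "username": "DWC_DB_USER",        # database user created in SAP Datasphere
    "password": "example-password",   # that database user's password
})

# The equivalent CLI call would look something like (not executed here):
#   aws secretsmanager create-secret \
#       --name sapdwc-credentials \
#       --secret-string "$SECRET_PAYLOAD"
```

Keeping the keys named exactly `username` and `password` matters, because the connector substitutes them into the JDBC connection string at run time.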


Step 2: Create an S3 bucket and subfolder for Lambda to use.




  • On the Amazon S3 console, create a new S3 bucket with a subfolder for the Lambda function to use, as shown below:
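The bucket and subfolder together form the spill location the connector writes to. A small helper, with a hypothetical bucket name and the basic S3 naming constraints (lowercase, 3-63 characters, no underscores) checked up front:

```python
import re

def spill_location(bucket: str, prefix: str) -> str:
    """Compose the s3:// spill location for the connector, validating the
    bucket name against the basic S3 naming rules (lowercase letters,
    digits, dots, and hyphens; 3-63 characters)."""
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", bucket):
        raise ValueError(f"invalid S3 bucket name: {bucket}")
    return f"s3://{bucket}/{prefix.strip('/')}/"
```

For example, a bucket `athena-sapdwc-spill` with subfolder `saphana` yields the spill location `s3://athena-sapdwc-spill/saphana/`.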



Configuring S3 spill bucket


Step 3: Configure Amazon Athena federation with SAP Datasphere:




  • On the Amazon Athena console, choose Data sources in the navigation pane.

  • Choose Create data source.

  • For Choose a data source, search for the SAP HANA connector and choose Next.



Data source creation




  • For Data source name, provide a name such as “sapdwc”.

  • Under Connection details, choose Create Lambda function.



Data source configuration 




  • For Application name, keep the default AthenaSaphanaConnector.

  • For SecretNamePrefix, enter the name of the secret you created earlier.

  • For SpillBucket, enter the name of the S3 bucket you created earlier.

  • For JDBCConnectorConfig, use the format saphana://jdbc:sap://<dwc Hana host>:443/?${<secret-name>}.

  • For LambdaFunctionName, enter a name such as AthenaSapdwcConnector.

  • For LambdaMemory, use the default 3008.

  • For LambdaTimeout, use the default 900.

  • For SecurityGroupID, enter the ID of a security group associated with the VPC that contains your subnet (a security group with the default configuration should work).

  • For SpillPrefix, create a folder under the S3 bucket you created and specify the name (for example, “saphana”).

  • For SubnetIds, enter the IDs of the subnets the Lambda function can use to access your data source, for example: subnet1, subnet2.
    Refer to the Appendix section to learn more about network settings.

  • Select the I acknowledge check box.

  • Choose Deploy.


The deployment takes 3–4 minutes to complete.
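The only non-obvious parameter above is JDBCConnectorConfig, which embeds the secret name as a `${...}` placeholder that the connector resolves against AWS Secrets Manager at run time. A small sketch of how the value is composed, with a placeholder host name and secret name:

```python
def jdbc_connector_config(host: str, secret_name: str, port: int = 443) -> str:
    """Build the JDBCConnectorConfig value in the format
    saphana://jdbc:sap://<host>:<port>/?${<secret-name>}.
    The ${<secret-name>} placeholder is resolved by the connector
    against AWS Secrets Manager when the query runs."""
    return f"saphana://jdbc:sap://{host}:{port}/?${{{secret_name}}}"
```

For instance, a hypothetical SAP Datasphere HANA host and the secret `sapdwc-credentials` produce `saphana://jdbc:sap://<host>:443/?${sapdwc-credentials}`, so no password ever appears in the connector configuration itself.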




Deployed AWS Lambda function seen in connection details




  • After the function is created, choose the function name under Lambda function and keep the other options at their defaults.

  • Choose Next.

  • Choose Create Data Source.


The newly created data source can now be viewed in Amazon Athena’s Data sources view.



Amazon Athena Data Sources Tab view


Step 4: Run federated queries with Amazon Athena.




  • In the Athena query editor, select the data source created earlier from the dropdown.

  • Select the database from the dropdown.

  • If errors appear after selecting the data source in the previous step, use the troubleshooting approaches mentioned in the appendix of this AWS Big Data Blog.



  • Tables from SAP Datasphere are listed in the Tables view.

  • You can run queries against these tables, or preview a table's data from its context menu.

  • The screenshot below shows data from SAP Datasphere federated live into Amazon Athena.
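The preview action amounts to a simple SELECT with a LIMIT against the three-part name. A sketch of how such a statement can be composed with each identifier quoted; the space and view names are hypothetical:

```python
def preview_query(catalog: str, schema: str, table: str, limit: int = 10) -> str:
    """Build a preview SELECT against a three-part Athena name, quoting
    each identifier (doubling any embedded double quotes)."""
    def q(ident: str) -> str:
        return '"' + ident.replace('"', '""') + '"'
    return f"SELECT * FROM {q(catalog)}.{q(schema)}.{q(table)} LIMIT {limit}"
```

For example, `preview_query("sapdwc", "SALES_SPACE", "V_SALES_ORDERS")` yields `SELECT * FROM "sapdwc"."SALES_SPACE"."V_SALES_ORDERS" LIMIT 10`, where the catalog is the data source name chosen in Step 3 and the schema corresponds to the SAP Datasphere space.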


 


View in SAP Datasphere



SAP Datasphere data federated in Amazon Athena


 

Summary:


We saw how Amazon Athena federated queries can be used to query SAP data in Amazon Athena, opening up possibilities for combining SAP data with data in Amazon S3 data lakes for richer analytical applications.  

Together with the SAP Datasphere data federation architecture described in our earlier blog, we now have the capability of bi-directional query federation between SAP and AWS.  

 

Additional Resources:


https://docs.aws.amazon.com/athena/latest/ug/connectors-sap-hana.html


https://aws.amazon.com/blogs/big-data/improve-federated-queries-with-predicate-pushdown-in-amazon-at...


https://github.com/awslabs/aws-athena-query-federation


 

Credits 


Many thanks to the AWS team for their support and collaboration in validating this architecture: Yuva Athur, Ganesh Suryanarayanan, Krishnakumar Ramadoss, Sunny Patwari, and Sabareesan Radhakrishnan; and to Renga Sridharan and Scott Rigney for their support and guidance. 

Thanks also to the SAP team members for their contribution and collaboration in validating this architecture: Madankumar Pichamuthu, Karishma Kapur, Ran Bian, and Sandesh Shinde; and to Sivakumar N and Anirban Majumdar for their support and guidance. 

If you found this blog useful, please click the like button! And if you have any feedback or questions, please post a comment below, or feel free to contact us directly at paa@sap.com.