Ian_Henry
Product and Topic Expert
SAP HANA customers often request integration of SAP HANA data with data stored in AWS S3 object storage. In this blog I am relaying the work of an esteemed colleague. We describe the easiest possible integration, with zero coding effort, simply using the AWS Storage Gateway product.

The main steps covered here are:

  1. AWS Storage Gateway Overview

  2. Configure AWS Storage Gateway via the AWS Console

  3. Mount S3 File System

  4. Configure HANA Smart Data Integration (SDI) File Adapter to S3



AWS Storage Gateway Overview


What is the AWS Storage Gateway?

  • A gateway to connect on-premises IT environments with cloud-based storage

  • Provides file-based, volume-based and tape-based storage solutions


– File Gateway






    • Provides a file interface into Amazon S3

    • Stores and retrieves objects in S3 using the NFS or SMB protocols

    • Gives easy access to S3 through a file system mount




– Volume Gateway






    • Provides cloud-backed storage using the iSCSI protocol

    • Offers Cached Volumes and Stored Volumes configurations




– Tape Gateway






    • Archives data durably to AWS Glacier

    • Provides a virtual tape infrastructure








 

Configure AWS Storage Gateway via the AWS Console (13 steps)


1: Console / Storage Section / Storage Gateway



2: Select gateway type: File



3: Select gateway platform



4: Select EC2 instance type (M4, M5, C4, C5, R3, I3), xlarge or 2xlarge




 

5: Configure the instance with the right resources (VPC, CPU allocation, …)




 

6: Add storage, a minimum of 150 GB for the cache



7: Configure the Security Group (open NFS, but whitelist the IPs allowed to connect)



8: Wait for the EC2 instance to be created and get its IP from the AWS Console



9: Return to the Storage Gateway creation screen, click Next, and enter the IP address from the previous step



10: Give the gateway a name and activate it



11: Map a local disk for Gateway Cache



12: Create the file share, mapping the S3 bucket (created beforehand) to the gateway



13: Click Next, wait until the file share is created, and record the mount instructions


HANA Smart Data Integration (SDI) File Adapter



  • Uses the existing File Adapter via the AWS Storage Gateway

  • Virtualizes data access; data can be federated or copied into HANA (see the sketch below)

  • Reads and writes S3 data

  • No code required
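As an illustration of the two access patterns, below is a minimal SQL sketch. It assumes the greetings virtual table created later in this blog ("SYSTEM"."FileAdapter_greetings") and a hypothetical HANA target table "SYSTEM"."GREETINGS_COPY".

-- Federate: query the S3-backed file on demand, nothing is persisted in HANA
SELECT * FROM "SYSTEM"."FileAdapter_greetings";
-- Copy: materialize the file contents into a regular HANA column table
CREATE COLUMN TABLE "SYSTEM"."GREETINGS_COPY" AS (SELECT * FROM "SYSTEM"."FileAdapter_greetings");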


 File Adapter Capabilities



  • Reads formatted files and free-form text files

    • CSV and fixed-width text files

    • XML



  • File systems

    • Local file systems

    • SharePoint (source only)

    • HDFS (target only)



  • DML Support

    • Select

    • Insert



  • Shared data access must be mounted as a directory under the DP Agent root installation directory


Mount the S3 File Gateway


The DP Agent root installation directory on our setup is:
/data/sap/dataprovagent

As the root user:

– Create a mount point; here we use s3
mkdir /data/sap/dataprovagent/s3

– Mount the AWS Storage Gateway file share created by the steps above
mount -t nfs -o nolock,hard IP_OF_AWS_SG:/GATEWAY_NAME /data/sap/dataprovagent/s3

– If the S3 bucket already contains files, check that they are visible
ls -al /data/sap/dataprovagent/s3


SDI & Amazon S3 "greetings" example


File Adapter Config File - greetings.cfg



Here we can see some sample files stored in S3, now also available via NFS


Create a Remote Source using the SDI File Adapter


In HANA Studio we use the following parameters:

  • The Root Directory indicates where to read the files from; here that is s3

  • The Directory of the file format definitions indicates where the configuration files (greetings.cfg) are located
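For reference, the remote source can also be scripted in SQL instead of using the HANA Studio dialog. The statement below is only a sketch: the remote source name S3_FILES, the agent name MY_DP_AGENT, the file format directory and the configuration property names (rootdir, fileformatdir, AccessTokenEntry) are assumptions that may differ between DP Agent / SDI versions, so verify them against the SDI documentation or the DDL generated by HANA Studio.

-- Sketch only: property names, paths and the access token below are assumptions
CREATE REMOTE SOURCE "S3_FILES" ADAPTER "FileAdapter" AT LOCATION AGENT "MY_DP_AGENT"
CONFIGURATION '<?xml version="1.0" encoding="UTF-8"?>
<ConnectionProperties name="ConnectionInfo">
  <PropertyEntry name="rootdir">/data/sap/dataprovagent/s3</PropertyEntry>
  <PropertyEntry name="fileformatdir">/data/sap/dataprovagent/fileformats</PropertyEntry>
</ConnectionProperties>'
WITH CREDENTIAL TYPE 'PASSWORD'
USING '<CredentialEntry name="AccessTokenEntry"><password>MY_ACCESS_TOKEN</password></CredentialEntry>';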

SDI Virtual Table Access To S3 Files


Here is a simple example of accessing S3 files via the greetings virtual table
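The virtual table itself can be created from the remote source browser in HANA Studio, or with a statement along the following lines (again a sketch: S3_FILES is the assumed remote source name, and greetings is the remote object exposed for greetings.cfg).

-- Expose the greetings file format as a virtual table in the SYSTEM schema
CREATE VIRTUAL TABLE "SYSTEM"."FileAdapter_greetings" AT "S3_FILES"."<NULL>"."<NULL>"."greetings";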



Read the S3 Files in HANA (via virtual table)
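With the virtual table in place, reading the S3 files is plain SQL, for example:

-- Read all rows exposed by the greetings file format; the built-in PATH and NAME
-- columns identify which file on the S3 share each row came from
SELECT * FROM "SYSTEM"."FileAdapter_greetings";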



Write to S3 files from SQL (via virtual table)
insert into "SYSTEM"."FileAdapter_greetings" values ('/data/sap/dataprovagent/s3','demo_4.csv',1, NULL, 8, 110, 'merhaba')



Validate the data in S3 (via the AWS Console)



We can also validate that the files have been created using the command line



And finally validate the results from our greetings virtual table in HANA


Conclusion


In this blog we have seen how to integrate SAP HANA with Amazon S3 storage using the AWS Storage Gateway and the HANA Smart Data Integration File Adapter.
23 Comments
dvankempen
Product and Topic Expert
Thanks for sharing, Ian
ailang
Explorer
Simple and neat approach, thank you for sharing.

I assume we can deploy the same (AWS Gateway) for OpenHub as well? Mounting S3 as a file system for open-hub to write files to?
former_member618571
Discoverer
Pretty informative. Ian, is this supposed to work bidirectionally? Can we push data from HANA views or tables to S3?
Ian_Henry
Product and Topic Expert
Hi Mahesh,

Exactly, you can write(insert) from HANA to a virtual table that is a file on S3.
Ian_Henry
Product and Topic Expert
Yes, in theory that should also work, but I have not tried this approach.
RajeshSanikommu
Discoverer
Great! Our team is looking for exactly this solution (write or insert from HANA to a virtual table that is a file on S3). By any chance, could you point us to any documentation or blog?

I really appreciate your help Ian Henry.
former_member480631
Discoverer

It's a great blog – can you provide some supporting documentation for HANA to S3, and please confirm whether DML will support Delete as well?

Ian_Henry
Product and Topic Expert
The integration is through the File Adapter.  We inherit the capabilities this provides.

The capabilities can be seen at the link below. It shows we can select and insert but not delete or update.

https://help.sap.com/viewer/7952ef28a6914997abc01745fef1b607/2.0_SPS00/en-US/d430dc8f21e84853a3fe393...

 
0 Kudos
Hi Experts,

Is there any option to connect an ABAP REST API directly to an Amazon S3 bucket? Currently there is no SAP PI/PO or SAP Cloud Platform. Is there any way to achieve this direct connection from S/4HANA to AWS S3 using an ABAP REST API alone, without any other interface in the middle? Our requirement is to transfer PDF and CSV files to AWS S3.
PaulRR
Participant
Hi Ian,

Thanks for the useful guide. I have 2 issues:

1) I have configured 2 LINUX EC2 instances to mount to an s3 bucket. They can each read/write files in the s3 bucket, they can see files that each has written, but they can't see files (with 'public' permissions) that I place into the s3 bucket via the website interface

2) I have set up 2 Windows Server 2019 EC2 instances, but when I run this command:
mount -o nolock -o mtype=hard <IP>:/s3bucket-test S:

They both show the reply
Network Error - 53

Type 'NET HELPMSG 53' for more information.

C:\Users\Administrator>net helpmsg 53

The network path was not found.

 

Any ideas?
0 Kudos
Thanks Ian for the blog.

I was wondering if SAP plans to release an S3 connector/adapter as a part of Data Provisioning Adapters?

 

Any ideas?

Regards

Sonal
former_member677803
Discoverer
Excellent article, congratulations.

I have some questions:

  • Why do we need an EC2 instance?

  • Can I use this solution to replicate SAP HANA tables to S3 in near real time (NRT)?

  • Would I have to do any additional configuration for changes to a HANA table to be replicated automatically?

former_member62727
Discoverer
 

Good blog. I am just wondering where the AWS gateway is running here, on-premises or in AWS. If on-premises, why do the instructions show an EC2 instance and not another type of instance such as Hyper-V? Any reason?
0 Kudos
Amazing, great work.
Farid
Active Participant
Fantastic blog.

Especially the no code aspect.

I managed to make an end-to-end test in both directions with test files.

Now I created a configuration file and a file adapter with a virtual table teste based on the standard SAP table VBAK.

I would like to try to export the content of VBAK to the virtual table and from there to S3, possibly without manually doing an insert for each table entry.

And I am struggling with SQL: trying to insert the content of VBAK into teste

INSERT INTO SAPHANADB."teste" WITH w1 AS (SELECT * FROM SAPHANADB."VBAK" ) SELECT * FROM w1;

Could not execute 'INSERT INTO SAPHANADB."teste" WITH w1 AS (SELECT * FROM SAPHANADB."VBAK" ) SELECT * FROM w1'
SAP DBTech JDBC: [270]: not enough values: DP_CLEAR_STA_HDR

It doesn't work as there are 5 additional fields in the virtual table: Path, Name, RowNumber, Error, Part

Any SQL guru who could help me ?

werner_daehn
Active Contributor
Out of these five columns, the latter three are read-only.

But your problem is a different one: you do not specify a column list, and hence the select statement has to match the insert-table structure perfectly - which it does not, for obvious reasons.

Your SQL must be in the form of
insert into saphanadb.teste (PATH, NAME, MANDT, VBELN, NETWR)
select 'dir1', 'data.csv', MANDT, VBELN, NETWR from saphanadb.vbak;
Farid
Active Participant
Hello Werner,

Your response unblocked me! Thank you very much.

I had to make some adjustments to my .cfg files because of column format incompatibilities between table VBAK and virtual table teste.

But the SQL command works.

 

insert into saphanadb."teste" (PATH, NAME, MANDT, VBELN)
select '/usr/sap/dataprovagent/s3', 'demo_6.csv', MANDT, VBELN from saphanadb.vbak

Now I have a numeric overflow error; the only way to bypass the error is just to select the "mandt" column ... And I am working on a very small database, SAPHANADB.VBAK contains only 3334 records. Do you have any hints on how to also select the other columns?

Could not execute 'insert into saphanadb."teste" (PATH, NAME, MANDT, VBELN, erdat) select '/usr/sap/dataprovagent/s3', ...' in 79 ms 218 µs .
[314]: numeric overflow: 8000000060 (at pos 122)

Could not execute 'insert into saphanadb."teste" (PATH, NAME, MANDT, VBELN) select '/usr/sap/dataprovagent/s3', ...' in 136 ms 23 µs .
[314]: numeric overflow: 8000000060 (at pos 115)

Statement 'insert into saphanadb."teste" (PATH, NAME, MANDT ) select '/usr/sap/dataprovagent/s3', ...'
successfully executed in 103 ms 10 µs (server processing time: 93 ms 339 µs) - Rows Affected: 3334

 

 

 

 
werner_daehn
Active Contributor
So it is the VBELN column. What is the datatype in the virtual table for this column? In the source it is a VARCHAR(10). If the target datatype is something like an integer, that would be too small.

You might want to consider using VARCHARs for those columns as well, given that they are left-padded with zeros.
Farid
Active Participant
Silly me ... I thought it was a memory issue; it works like a charm now after changing the column type as you suggested.

This opens new doors.

 

Thank you so much Werner !
werner_daehn
Active Contributor
No prob, we all need pointers in the proper direction once in a while. Speaking of which, if you could let us know about your experiences with writing files, that would be appreciated.

For example, I keep saying that select performance is okay, and insert performance as well, but mass data update/delete/upsert is not. In your example you can at least comment on the insert performance compared to other methods of writing and compared to read performance. My statements are based on the internal workings but need some data points.

 
angelo_sjc
Discoverer
Ian,

 

Is it possible to do this with HANA DB on Business One?
ncrawford_89
Explorer
Does anyone know what the latency is for moving the files over to S3 with the approach outlined above?
former_member607993
Contributor

Hello ian.henry

Great blog, and thanks for the informative content.

I would like to know if the Kafka adapter is fully supported in SAP CPI and how flexible it is (in comparison with the SAP PO Kafka adapter)?

 

Secondly, does SAP CPI support schema registry as of now, and also Avro & JSON conversions?

Is it a tactical, long-term, reliable solution to use via SAP CPI? I am not sure about the license cost, and after adopting the Kafka adapter we should ideally not run into capacity or feature constraints?

 

Looking forward to your valuable thoughts. Thanks in advance! Cheers