
As I mentioned in my blog last year, the SQL on Files capability in SAP HANA Cloud has unlocked new potential by providing direct read-only SQL access to files stored in SAP HANA Cloud, data lake Files since SAP HANA Cloud QRC 03/2024.
Starting with SAP HANA Cloud QRC 04/2024, this capability has been further extended to support direct access to the most common external object storages listed below, in addition to SAP HANA Cloud, data lake Files.
Although the method for creating a remote source varies by cloud service provider, the process for creating virtual tables, the supported file formats, and the read-only access remain the same as when using SQL on Files with SAP HANA Cloud, data lake Files. Since these aspects are already explained in my previous blog, let me focus on the differences in this one.
Additionally, as you may have already noticed from the above diagram, read-only access to external Delta Sharing is also supported from SAP HANA Cloud QRC 04/2024 with a dedicated Delta Sharing adapter.
And when it comes to CSV formats, two new options are now included to provide better flexibility and reduce the effort required when reading files in CSV format, starting with SAP HANA Cloud QRC 04/2024.
Another piece of good news is that the existing limitation on Delta tables, which currently supports only reader version 1 for both direct file access and Delta sharing, has been lifted starting with SAP HANA Cloud QRC 01/2025.
Alright, with these points in mind, let's dive deeper into the details with some examples.
As highlighted, starting from SAP HANA Cloud QRC 04/2024, the SAP HANA Database SQL on Files feature has begun to support direct read-only access to files (CSV, Parquet, and Delta tables) stored in external object storage.
The supported external object storage options are, as mentioned above, Amazon S3, Azure Blob Storage or Azure Data Lake Storage (ADLS) Gen2, and Google Cloud Storage (GCS).
For Amazon S3 and GCS, both path-style and virtual-host-style endpoints are supported (GCS also supports path-style regional endpoints). For Azure, both standard and DNS endpoints are supported.
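To illustrate the two styles for Amazon S3, the same bucket (a hypothetical `my-bucket` in region `eu-central-1`) can be addressed either way; the URL formats below follow AWS's documented endpoint conventions:

```
virtual-host-style: https://my-bucket.s3.eu-central-1.amazonaws.com
path-style:         https://s3.eu-central-1.amazonaws.com/my-bucket
```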
The methods for creating a remote source differ according to the specific standards and integrations established by each cloud service provider. The example below shows how to create a remote source to Amazon S3.
-- create a remote source to Amazon S3
CREATE REMOTE SOURCE <remote_source_name> ADAPTER "file" CONFIGURATION '
provider=s3;
endpoint=<s3_endpoint>;'
WITH CREDENTIAL TYPE 'PASSWORD' USING 'user=<access_key>;password=<secret_key>';
As described, the file adapter is still in use but now with the parameter provider=s3.
A remote source can also be created via the SAP HANA Database Explorer UI, and the example below shows how to create a remote source to Google Cloud Storage via the UI.
Please refer to the links below for further details on creating a remote source for each external object storage.
Although the process for creating a remote source differs, as mentioned above, the method to create a virtual table is the same as with SAP HANA Cloud, data lake Files and can be found in my previous blog.
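As a quick illustration, the sketch below reuses the path-based `AS CSV` syntax shown later in this post to define a virtual table over files in the external bucket; the remote source name, path, and columns are hypothetical placeholders:

```sql
-- create a read-only virtual table over CSV files in external object storage
-- (remote source name, path, and columns are hypothetical)
CREATE VIRTUAL TABLE TEST_SCHEMA.VT_S3_SALES (
  ID INT,
  AMOUNT DECIMAL(15, 2)
) AT <remote_source_name>."/path/to/csv/files"
AS CSV;
```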
Also, starting from SAP HANA Cloud QRC 04/2024, SAP HANA Database SQL on Files has started to support external Delta Sharing.
This means you can create a remote source to Databricks Delta Sharing and also create read-only virtual tables by pointing to the tables exposed by Delta Sharing protocol.
-- create a remote source to Delta Sharing
CREATE REMOTE SOURCE <remote_source_name> ADAPTER "deltasharing" CONFIGURATION '
provider=databricks;
endpoint=<endpoint in config.share>;'
WITH CREDENTIAL TYPE 'OAUTH' USING 'access_token=<bearerToken in config.share>';
As described, a dedicated deltasharing adapter is used with the parameter provider=databricks, instead of the file adapter.
Similar to external object storage access, the remote source to Databricks Delta Sharing can also be created via the SAP HANA Database Explorer UI.
For this read-only access, the Delta tables must first be shared via Databricks Delta Sharing. After creating a new recipient with a token and a token lifetime, download the credential file (config.share). Please also note that, unlike direct file access in SQL on Files, virtual tables are created here the same way as with other SAP HANA smart data access (a.k.a. SDA) adapters.
-- create a virtual table
CREATE VIRTUAL TABLE VT_TAB AT <remote_source_name>.<share_name>.<schema_name>.<table_name>;
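Once created, the virtual table behaves like any other read-only table in queries, for example:

```sql
-- query the Delta Sharing virtual table like a local table (read-only)
SELECT TOP 10 * FROM VT_TAB;
```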
Although Delta tables in files can be accessed directly with the file adapter, the dedicated deltasharing adapter offers more flexibility if Delta Sharing is already in use and you want a unified access protocol within your organization.
Please also refer to the link below for further details.
For better usability and reduced effort, two new options were introduced with SAP HANA Cloud QRC 04/2024.
One is SKIP FIRST N ROWS, which skips the first N rows when creating a virtual table over CSV files. Note that if a file contains fewer than N rows, an error is thrown.
-- skip the first 5 rows
CREATE VIRTUAL TABLE TEST_SCHEMA.TAB1 (
A INT,
B INT
) AT TEST_REMOTE."/path/to/csv/files1"
AS CSV SKIP FIRST 5 ROW;
This is useful if there are multiple header rows, or if you intentionally ignore certain rows for any business reasons.
Another option is COLUMN LIST IN FIRST ROW, which is similar to the above but explicitly treats the first row as the header and verifies that the virtual table's field names match the field names in the file header.
-- recognize the first row as the header
CREATE VIRTUAL TABLE TEST_SCHEMA.TAB2 (
A INT,
B INT
) AT TEST_REMOTE."/path/to/csv/files2"
AS CSV COLUMN LIST IN FIRST ROW;
In this case, field names are compared case-sensitively, so if a field name of the virtual table does not match the corresponding field name in the file header, an error is thrown.
In addition, the VIRTUAL_TABLE_FILES system view has been extended to include these two new configuration values. Please also refer to the link below for further details.
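For example, assuming you have read privileges on the system view, you can list the files backing your virtual tables together with the configured options:

```sql
-- inspect the files and CSV options registered for SQL on Files virtual tables
SELECT * FROM VIRTUAL_TABLE_FILES;
```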
As briefly mentioned above, before QRC 01/2025, if the reader version of the Delta table was higher than 1, a virtual table could not be created. In most use cases, supporting reader version 1 was sufficient because, when converting Parquet files into Delta tables, the lowest possible reader version is generally chosen.
Nevertheless, if a specific data type like TimestampNtz (Timestamp without timezone) is included in your data or specific Delta table features like the deletion vector are enabled, a higher reader version like 3 is required, and SQL on Files could not read these before.
However, starting with SAP HANA Cloud QRC 01/2025, SQL on Files has begun supporting reader versions 2 and 3, meaning the existing limitation has now been fully lifted.
In conclusion, the enhanced capabilities of SAP HANA Database SQL on Files in SAP HANA Cloud provide more flexible and efficient ways to manage your data. With the ability to access vast volumes of data from well-managed object storages, you can fully leverage the robust features of the SAP HANA Cloud as a platform. This includes in-memory processing, elasticity, the Predictive Analysis Library (PAL), machine learning capabilities, and more.
Furthermore, these advancements facilitate seamless data integration with SAP applications like S/4HANA, empowering you to derive even greater value by combining your object storage data with data from SAP applications. If you're interested in reading data from SAP applications like S/4HANA in SAP HANA Cloud, please refer to my blog, Taking Data Federation to the Next Level: Accessing Remote ABAP CDS View Entities in SAP HANA Cloud.
As we remain committed to innovation, stay tuned for upcoming updates that will continue to expand and enrich your SAP HANA Cloud experience.