Kai_Mueller
Advisor

Introduction to the Document Store (Part 1)


Overview


The SAP HANA JSON Document Store (also known as DocStore or Document Store) is a feature introduced with SAP HANA 2.0 SPS 01. It combines a relational and a document-oriented database into a hybrid technology that stands out for two reasons: it is ACID compliant, and it is fully integrated with SAP HANA in terms of access, querying, and administration.

The embedded Document Store belongs to the group of "NoSQL" databases, more precisely to the document-oriented ones. These storage technologies keep semi-structured documents (mostly JSON or XML) in collections without an explicit structure, which offers high flexibility and compactness.

Besides letting you use a document-oriented database directly inside SAP HANA, without operating another independent database in parallel, the Document Store features full ACID properties. A single transaction may therefore span all stores of SAP HANA and offer the same qualities in terms of atomicity, consistency, transaction isolation, and durability. Because the Document Store is a regular SAP HANA service, the known features Backup & Recovery, System Replication, and Failover work out of the box without additional administrative overhead. SAP HANA allows interactions, especially joins, between collections and relational database objects like tables. Furthermore, complex path expressions make it possible to extract the relevant portions of a document.

Terms


Besides known terms like tables or schemas, this blog and the Document Store documentation use some new terms, which are explained below:

Semi-structured data: Data whose structure is not fixed but is carried within the data itself. In contrast, structured data such as tables has a fixed structure that must be defined before data is inserted.

Collection: A collection holds multiple documents and is assigned to a schema. This is comparable to a table with the difference that a collection doesn't have a predefined structure (column definition).

Document: A document in the Document Store is a semi-structured document in the JSON format. Such a document is like a row in a table. In this analogy the keys of the JSON document are the columns of the table.

Statement Examples


Since the Document Store is used in a relational database context, SQL is the query language. To that end, some new expressions and keywords were introduced to extend SQL for the needs of the Document Store. The following sections illustrate the most commonly used statements. This is only a brief explanation; for details, please refer to the SQL Reference.

Enablement of the Document Store


Since the document store is implemented as an additional store in SAP HANA that comes with its own process, it has to be enabled by the administrator in the SYSTEMDB for a specific tenant.
ALTER DATABASE <database> ADD 'docstore';

Create a collection


This statement creates a new collection called MyCollection in the current schema. This is like CREATE TABLE, but without defining the column characteristics. Users can create as many collections as needed.
CREATE COLLECTION MyCollection;

Drop collection


By using the DROP COLLECTION statement, the whole collection will be deleted. This statement behaves like the known DROP statements.
DROP COLLECTION MyCollection;

Insert


The insert statement of the Document Store takes one JSON document as its argument; there is no optional column definition. The new document must be valid JSON, but documents may have different identifiers or structures.
INSERT INTO MyCollection VALUES({
    "name": 'John Doe',
    "address": {
        "city": 'Berlin',
        "street": 'Street 22'
    }
});

Select


Selecting values from a collection is similar to the selection from a table. Furthermore, it is possible to access nested fields via a path by using the dot operator. The statement is tolerant to non-existing fields.
SELECT "name", "address"."city" AS "city" FROM MyCollection WHERE "name" = 'John Doe';

This returns a result set with the columns name and city where the name equals John Doe.
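
To make the path semantics more tangible, here is a minimal Python sketch that mimics, outside the database, what the dot operator does with nested fields. The helper `get_path` is hypothetical and not part of any SAP client library; it only illustrates the behaviour described above, where a missing field yields a null-like value rather than an error.

```python
from typing import Any


def get_path(document: dict, path: str) -> Any:
    """Follow a dot-separated path through nested dicts.

    Returns None as soon as a step is missing, which mirrors the
    "tolerant to non-existing fields" behaviour of the SELECT statement.
    """
    current: Any = document
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


doc = {"name": "John Doe", "address": {"city": "Berlin", "street": "Street 22"}}

city = get_path(doc, "address.city")    # nested field that exists
missing = get_path(doc, "address.zip")  # nested field that does not exist
```

Here `city` holds the nested value and `missing` is `None`, comparable to a NULL in the SQL result set.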

Update


To change data, use the UPDATE statement. Besides simply updating values, this operation can add or delete fields or replace whole documents.
UPDATE MyCollection SET "address"."city" = 'Munich' WHERE "name" = 'John Doe';

Delete


As the statement name implies, it deletes documents from a collection.
DELETE FROM MyCollection WHERE "name" = 'John Doe';

Conclusion


SAP HANA already provides capabilities for graph, spatial, and hierarchical data, and of course for relational tables. The Document Store adds to this set. Applications built on SAP HANA can therefore use the best of each database technology and, in particular, mix them with the well-known relational world in an intuitive way. The advantages include a flexible, dynamic way of storing data and the ability to use both database technologies at the same time. Overall, it reduces administration overhead, since only one database needs to be maintained, and it opens up new development options.

References


A short introduction into JSON

The Document Store in the SAP HANA Administration Guide

Maintenance of Collections in the SAP HANA Developer Guide (XS Advanced)

Document Store Statements in the SQL Reference

About this series


This blog series about the Document Store is split into two parts. This first part introduces the Document Store with a general overview.

The second part presents a use case that combines relational and document-oriented storage, together with SQL samples.
56 Comments
nabheetscn
Active Contributor
Thanks Kai for the blog. Do I see some similarities between the Document Store and MongoDB? If yes, it would be great if you could highlight in a table what is the same and what is different. It will be easier for everyone to compare and understand.
Kai_Mueller
Advisor
Hello Nabheet,

thank you for your comment. In my opinion it wouldn't make sense to compare the DocStore and MongoDB at this level of detail, since the DocStore is part of SAP HANA, and SAP HANA and MongoDB are completely different databases.

In general you're right: they have the basic idea in common. In both databases you have documents in the JSON format, CRUD operations to interact with them, collections which hold these documents, and schemas/databases which store the collections.

One big difference is the query language: the DocStore uses SQL (with JSON extensions), MongoDB uses JavaScript/JSON. Also, MongoDB can handle XML documents. The DocStore can interact directly with a relational database, which isn't supported by MongoDB.

I hope this answers your question.

Best regards, Kai
mike_howles4
Active Contributor
Thanks, Kai for the post.  I've briefly played with DocStore in HANA Express.  A quick question, would DocStore be an appropriate HANA DB Store to save large text file content?  (In my case, assume it's something like markdown text for a wiki application) Any sort of item length limitations I may run into?
Kai_Mueller
Advisor
Hello Mike,

thank you for your nice words. Are you aware that the SAP HANA JSON Document Store can only save JSON documents? If you want to save markdown, you need to convert it to a JSON document or add it to one.

Best regards, Kai
mike_howles4
Active Contributor
Hey Kai,

Yes, I realize it must be in JSON format, however I'm more asking if there is a size limitation per-item.

Example JSON item:
{
"docTitle" : "somedocument",
"content" : "Some potentially large markdown string"
}
Kai_Mueller
Advisor
Hello Mike,

the current limitation is 8MB per JSON document. Unfortunately I can't tell you exactly how this is measured, but just use 8MB as the upper limit.
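
A client-side guard against this limit could look like the following Python sketch. Since the exact way the limit is measured is not known here, the sketch assumes that 8MB means 8 * 1024 * 1024 bytes of the UTF-8 encoded JSON text; both that interpretation and the helper name `fits_size_limit` are assumptions for illustration, not documented behaviour.

```python
import json

# Assumption: "8MB" is read as 8 MiB of UTF-8 encoded JSON text.
MAX_DOCUMENT_BYTES = 8 * 1024 * 1024


def fits_size_limit(document: dict, limit: int = MAX_DOCUMENT_BYTES) -> bool:
    """Return True if the serialized document stays within the byte limit."""
    encoded = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return len(encoded) <= limit


small = {"docTitle": "somedocument", "content": "Some potentially large markdown string"}
large = {"docTitle": "big", "content": "x" * (9 * 1024 * 1024)}

small_ok = fits_size_limit(small)  # well below the limit
large_ok = fits_size_limit(large)  # content alone exceeds the assumed limit
```

Leaving some headroom below the limit is probably wise, because the server may count the size differently than this sketch does.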

Best regards, Kai
mike_howles4
Active Contributor
Thanks, Kai!  8MB per item is perfectly reasonable in my use case.
seasonfan
Explorer
Hi Kai,

Very nice blog, thanks a lot for sharing.

I have a question here, will SAP ABAP benefit from json document store? Can I combine standard SAP report with this feature?

 

Thanks,

Season
Kai_Mueller
Advisor
Hello Season,

thanks for the kind words. As far as I know you can't use the DocStore directly in ABAP since ABAP only knows tables as storage objects.

But you can try to use collections and JSON in AMDPs with SQLScript as the language.

Best regards,

Kai
0 Kudos
Hi Kai,

Thanks for this nice blog about Document Store (both Part1 and Part2).

I tried using the HANA Document Store from a Java app via JDBC (ngdbc.jar). I am able to run CREATE COLLECTION and INSERT INTO on a collection, but not SELECT, since reading the values through ResultSet does not give me the JSON data from the collection.

Is there any way to get data from Collection in Java (any API or Procedure) ?

Thanks..
Amitanshu
Kai_Mueller
Advisor
Hello Amitanshu,

JSON is returned as a CLOB in the first column, which you can convert to a String quite easily:
// Required imports: java.io.IOException, java.io.Reader, java.sql.*
try (Connection connection = DriverManager.getConnection(url, userName, password)) {
    final String query = "SELECT * FROM COLL_A";

    try (PreparedStatement statement = connection.prepareStatement(query);
         ResultSet resultSet = statement.executeQuery()) {
        // Each row holds one document; the JSON text is a CLOB in column 1
        while (resultSet.next()) {
            System.out.println(clobToString(resultSet.getClob(1)));
        }
    }
}

// Reads the CLOB's character stream into a String
private static String clobToString(final Clob clob) throws SQLException, IOException {
    try (final Reader r = clob.getCharacterStream()) {
        final StringBuilder builder = new StringBuilder();
        int ch;
        while ((ch = r.read()) != -1) {
            builder.append((char) ch);
        }
        return builder.toString();
    }
}

Of course this only converts the CLOB to a string, but you can then use well-known JSON libraries to work with it further.

I hope this helps.

Best regards,

Kai
Thanks, Kai! It's working perfectly. I was getting the CLOB but was reading it via the ResultSet metadata, which is why it was not working as desired. 🙂

 

Best Regards..

Amitanshu
0 Kudos
Hi Kai,

We are wondering whether the SAP HANA DocStore supports the same level of data aggregation capabilities as MongoDB provides, like aggregation pipelines, MapReduce, or bulk operations. If yes, how can we achieve it?

Regards,

CS
Kai_Mueller
Advisor
Hello Chatar,

you can use all SQL aggregation capabilities offered by SAP HANA, but there are no special ones for the SAP HANA DocStore.

Best regards,

Kai
former_member666834
Discoverer
Hello Kai, I tried the following:

CREATE COLLECTION TABLE MyCollection1;
INSERT INTO MyCollection1 VALUES ('{"BooleanField":true}');

However, when I query using:
SELECT * FROM MyCollection1 WHERE "BooleanField" = true

it does not return any rows.

How do we insert and query boolean and date fields in the doc store?

Thanks..
Kai_Mueller
Advisor

Hello Shrikant,

the SELECT can't use the SQL boolean, which is a known limitation. Please use this statement instead: SELECT * FROM MyCollection1 WHERE "BooleanField" = TO_JSON_BOOLEAN(true)

This is also described here: https://help.sap.com/viewer/3e48dd3ad36e41efbdf534a89fdf278f/2.0.04/en-US/680ddaa91a1145efa02a3988b6...

Regarding dates: JSON has no date data type, so you need to do a string comparison here.
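
String comparison only orders dates correctly if the stored format sorts the same way as the dates themselves. The following Python sketch illustrates this with ISO 8601 strings (YYYY-MM-DD), whose lexicographic order matches chronological order. It assumes the comparison is a plain character-by-character comparison and that you control the format when writing the documents; the helper `to_iso` is purely illustrative.

```python
from datetime import date


def to_iso(d: date) -> str:
    """Serialize a date as YYYY-MM-DD so that string order equals date order."""
    return d.isoformat()


earlier = to_iso(date(2021, 3, 9))   # "2021-03-09"
later = to_iso(date(2021, 11, 2))    # "2021-11-02"

# ISO strings compare in chronological order.
iso_order_ok = earlier < later

# A day-first format does not: March 9 sorts after November 2 as a string.
day_first_misorders = "09.03.2021" > "02.11.2021"
```

With ISO-formatted values, a range condition on the string field then behaves like a date range.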

Best regards, Kai

former_member666834
Discoverer
Thanks Kai, this is helpful.
ulasalasreenath
Discoverer
Hello Kai, Can we create an index for a collection?
Kai_Mueller
Advisor
Hello Ulasala,

no, currently not. But this feature request is already in the backlog.

Best regards,

Kai
jbaysdon
Discoverer
Hi, Kai.  Have you ever created a Spring data repository for a HANA document store?  If so, can you describe it?  Extending JpaRepository doesn't seem to be a good fit.
Kai_Mueller
Advisor
Hello Joseph,

I'm sorry but I never tried this and I don't know how this could work. What I guess is, that JPA/Spring etc. are not designed to work with JSON documents but I'm not an expert here.

Maybe it could help if you figure out how to support other well known document-oriented databases in these frameworks and then try to adapt this for the DocStore.

Best regards,

Kai
jbaysdon
Discoverer
Thanks, Kai.  There is repository support in Spring for MongoDB (org.springframework.data.mongodb.repository.MongoRepository).  I'll give it a try.
kevindass
Participant

Kai,

I followed the SAP Help documentation, but I get the error below in an AMDP class. However, I am able to do all CRUD operations in the SQL console in HANA Studio.

“DEMOCOLLECTION” is unknown. ABAP objects and DDIC objects must be declared in the METHOD statement. Local names must start with “:” here.

This error occurs when accessing from a system schema (for example SAPCAR or SAPSLT), but it works fine with a user schema (for example KDASS).

Regards,

Kevin Dass

Kai_Mueller
Advisor
Hello Kevin,

I've seen you asked the same question also here. I guess the question is the better place to ask. I'm not an ABAP expert, but I guess you need to make the collection visible to ABAP somehow, e.g. in the DDIC. Please follow up in the question, since somebody there has already figured out how to do that.

Best regards, Kai
Bruhn
Explorer
Hi Kai,

We are trying to apply the doc store as a quick ingestion mechanism for the result of REST calls. Then later on parsing the json for the data which are needed in these particular use cases. I have a couple of questions:

a) During insert, if the JSON is invalid, SQL error 8 "Invalid Argument" is raised. I cannot write an exit handler for this error code, as it is not supported. Is there any way to validate the JSON stream prior to saving?

b) When using JSON_TABLE we are able to use nested paths for dealing with arrays in a "generic way". Is something similar possible when using the doc store directly?

 

/Bruhn
Kai_Mueller
Advisor
Hi Michael,

a) not that I know. HANA expects that the JSON you send is valid

b) no, this is not possible but we are aware that this would be useful
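
Regarding a): one option is to validate on the client before the INSERT is sent. The Python sketch below uses the standard `json` module for this. The helper `validate_document` is hypothetical, and the rule that the top-level value must be a JSON object is an assumption based on the observation elsewhere in this thread that arrays cannot be inserted directly. The server's parser may also be stricter or looser than Python's, so this check reduces errors rather than ruling them out.

```python
import json
from typing import Optional


def validate_document(text: str) -> Optional[str]:
    """Return None if text parses as a JSON object, otherwise a short error message."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(parsed, dict):
        # Assumption: only a single JSON object is accepted as a document.
        return "top-level value is not a JSON object"
    return None


ok = validate_document('{"name": "rajarshi", "place": {"city": "dallas"}}')
truncated = validate_document('{"name": ')
array = validate_document('[{"a": 1}, {"a": 2}]')
```

A caller would skip or log any payload for which the function returns a message and only insert the rest.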

Best regards, Kai
former_member697010
Discoverer

Hello Kai,

Any release date for this feature? I find it hard to use the document store without it.

It would be also great if we could tier the collection using NSE.

Kai_Mueller
Advisor
Hello Leonardo,

I'm sorry but I'm not aware of any date which I can easily communicate externally, especially inside a blog.

Tiering with NSE is a completely different story, so again, as far as I know, it's in the backlog.

Best regards, Kai
rajarshi_muhuri
Active Participant
I seem to be able to import single JSON documents, but not an array of JSON objects.

 

 

 
[
{
"name": "rajarshi",
"lname": "muhuri",
"place": {
"city": "dallas",
"state": "tx"
}
},
{
"name": "rishi",
"lname": "muhuri",
"place": {
"city": "dallas",
"state": "tx"
}
}
]

 

I can only insert when I break it into

 
{
"name": "rajarshi",
"lname": "muhuri",
"place": {
"city": "dallas",
"state": "tx"
}
}

 

also for python insert , is this the only way ?
x = 'json as string'

SQL_CMD = 'INSERT INTO SYSTEM.COLL2 VALUES (?)'
cursor.execute(SQL_CMD,x)

 
Kai_Mueller
Advisor
Hello Rajarshi,

this is a normal SQL limitation. The SQL interface normally can only insert one row at once which applies to documents, too (as far as I know).

Normally all clients support batch/bulk insert which can be used for documents too. Please refer to executemany here.

Best regards,

Kai
rajarshi_muhuri
Active Participant

Hi Kai ..

 

thanks for clarifying

For normal data, I move the data into tuples.

but not sure how to do that for document store so that I can call executemany().

currently I am doing this ,  

for _l in _list:
    x = json.dumps(_l)
    SQL_CMD = 'INSERT INTO TABLE_A VALUES (?)'
    cursor.execute(SQL_CMD, x)

 

 

Kai_Mueller
Advisor
Hi,

you call executemany as you would do for a table
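import json
from hdbcli import dbapi  # assuming the SAP HANA Python client (hdbcli)

con = dbapi.connect(...)
cur = con.cursor()

# One parameter tuple per document; each document is passed as a JSON string
docs = [{"a": 2}, {"b": 2}]
params = [(json.dumps(doc),) for doc in docs]

cur.executemany("INSERT INTO COL VALUES(?)", params)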
rajarshi_muhuri
Active Participant

thanks, that helped 

 

0 Kudos

Hello Kai,

 

Does JDBC PreparedStatement work on the HANA document store queries?

It gives me the following error when using the java PreparedStatement:

JDBCDriverException@195 "com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [7]: feature not supported: please specify the types for parameter: line 1 col 67 (at pos 66)

On the other hand, it works with the plain SQL "Statement" in java.

The query which I am using for my prepared statement is as follows:

SELECT * FROM "COLL" WHERE "_id"."tenantId" = ?

Could you please guide me here?

 

Best,

Khavya

Kai_Mueller
Advisor
Hello,

can you please provide a short example in Java?

Thanks, Kai
0 Kudos

Sure.

I am trying to use PreparedStatement and set the parameters in the query as follows:

try(PreparedStatement pst = conn.prepareStatement("SELECT * FROM \"COLL\" WHERE \"_id\".\"tenantId\" = ? ")) {
pst.setString(1, tenant);
rs = pst.executeQuery();
List<Model> models = buildModelsFromResult(rs);
return models;
}
Kai_Mueller
Advisor
Thanks a lot. Bad news: the document store doesn't support parameters/prepared statements if you are running on SAP HANA on-premise or SAP HANA Service.

See here: The Document Store does not support SQL parameters. This may affect the application's ability to prevent SQL injection.

HANA Cloud supports this feature (see here): Document Store supports the usage of SQL Parameters.
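
Where parameters are not available, values have to be embedded in the statement text, which is exactly where the injection risk comes from. The Python sketch below shows the usual mitigation of doubling single quotes inside string literals. The helpers `sql_string_literal` and `build_tenant_query` are hypothetical, and the sketch assumes that doubling single quotes is sufficient for string literals in Document Store queries. It is a stopgap, not an equivalent of real parameter binding, and is best combined with strict validation of the input (for example an allow-list of tenant IDs).

```python
def sql_string_literal(value: str) -> str:
    """Wrap value in single quotes, doubling embedded single quotes (standard SQL escaping)."""
    return "'" + value.replace("'", "''") + "'"


def build_tenant_query(tenant: str) -> str:
    """Build the query from the example above with the tenant embedded as a literal."""
    return 'SELECT * FROM "COLL" WHERE "_id"."tenantId" = ' + sql_string_literal(tenant)


plain = build_tenant_query("tenant42")
# An input that tries to break out of the literal stays inside it after escaping.
hostile = build_tenant_query("x' OR '1'='1")
```

The resulting string can then be executed with a plain statement, as in the working non-prepared variant mentioned above.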

 
rajarshi_muhuri
Active Participant

When I use the doc store within SQL anonymous block , select delete etc works

DO BEGIN

DECLARE ID NVARCHAR(20) ;

ID = '1';

SELECT * FROM XCOLLECTION X WHERE X."SID" = :ID ;

END

 

But the same code does not work for select insert or delete when I put it inside a stored procedure or function. 

I also tried explicit  CAST(:ID as NVARCHAR)

Kai_Mueller
Advisor
Hi,

so on HANA SPS05 it works fine:
CREATE COLLECTION XCOLLECTION;
INSERT INTO XCOLLECTION VALUES({"SID": '1'});

CREATE PROCEDURE foo AS BEGIN
    DECLARE ID NVARCHAR(20);
    ID = '1';
    SELECT * FROM XCOLLECTION X WHERE X."SID" = :ID;
END;
CALL foo();

Result is {"SID": "1"}. Can you please share your whole scenario and the HANA version?

Thanks, Kai
rajarshi_muhuri
Active Participant

My mistake, I did not clarify the nuances.

It works when I assign a hardcoded value, but normally we read the value from a table and assign it, and that's where it fails, e.g.

 

create COLUMN TABLE PROG
(
ID nvarchar(16)
);


INSERT INTO PROG VALUES (1 );
alter PROCEDURE foo AS BEGIN
DECLARE ID INTEGER ;
SELECT ID INTO ID FROM PROG ;
SELECT * FROM XCOLLECTION X WHERE X."SID" = :ID ;
END;

CALL FOO()

The above gives

Error: (dberror) [7]: feature not supported: "CDE_4"."FOO": line 15 col 1 (at pos 88): WHERE clause with unsupported expressions on collection tables

 

while

 

alter PROCEDURE foo AS BEGIN
DECLARE ID INTEGER ;
SELECT ID INTO ID FROM PROG ;
SELECT * FROM XCOLLECTION X WHERE X."SID" = ID ;
END;

 

returns nothing

 

 

Kai_Mueller
Advisor
That's because in the second "ALTER PROCEDURE" you still need to use ":ID" instead of "ID". However, returning nothing is weird and misleading. I'll raise this internally.
mkemeter
Product and Topic Expert
The second behaviour can be explained: If there is no colon, "ID" gets interpreted as an attribute of the collection. If a document does not contain the attribute, it returns "null". So, the where-condition is probably equivalent to "WHERE X.SID IS NULL" and thus returns no results.
Kai_Mueller
Advisor
Thanks for the explanation!
mkemeter
Product and Topic Expert

I just verified on my HANA Cloud instance that the following code works (EDIT: it does not; see my correction further down):

CREATE COLUMN TABLE PROG
(
ID nvarchar(16)
);
INSERT INTO PROG VALUES (1 );

CREATE COLLECTION XCOLLECTION;

DO BEGIN
DECLARE ID INTEGER ;
SELECT ID INTO ID FROM PROG ;
SELECT * FROM XCOLLECTION X WHERE X."SID" = :ID ;
END;

 

Which HANA version do you use?

rajarshi_muhuri
Active Participant
Thanks for the answer . I am using HANA  2.0 SP 5  on Premise version . It does not work there .

 

For the moment, I am using the work-around
BEGIN
DECLARE SQL_STMT NVARCHAR(256);
DECLARE ID NVARCHAR(16);
SELECT ID INTO ID FROM PROG;
SQL_STMT = 'DELETE FROM XCOLLECTION C WHERE C."ID" = '||''''||:ID||'''';
EXECUTE IMMEDIATE :SQL_STMT ;
END ;
mkemeter
Product and Topic Expert
I just checked a second time and I have to take my statement back, as I made a mistake when testing previously: the usage of variables is also not supported on HANA Cloud. So, the code won't work there either.

For the moment, your workaround seems like a viable alternative. Another option would be to use a join between the relational table and the collection as described in this blog.
rajarshi_muhuri
Active Participant
Hi Mathias

 

Thanks for the blog link , and I saw the bind_as_value statement , and that works

 
DO BEGIN
DECLARE ID nvarchar(16) ;
SELECT ID INTO ID FROM PROG ;
SELECT * FROM XCOLLECTION X WHERE X."SID" = bind_as_value(:ID) ;
END;
mkemeter
Product and Topic Expert
Thanks for pointing this out. I didn't realize it describes exactly your scenario 😊
0 Kudos

Hello Kai,

Can we create an index for a collection now?
Could you please point me to the documentation if it is supported now?

Thanks, Gaurish

Kai_Mueller
Advisor
Hello Gaurish,

In HANA Cloud you can now create indexes: https://help.sap.com/docs/HANA_CLOUD_DATABASE/f2d68919a1ad437fac08cc7d1584ff56/ad9063aa6b6d479faac18bacb6caf145.html?locale=en-US

Best regards, Kai