SAP is proud that SAP HANA is the first major database platform that is specifically optimized for persistent memory.
Most applications rely solely on the operating system for memory allocation and management. SAP HANA also has to allocate its memory from the operating system, just like every other application. Once the memory is allocated, however, SAP HANA exerts a much higher degree of control over how it is managed. The reason is simple: it allows for a much higher degree of optimization, which is especially important for an in-memory database like SAP HANA. This paradigm extends seamlessly to persistent memory.
In other words: SAP HANA knows which data structures benefit most from persistent memory. SAP HANA automatically detects persistent memory hardware and adjusts itself by automatically placing these data structures on persistent memory, while all others remain in DRAM.
Given these characteristics, an excellent candidate for placement in persistent memory is the column store main. It is heavily optimized in terms of compression, leading to a very stable – non-volatile – data structure. The main store typically contains well over 90% of the data footprint in most SAP HANA databases, which means it offers a lot of potential. Furthermore, it is reconstructed only rarely, during the delta merge, a process that is triggered only after the number of changes to a database table has reached a certain threshold. For most tables, a delta merge does not happen more than once a day.
This design fits SAP HANA’s architecture perfectly. The separation of write-optimized delta and read-optimized main stores and the characteristics of both are a perfect match to the respective strengths of DRAM and persistent memory.
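The split between a write-optimized delta and a read-optimized main can be illustrated with a small toy model. This is only a sketch under stated assumptions, not SAP HANA's actual implementation: the class `ToyColumn`, its `merge_threshold` parameter, and the simple sorted-dictionary encoding are all hypothetical and chosen for illustration. The delta list stands in for the DRAM-resident part, and the immutable tuples stand in for the main store that would be a candidate for persistent memory.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumn:
    """Hypothetical single-column table with a delta and a main store."""

    merge_threshold: int = 4   # assumed trigger: number of buffered changes
    main_values: tuple = ()    # sorted dictionary of distinct values (read-optimized)
    main_codes: tuple = ()     # one dictionary index per row (compressed form)
    delta: list = field(default_factory=list)  # write-optimized buffer (DRAM)
    merges: int = 0            # how often the main store was rebuilt

    def insert(self, value):
        # Writes only ever touch the delta; the main store stays untouched.
        self.delta.append(value)
        if len(self.delta) >= self.merge_threshold:
            self.delta_merge()

    def delta_merge(self):
        # Rebuild the main store from the old main rows plus the delta rows.
        rows = [self.main_values[c] for c in self.main_codes] + self.delta
        dictionary = tuple(sorted(set(rows)))
        index = {value: i for i, value in enumerate(dictionary)}
        self.main_values = dictionary
        self.main_codes = tuple(index[value] for value in rows)
        self.delta = []
        self.merges += 1

    def scan(self):
        # Reads combine the decoded main rows with the not-yet-merged delta rows.
        return [self.main_values[c] for c in self.main_codes] + list(self.delta)


col = ToyColumn(merge_threshold=4)
for v in ["b", "a", "b", "c", "a"]:
    col.insert(v)
```

In this sketch the main store is rewritten once, when the fourth row arrives, while the fifth row stays in the delta. The point of the design is visible even at this scale: the large structure changes rarely and in bulk, whereas the small one absorbs every write.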
Data loading at startup
Something quite unique to SAP HANA is the consistent implementation of its “in-memory first” paradigm. All database operations are performed directly on the in-memory data structures, instead of first applying everything on persistent block-based storage (e.g., SSDs) and then simply replicating the changes to an in-memory cache, as many legacy databases in the market do. This means that a table must be loaded into main memory before any operation, read or write, can be performed on it. For the vast majority of tables, those in SAP HANA's column store, this happens asynchronously after a restart of the database. The database is fully available during that time, but queries on tables that are not yet fully loaded might experience reduced performance.
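The behavior described above, a database that is available immediately while tables are still being loaded in the background, can be sketched roughly as follows. The class `ToyDatabase` and its methods are hypothetical, and loading a missing table on demand when a query touches it is an assumption made for this illustration, not a statement about SAP HANA internals.

```python
import threading


class ToyDatabase:
    """Hypothetical database that loads tables into memory after a restart."""

    def __init__(self, storage):
        self._storage = storage      # table name -> rows on block storage
        self._memory = {}            # table name -> rows loaded into memory
        self._lock = threading.Lock()
        self.on_demand_loads = 0     # queries that had to load their table first

    def _load(self, name):
        # Load a table once; return True only if this call did the loading.
        with self._lock:
            if name in self._memory:
                return False
            self._memory[name] = list(self._storage[name])
            return True

    def start_background_load(self):
        # Asynchronous load after restart: the caller is not blocked.
        def worker():
            for name in self._storage:
                self._load(name)

        thread = threading.Thread(target=worker)
        thread.start()
        return thread

    def query(self, name):
        # A query on a table that is not loaded yet pays the load cost itself.
        if self._load(name):
            self.on_demand_loads += 1
        return self._memory[name]


db = ToyDatabase({"orders": [1, 2], "items": [3]})
early_result = db.query("items")      # arrives before the background load
loader = db.start_background_load()
loader.join()                         # wait only so the example is deterministic
late_result = db.query("orders")      # table is already in memory by now
```

The early query has to load its table before it can answer, which corresponds to the reduced performance mentioned above. The later query finds the table already in memory.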
With pure DRAM, this initial load happens every time the database is started, that is, after every planned or unplanned outage. Systems that cannot tolerate the impact on performance often use system replication to circumvent this and, when a restart is required, switch the workload to the replicated instance. The big disadvantage is that this requires an entire second set of hardware, complete with CPUs, network, main memory and storage.
With persistent memory, the initial load of the column store is no longer necessary. Column store data is retained across database and even server restarts, which decreases the loading time significantly.
At Sapphire 2018 in Orlando, SAP co-founder and chairman Hasso Plattner presented the very first numbers on the improvement in startup times with persistent memory. Based on a 6 TB instance of SAP HANA, startup time including data loading improved by a factor of 12.5, from 50 minutes with regular DRAM to just 4 minutes with persistent memory. This means that planned business downtime, for example due to an upgrade, can shrink to a mere few minutes instead of almost an hour. Reducing business downtime by this magnitude is otherwise only possible by employing measures like SAP HANA system replication.
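The quoted factor follows directly from the two startup times. The short calculation below uses only the figures given above; the variable names are illustrative.

```python
dram_minutes = 50   # startup including data load with regular DRAM
pmem_minutes = 4    # startup with persistent memory

speedup = dram_minutes / pmem_minutes               # improvement factor
saved_minutes = dram_minutes - pmem_minutes         # downtime avoided per restart
reduction_pct = 100 * saved_minutes / dram_minutes  # relative reduction in percent

print(f"{speedup}x faster, {saved_minutes} minutes saved ({reduction_pct:.0f}% less)")
```

For this 6 TB example, that is 46 minutes of startup time saved per restart, a reduction of 92%.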