-------------------------------------------------------------------------------------------------------------------------------------
ALWAYS REFER TO THE LATEST GUIDES FROM IBM AND SAP WHEN IMPLEMENTING THIS SOLUTION.
-------------------------------------------------------------------------------------------------------------------------------------
In my last blog, I highlighted the different persistent memory approaches for SAP HANA from two vendors, Intel and IBM. In this blog, I will elaborate on the configuration steps that need to be followed for the SAP HANA database to use IBM's vPMEM.
Before that, I would like to give a technology overview of persistent memory, followed by IBM's adoption of Virtual Persistent Memory (vPMEM).
NOTE: SAP documentation refers to persistent memory as non-volatile memory (NVM), while IBM documentation often uses the term storage class memory (SCM). The term non-volatile DIMM (NVDIMM) persistent memory is also used. You will therefore find different terms referring to the same persistent memory.
IBM vPMEM - Technology Overview
Virtual Persistent Memory (vPMEM) is an enhancement of IBM's advanced virtualization platform (PowerVM) that introduces the ability to configure persistent volumes using the conventional DRAM modules available in every IBM Power9 system.
Consequently, no special or additional hardware components are required. Only a firmware upgrade must be performed to enable vPMEM on Power9 systems.
Since vPMEM is built on DRAM technology, it has the same performance characteristics as DRAM. This enables IBM Power9 users to significantly speed up restarts of SAP HANA during planned maintenance as well as unplanned outages, without compromising performance in production.
Let us now look at how operating systems provide persistent memory services and how application software can use them. The PowerVM and Linux on Power implementation is shown in the figure below -
Source: Planning & Implementation Guide - SAP HANA & PowerVM vPMEM
At the bottom of the figure, the PowerVM hypervisor presents the persistent memory devices to the operating system in a technology-agnostic manner. Depending on the physical device capabilities, the PowerVM hypervisor may also be able to virtualize persistent memory devices and segment them into smaller-capacity volumes, which can be assigned to different logical partitions (LPARs).
Once persistent memory is assigned to an LPAR, the Linux operating system presents the individual devices as generic non-volatile DIMM devices, /dev/nmem<#>. The management tool ndctl is used to interface with the nvdimm driver to configure and provision these "nvdimm" devices into regions, namespaces, and persistent memory volumes.
Region: A region is a grouping of one or more nvdimm devices. Commonly, a region is formed from devices on the same NUMA node.
Namespace: A namespace is a partition of a region, either whole or in part. Namespaces are associated with a mode, which enables different access methods to the persistent memory. Four modes are available -
- fsdax (filesystem direct access) - Persistent memory is presented as a block device (/dev/pmem<#>) and supports XFS and EXT4 filesystems. This mode provides direct access (DAX) support, which bypasses the Linux page cache and performs reads and writes directly to the device. For direct access through load and store instructions, the device can be mapped into the address space of the application process with mmap(). The default mode of a namespace is fsdax.
- devdax (device direct access) - Persistent memory is presented as a character device (/dev/dax<#>.<#>). This mode also provides DAX support.
- sector - Persistent memory is presented as a block device (/dev/pmem<#>s) and supports any filesystem. This mode is useful for applications which are not persistent memory aware.
- raw - This mode provides a memory disk with no DAX support.
For SAP HANA, only fsdax mode is employed.
The figure below depicts an example of the fsdax stack exposing nvdimm devices to applications.
Source: Planning & Implementation Guide - SAP HANA & PowerVM vPMEM
vPMEM volumes are managed on the system Hardware Management Console (HMC). They are defined per LPAR and are not directly sharable or transferable to other LPARs.
When persistent memory is created for an LPAR, vPMEM volumes are specified either to be striped across NUMA nodes or to be contained in a single NUMA node. As HANA is a NUMA-aware application, vPMEM should be provisioned on a NUMA node basis so that the NUMA node associativity of each volume is clearly defined. That is, the PowerVM hypervisor should allocate DRAM exclusively from one NUMA node to serve as a single vPMEM volume.
NOTE: vPMEM volumes cannot be resized. Instead, they are deleted and new vPMEM volumes can be created with the desired size.
HANA Adoption of Persistent Memory
Background: The HANA in-memory database keeps all its data in main memory (DRAM). Disk storage is still required for permanent persistence and to allow restarts after a power failure. The idea behind HANA was to integrate transactional and analytical workloads within the same database management system. This has been achieved with a columnar engine that exploits modern hardware, compression of database content and maximum parallelization. In the transactional workload of SAP business applications, more than 80% of all statements are read accesses. The remaining write operations consist mainly of inserts, a few updates, and very rare deletes.
HANA’s relational in-memory columnar store is tuned for such enterprise workloads. The trade-off favors the large majority of read accesses through the choice of a columnar database. The performance impact on the row-oriented smaller transactional workload is acceptable due to the fast in-memory access.
HANA compresses its column store tables with extensive compression techniques to optimize the use of available memory. But compression pays off only when the data does not change often, because the computational effort spent on compressing the data is then spread over a longer period. To avoid the cost of recompressing every time data changes, HANA columnar tables are divided into two table fragments: Main and Delta. Main fragments are reader-friendly: they contain most of the data and change rarely (they use sorted dictionaries, N-bit and other compression techniques). Delta fragments, on the other hand, are writer-friendly and contain the remaining, smaller part of the data.
Without persistent memory, both Main and Delta table fragments are stored in main memory (DRAM). With IBM's vPMEM in place, the HANA database detects the presence of persistent memory and stores the Main fragments of the column store in vPMEM, while the Delta fragments and the Row Store stay in DRAM.
As the main fragments represent 95% of the database data and are now stored in persistent memory, significantly less time is spent loading data from traditional persistent storage into memory every time HANA is restarted or the server is rebooted. HANA no longer has to wait for data to be loaded, as most of it is already in memory. This enables rapid restart and recovery with the full performance benefits of SAP HANA.
Source: Planning & Implementation Guide - SAP HANA & PowerVM vPMEM
SAP HANA requires persistent memory to be configured in fsdax mode, as discussed earlier. In addition, to take advantage of SAP HANA's NUMA optimizations, the vPMEM volumes must be configured per NUMA node, i.e. each DAX-enabled filesystem mounted on the server should be backed by a vPMEM volume from a single NUMA node.
As the figure above shows, instead of standard file I/O read and write calls, SAP HANA employs memory-mapped file I/O. By mapping the files directly into its address space, the application can use load and store CPU operations to manipulate its data.
Sizing vPMEM for SAP HANA
Sizing SAP HANA with DRAM and persistent memory follows the same rules as sizing SAP HANA with a pure-DRAM configuration. The same sizing tools should be used to ensure that the hardware has sufficient capacity. Always refer to SAP Note 2786237 - Sizing SAP HANA with Persistent Memory for up-to-date information.
- SAP HANA quicksizer: Link
- Sizing report for SoH and S/4HANA: SAP Note 1872170
- Sizing report for BWoH and BW/4HANA: SAP Note 2296290
- SQL reports attached to SAP Note 2786237 for an overview of memory usage in a current system
Note that the ratio restrictions between DRAM and PMEM documented in the above SAP Note do not apply to the Power platform.
To get a better understanding, let us take an example where the business wants to convert its existing ECC system to S/4HANA. To obtain the HANA sizing, we run the sizing report for S/4HANA mentioned above; the output of the report looks something like the following -
-------------------------------------------------------------------------------------------------------------------------------------
If you have applied the latest corrections to the sizing report, you will get the following memory sizing calculations:
- HANA Sizing with DRAM
- HANA Sizing with DRAM + Persistent Memory
If you are planning to implement any of the persistent memory technologies available in the market today (IBM's vPMEM or Intel's DCPM), the sizing guidelines for HANA remain the same.
As IBM's vPMEM is an enhancement of DRAM technology, we need to consider the entire 6867.6 GB as DRAM. On top of this DRAM, you can configure the 3284.9 GB persistent volume described in the sizing report, which will then be mounted as a DAX-enabled filesystem on your HANA server.
In Intel's configuration, you need to follow the DRAM to PMEM ratio described in SAP Note 2786237 and configure your server accordingly. For more details on Intel's and IBM's persistent memory, refer to my earlier blog.
Implementing vPMEM with SAP HANA
Prerequisites
The following list details the minimum hardware and software levels required to configure and implement SAP HANA with IBM PowerVM vPMEM -
- IBM POWER9 System with Firmware FW940
- IBM Hardware Management Console (HMC) Virtual Appliance v9.1.940
- SAP HANA 2.0 SPS 04 Revision 44 (if you have enabled high isolation, please use Revision 46 or later, because on Revision 44 a tenant database with high isolation does not start; I describe this issue in detail below)
- SUSE Linux Enterprise Server 15 SP1
- Kernel version 4.12.14-197.21.1
- ndctl version 64.1-3.3.1
To run the SAP Hardware and Cloud Measurement Tool (HCMT) with vPMEM, the minimum tool version is SAP HANA 2.0 SPS 04 Revision 46.
NOTE: Always refer to the latest version of SAP Note 2188482 - SAP HANA on IBM Power Systems: Allowed Hardware for up-to-date information on vPMEM on Power9.
Configuring LPAR profile settings for vPMEM
Refer to the vPMEM SAP HANA Whitepaper for more details on how to configure LPAR profile settings for vPMEM.
Managing vPMEM volumes with the Hardware Management Console
Refer to the vPMEM SAP HANA Whitepaper for more details on managing vPMEM volumes from the HMC. To manage vPMEM volumes on an existing HANA server, you have to shut down the LPAR so that it is in the Not Activated state before you can create, rename or delete vPMEM volumes.
NOTE: The size of our HANA database is 241.09 GB. As the column store main fragments represent 95% of this data, they amount to roughly 230 GB. We therefore created a vPMEM volume of 250 GB on top of 512 GB DRAM.
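The arithmetic behind this choice can be sketched in a few lines of shell. This is only an illustration: the function name main_fragment_gb is my own invention (it is not part of any SAP or IBM tool), and the 95% share is the rule of thumb quoted above, not a measured value for your system.

```shell
#!/bin/sh
# Hypothetical helper (not an SAP or IBM tool): estimate the size of the
# column store main fragments from the database size and an assumed share.
main_fragment_gb() {
    db_size_gb="$1"   # total HANA database size in GB
    share_pct="$2"    # assumed share of main fragments, in percent
    awk -v size="$db_size_gb" -v pct="$share_pct" \
        'BEGIN { printf "%.1f\n", size * pct / 100 }'
}

# Values of this demo system: 241.09 GB database, 95% assumed share
estimate=$(main_fragment_gb 241.09 95)
echo "Estimated main fragments: ${estimate} GB"

# Remaining headroom if the vPMEM volume is 250 GB
awk -v vol=250 -v est="$estimate" \
    'BEGIN { printf "Headroom in a 250 GB volume: %.1f GB\n", vol - est }'
```

With the numbers from this demo system the estimate stays below the 250 GB volume, which leaves some room for growth. For a productive system, take the figures from the SAP sizing report rather than from a rule of thumb.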
Preparing vPMEM volumes for use with SAP HANA
As discussed, HANA places the column store main fragments into files located in an XFS filesystem mounted with the DAX option. After you create a vPMEM volume and boot the LPAR, the operating system presents the volume as a non-volatile DIMM device, /dev/nmem<#>.
We will manage the DIMM devices using the ndctl tool -
# ndctl --list-cmds (Display all commands that are available in the ndctl tool)
As HANA is a NUMA-aware application, vPMEM should be provisioned on a NUMA node basis so that the NUMA node associativity of each volume is clearly defined. That is, the PowerVM hypervisor should allocate DRAM exclusively from one NUMA node to serve as a single vPMEM volume.
# numactl --hardware (Display CPU and Memory association to NUMA node)
NOTE: The CPU is not optimally configured, as this is a DEMO system.
# ndctl list -uv (Display the size of the vPMEM volume and the NUMA node on which it resides)
As you can see, the entire vPMEM volume (~250 GB) is placed on one NUMA node (node0), with which all of the server's CPUs and memory are also associated. This avoids a hop penalty during access.
Check for region created by OS for vPMEM volume
# ndctl list -Ru
For the volume of size 256 GB, the OS created one region.
Create one namespace per region, each equal in size to its region
As we have only one region for our vPMEM volume, we will create only one namespace. If you have multiple regions, create namespaces accordingly.
# ndctl create-namespace -r region0
At this point, the persistent memory block devices have been prepared and are presented by the operating system as /dev/pmem<#>.
Create and mount XFS filesystems with the DAX option for each pmem namespace
Note: The DAX option bypasses the page cache and works on the filesystem blocks directly. This requires the filesystem block size to be the same as the operating system page size, which is 64K on Power Systems.
# mkdir -p /hana/pmem/pmem0
# mkfs.xfs /dev/pmem0 -b size=64k -s size=512
# mount -o dax /dev/pmem0 /hana/pmem/pmem0
# chown -R <sid>adm:sapsys /hana/pmem
# chmod -R 700 /hana/pmem
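If you have several pmem devices, the same steps repeat for each of them. The sketch below only prints the commands instead of executing them, so that you can review them first. The function name print_pmem_setup and the user hdbadm are placeholders of mine, and the mount point layout /hana/pmem/<device> simply follows the example above.

```shell
#!/bin/sh
# Dry run by design: print (do not execute) the preparation commands
# for a list of pmem devices. Names and layout are assumptions.
print_pmem_setup() {
    sid_adm="$1"      # OS user of the HANA system, e.g. hdbadm (placeholder)
    block_size="$2"   # XFS block size in bytes; must equal the OS page size
    shift 2
    for dev in "$@"; do
        mnt="/hana/pmem/${dev}"
        echo "mkdir -p ${mnt}"
        echo "mkfs.xfs -b size=${block_size} -s size=512 /dev/${dev}"
        echo "mount -o dax /dev/${dev} ${mnt}"
    done
    echo "chown -R ${sid_adm}:sapsys /hana/pmem"
    echo "chmod -R 700 /hana/pmem"
}

# 65536 bytes = 64K; compare with the output of "getconf PAGESIZE" on your LPAR
print_pmem_setup hdbadm 65536 pmem0 pmem1
```

Printing the commands first is a deliberate choice here: mkfs.xfs destroys whatever is on the device, so it is worth checking the device names against the ndctl output before running anything as root.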
Caution: Do not create the pmem filesystem mountpoints under filesystem mountpoints other than "/". In such cases, SAP HANA does not determine the DAX attribute properly and will not use those filesystems to store the data.
Note that block device names such as pmem0 or pmem1 may change after a reboot. For any automated mounting of the associated filesystems, it is recommended to use the filesystem UUID, as with any other filesystem.
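A UUID-based mount entry could look like the output of the following sketch. The helper name fstab_entry and the UUID in the example call are made up for illustration; on a real system you would read the UUID from the device, for example with blkid.

```shell
#!/bin/sh
# Hypothetical helper: format an /etc/fstab line that mounts a pmem
# filesystem by UUID with the dax option.
fstab_entry() {
    uuid="$1"         # filesystem UUID, e.g. from: blkid -s UUID -o value /dev/pmem0
    mountpoint="$2"   # e.g. /hana/pmem/pmem0
    printf 'UUID=%s %s xfs dax 0 0\n' "$uuid" "$mountpoint"
}

# Example call with a made-up UUID
fstab_entry "0a1b2c3d-1111-2222-3333-444455556666" /hana/pmem/pmem0
```

Keep in mind that the UUID belongs to the filesystem, not to the vPMEM volume. If the filesystem has to be created again with mkfs.xfs, it gets a new UUID and the entry must be updated.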
vPMEM Configuration in HANA
New Installation
The persistent memory feature can be set up during system installation using the SAP HANA database lifecycle manager (HDBLCM), which uses two parameters: use_pmem to enable persistent memory, and pmempath to set the basepath.
--use_pmem --pmempath=<path to pmemX>[:<path to pmemY>]
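When there are several mount points, the pmempath value is a colon-separated list, as the syntax above shows. A small sketch for assembling it, assuming the mount points from this blog; the function name pmem_options is hypothetical and only builds the option string, it does not call HDBLCM.

```shell
#!/bin/sh
# Hypothetical helper: join one or more pmem mount points into the
# colon-separated value of --pmempath and print both options.
pmem_options() {
    paths=""
    for p in "$@"; do
        # add a ":" separator only if paths already contains an entry
        paths="${paths:+${paths}:}${p}"
    done
    printf '%s\n' "--use_pmem --pmempath=${paths}"
}

pmem_options /hana/pmem/pmem0 /hana/pmem/pmem1
```

The printed string can then be appended to your usual hdblcm call. Please check the installation guide of your HANA revision for the exact parameter handling.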
Enabling on Existing Installation
SAP HANA Configuration files are stored on the server at the following locations according to layer:
Default: /usr/sap/SID/HDB<nn>/exe/config (read only)
System: /hana/shared/SID/global/hdb/custom/config
Database: /hana/shared/SID/global/hdb/custom/config/DB_<dbname>
Host: /usr/sap/SID/HDB<nn>/<hostname>
By default, SAP HANA usage of persistent memory volumes is specified at the host level. All HANA services managed by a single SAP HANA Global Allocation Limit (GAL) will share a set of persistent memory volumes.
[persistence]
basepath_persistent_memory_volumes = /hana/pmem/pmem0
NOTE: This parameter (in global.ini) has to be maintained at host level. A restart is required after setting the value.
Activate persistent memory storage for the database in the indexserver.ini configuration file of HANA
[persistent_memory]
table_default = on
With the above two settings, the use of persistent memory is determined automatically and all tables use persistent memory by default. This default behavior can be overridden at four levels, in the sequence shown here:
4 | Database | Can be enabled or disabled by configuration parameter.
3 | Table | Can be enabled or disabled by SQL statement.
2 | Partition | Can be enabled or disabled by SQL statement.
1 | Column | Can be enabled or disabled by SQL statement.
This level of granularity offers a high degree of flexibility: if persistent memory is applied at a certain level it is automatically inherited at lower levels of the sequence but can also be overridden at lower levels.
At the highest configuration level (database), persistent memory is managed by setting the table_default configuration parameter for the database as a whole. You can switch table_default to off to enable more selective persistent memory storage at lower levels for particular tables and partitions. Refer to the SAP HANA SQL and System Views Reference for more details.
Note: To specify different sets of vPMEM volumes for different SAP HANA tenants, use SAP Note 2175606 to first segment the tenants into separate GALs. Then define the persistent memory volumes in the above .ini files at the database level.
Restart HANA after enabling Persistent Memory
IMPORTANT NOTE FOR HANA 2.0 SPS 04 REVISION 44:
If you are running HANA 2.0 SPS 04 Revision 44 with high isolation enabled, your tenant database will not start, because it cannot access one of the files created under /hana/pmem/pmem0/NVM-GMD-SID-SSH. The indexserver trace file shows the error below -
Error while preparing NVM volumes:exception 1: no.2030116 (Basis/MemoryManager/impl/NVMFileAccess.cpp:670) TID: <TID> NVM: Could not provide path, wrong access rights <path>, rc=13: Permission denied, openRC=$orc$" or "startup failed exception 1: no.2030100 (Basis/MemoryManager/impl/NVMFileAccess.cpp:853) TID: <TID> NVM error: Could not create NVM GAL Metadata file; $path$=<path>; $sysrc$=1; $sysmsg$=Operation not permitted".
What happens here is that in lower HANA revisions (<45), the startup script dynamically changes the permissions of the file /hana/pmem/pmem0/NVM-GMD-SID-SSH/SAP_NVM_GAL_MANAGER_005_T2_<hostname> whenever the system is restarted. As a result, the tenant OS user cannot access that file, even if you set 777 permissions on it.
So you need to either upgrade your HANA database if you are running it in high isolation, or run your system with low isolation.
IMPORTANT NOTE FOR HANA 2.0 SPS 04 REVISION 45:
If your HANA version is Revision 45 and it is running in high isolation, then on the very first restart after setting the above configuration to enable persistent memory, you will encounter the error below: your tenant database cannot start because it cannot access the SAP_NVM_GAL_Manager_005_T2_<hostname> file.
[34975]{-1}[-1/-1] 2020-01-20 14:55:42.422177 e NonVolatileMemor NVMFileAccess.cpp(00153) : NVM:---------Could not open (NVM GAL metadata) '/hana/pmem/pmem0//NVM-GMD-SID-SSH/SAP_NVM_GAL_Manager_005_T2_<hostname>', error code: 13 (Permission denied)
[34975]{-1}[-1/-1] 2020-01-20 14:55:42.422748 f PersistenceLayer PersistenceController.cpp(00778) : startup failed exception 1: no.2030100 (Basis/MemoryManager/impl/NVMFileAccess.cpp:925) TID: 34975
NVM error: Could not create NVM GAL Metadata file; $path$=/hana/pmem/pmem0//NVM-GMD-SID-SSH/SAP_NVM_GAL_Manager_005_T2_<hostname>; $sysrc$=13; $sysmsg$=Permission denied
Refer to SAP Note 2700084 - FAQ: SAP HANA Persistent Memory, which mentions that you need to give 777 permissions to the /hana/pmem directory. Once the file SAP_NVM_GAL_Manager_005_T2_<hostname> has been generated, grant 777 access recursively to the entire /hana/pmem folder and restart the system again. Your tenant database running in high isolation will then come up.
NOTE: If your tenant database (with high isolation) does not come up even after you grant 777 permissions, check the trace file for more information.
IMPORTANT NOTE FOR HANA 2.0 SPS 04 REVISION 46:
The first time you restart your HANA database after enabling persistent memory, your tenant database (with high isolation) will not start and will hit the same issue as described for Revision 45.
But instead of giving 777 permissions recursively to the entire /hana/pmem folder, just grant 666 permissions to the file below; your tenant database will then come up.
/hana/pmem/pmem0/NVM-GMD-SID-SGH/SAP_NVM_GAL_Manager_005_T2_<hostname>
CAUTION: With each revision, SAP is fixing the issues that have been encountered with persistent memory. So if you are using a later HANA revision (>46), you might not run into any of the above issues.
Verifying vPMEM usage in HANA
When the HANA system starts with persistent memory, it begins loading the main fragments of the column store into persistent memory. As the main fragments represent most of the data, you will notice that your /hana/pmem/pmem0 filesystem fills up.
NOTE: Not all data is loaded when you start the system, but once the application starts running, you will find that most of the data ends up in persistent memory.
When you check HANA studio, you will find that the Used Memory of HANA is now lower, as it only represents the Delta Store, Working Space and Row Store.
Download the latest SQL Statement Collection for SAP HANA and run Memory Overview to see the different types of HANA memory usage.
The following query can be used to verify that the vPMEM-based filesystems are utilized by SAP HANA as expected:
hdbsql> select * from M_PERSISTENT_MEMORY_VOLUMES where PORT=3<nn><yy>
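The PORT value in this query is built from the instance number <nn> and the port suffix <yy> of the service you want to check. A minimal sketch for assembling the statement; the function name pmem_volumes_query is hypothetical, and the values 00 and 03 in the example call are placeholders that you need to replace with those of your own system.

```shell
#!/bin/sh
# Hypothetical helper: build the verification query for a given instance
# number (<nn>) and port suffix (<yy>), following the 3<nn><yy> pattern.
pmem_volumes_query() {
    instance_nr="$1"   # two-digit instance number, e.g. 00
    port_suffix="$2"   # two-digit port suffix of the service, e.g. 03
    printf 'select * from M_PERSISTENT_MEMORY_VOLUMES where PORT=3%s%s\n' \
        "$instance_nr" "$port_suffix"
}

# Placeholder values; adapt them to your system
pmem_volumes_query 00 03
```

The printed statement can be pasted at the hdbsql prompt shown above.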
As this blog turned out to be longer than I anticipated, I have only included the configuration part here. I will probably write a new blog to provide more insight into the difference in start-up time after enabling persistent memory.
References
Always refer to the latest guides and SAP Notes from IBM and SAP when configuring the vPMEM solution for your landscape.
Important Points
- Only "fsdax" mode is employed for SAP HANA
- vPMEM Volumes cannot be resized. Instead, they are deleted and new vPMEM volumes can be created with the desired size.
- vPMEM volumes retain their content across application restarts and LPAR restarts, whereas powering down the physical server does not retain the data in persistent memory.
NOTE: Powering down the physical system in a PowerVM environment is a relatively infrequent event.
- vPMEM volumes are defined per LPAR and are not directly sharable or transferable to other LPARs.
- Currently you cannot perform live migration of LPAR from one physical server to another if you are using vPMEM.
- In order to manage vPMEM volumes on an existing HANA server, you have to deactivate your LPAR.
- The ratio restriction between DRAM and PMEM does not apply to the Power platform.
- The PowerVM hypervisor should allocate DRAM exclusively from one NUMA node to serve as a single vPMEM volume.
Regards,
Dennis Padia