
Why would HANA db disconnect take place during backup copy procedure?

Former Member

We run ECC 6.0 EHP 7 Suite on premise on HANA 1.00.122.12 VM SLES. We also replicate to a backup DB for DR purposes. During off hours we back up the DB to disk, then copy the backup to another storage location for final backup. During the copy (cp command), CPU and memory usage frequently spike. Unfortunately, nearly daily at the same time, there is also a momentary disconnect from the database. Any jobs running during that time fail as a result.

The indexserver trace file often reports a timeout and a broken connection, but not always. SM21 shows job failures, but not the same job every day. This started happening just over a month ago (April 3), but we cannot relate it to any change in our system environment.

We have checked VM, network, system, backup, HANA, and storage statistics and have run traces. We have opened an incident with SAP, but have no answers after all of this. Is anyone experiencing the same or something similar?
gary_conn
Discoverer

Hi Lars. This is Gary Conn, the DBA working with Mark.

In a nutshell, if we knew the answers to your questions, we would not be here. That said...

Since we upgraded HANA from 85.03 to 122.12, we have been getting small memory "spikes" on a semi-regular basis that we did not see before the upgrade. We are waiting on SAP to help, but so far, nothing. We have sent them the trace and RTE files as requested, but they have not found any "smoking gun".

Also, when we run our database backups, a spike occurs in memory and CPU; when we moved the backup to a different time, the spike followed. We do a local disk backup using the HANA native backup and then use a very basic Linux cp command to copy the files (400+GB total) over to a Windows server we use for storing backups (then off to tape from there). We have been doing this for the last 2.5 years with no issues, until now.

The memory and CPU spikes begin after the HANA backup has finished, about 40 minutes into the cp command, and stop about 10-15 minutes later (memory goes back to normal after increasing by about 20%). The copy to the Windows server takes about 1.5 hours.

We also have HANA sync system replication running to a local site over a 10GB pipe. We first used the new logreplay mode, then switched to delta_datashipping. I am trying to see whether the different modes of HSSR are causing the spikes (still working on it).

Have you or anyone else you know experienced this before?

Thoughts?

Thanks.

Gary Conn

lbreddemann
Active Contributor

Hi Gary,

The move from HANA rev. 85 to 122 is rather huge, and a lot of how HANA works internally has changed in the years between these versions. So, without seeing which HANA components allocate what memory, it's hard to say why the memory spikes occur.

What's clear is that there is no direct functional connection between the use of the cp command and HANA's memory usage. From the description, it sounds as if it could be a side effect of file system buffering caused by the cp command. Are you using a Samba share to access the MS Windows system?
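One way to check the file system buffering idea is to sample the Linux page cache counters while the cp command runs. The following is only a minimal sketch, assuming a Linux host that exposes `/proc/meminfo`; the function names `parse_meminfo`, `page_cache_snapshot`, and `watch` are made up for illustration, and it reads nothing from HANA itself.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_cache_snapshot(path="/proc/meminfo"):
    """Return the counters most relevant to file system buffering."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    keys = ("MemFree", "Cached", "Dirty", "Writeback")
    return {key: info.get(key) for key in keys}


def watch(samples=6, interval=10):
    """Print a timestamped snapshot every `interval` seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), page_cache_snapshot())
        time.sleep(interval)
```

Calling `watch()` in a separate session during the copy and comparing the timestamps with the observed spikes would show whether `Cached` or `Dirty` grows at the same time; if they stay flat, buffering is probably not the explanation.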

Former Member

We do not use a Samba share. Through further testing we have determined that the copy is not causing this issue, so we agree with your statement. We have documented that a particular job, which was running before the upgrade and has not been changed, is associated with the pronounced regular spiking.

I've included an image (hana-spiking-2.png) to show the dramatic before/after difference in the operation of the HANA DB. Again, the point of seeking assistance is to identify why HANA would register a connection loss (at any time) under higher demand. We have changed HANA parameters such as tcp_backlog and indexserver maxchannels, but have not seen any definitive result.

We'll keep looking for answers. Thanks.