
Error starting database / nameserver

Former Member

Hi gurus,

After an unexpected server shutdown, HANA does not start. Digging deeper into the logs, I found the error listed below; it occurs during the nameserver startup process. Any hints?

hana001:/usr/sap/HDB/SYS/exe/hdb> ./hdbnameserver

service startup...

accepting requests at 127.0.0.1:30201; 127.0.0.2:30201

searching for master nameserver hana001:30201 ...

assign as master nameserver. assign to volume 1 started

checking for recovery request ...

loading topology ...

opening persistence ...

assign failed with persistence startup error. exception  1: no.3020046  (DataAccess/PageAccess/impl/PageImpl.cpp:399)

    Wrong savepoint version: Expected 8373 but found 8361.

exception throw location:

1: 0x00007fb55f11bbbc in PageAccess::Page::verifyHeader(PageAccess::SizeClass, DataAccess::SavepointVersion const&, bool) const+0x3d8 at PageImpl.cpp:399 (libhdbdataaccess.so)

2: 0x00007fb55ee80da7 in DataAccess::SavepointImpl::loadRestartPage(PageAccess::PageIO&, unsigned long, bool, bool)+0x693 at Page.hpp:277 (libhdbdataaccess.so)

3: 0x00007fb55ee0205f in DataAccess::PersistenceManagerImpl::prepareOpen(unsigned long, bool, bool)+0x4b at PersistenceManagerImpl.cpp:5385 (libhdbdataaccess.so)

4: 0x00007fb55ee0223c in DataAccess::PersistenceManager::open(ltt::refcounted_handle<DataAccess::PersistenceConfiguration> const&, bool)+0x88 at PersistenceManagerImpl.cpp:2461 (libhdbdataaccess.so)

5: 0x00007fb56c510771 in PersistenceLayer::PersistenceSystem::initialize(NameServer::ServiceStartInfo const&, bool, PersistenceLayer::PERSISTENCE_MODE)+0x4a0 at PersistenceSystem.cpp:392 (libhdbpersistence.so)

6: 0x00007fb56c5437bc in PersistenceLayer::PersistenceFactory::initPersistence(PersistenceLayer::PERSISTENCE_MODE, ltt::releasable_handle<DataAccess::LoggerFactory>&, DataAccess::TransactionCallback*, NameServer::ServiceStartInfo&, ltt::refcounted_handle<TransactionManager::TransactionControlBlockFactory>&, bool)+0x158 at PersistenceFactory.cpp:394 (libhdbpersistence.so)

7: 0x00007fb56c43375a in PersistenceController::startup(PersistenceLayer::PERSISTENCE_MODE, NameServer::ServiceStartInfo*, bool, DataAccess::TablePreloadWriteCallback*, DataAccess::TablePreloadReadCallback*, Backup::RecoverCbc_Federation*)+0x1256 at PersistenceController.cpp:559 (libhdblogger.so)

8: 0x00007fb5751eedd5 in NameServer::Topology::initPersistence(NameServer::ServiceStartInfo&, bool, bool, TREX_ERROR::TRexError*, bool, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, NameServer::ServiceStartInfo::RequestAction)+0x421 at Topology.cpp:285 (libhdbns.so)

9: 0x00007fb5752fb0c4 in NameServer::TREXNameServer::loadTopology(NameServer::LoadTopologyMode, NameServer::ServiceStartInfo&, Backup::Backup_ExtendedRecoveryInformation*)+0x5b0 at TREXNameServer.cpp:18396 (libhdbns.so)

10: 0x00007fb575305c23 in NameServer::TREXNameServer::assign(NameServer::ServiceStartInfo&)+0x2790 at TREXNameServer.cpp:2123 (libhdbns.so)

11: 0x00007fb5762d4f17 in TRexAPI::TREXIndexServer::assign(NameServer::ServiceStartInfo&, bool, TREX_ERROR::TRexError&)+0x93 at TREXIndexServer.cpp:992 (hdbnameserver)

12: 0x00007fb57630c129 in TRexAPI::AssignThread::run(void*)+0x35 at TREXIndexServer.cpp:543 (hdbnameserver)

13: 0x00007fb56cd03db5 in TrexThreads::PoolThread::run()+0x831 at PoolThread.cpp:389 (libhdbbasement.so)

14: 0x00007fb56cd056f0 in TrexThreads::PoolThread::run(void*&)+0x10 at PoolThread.cpp:165 (libhdbbasement.so)

15: 0x00007fb55471a9f0 in Execution::Thread::staticMainImp(void**)+0x700 at Thread.cpp:461 (libhdbbasis.so)

16: 0x00007fb55471bfc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)

stopping service...

persistence initialization failed -> stopping instance ...

cannot send signal. (2, No such file or directory)

prepare for shutting service down...

stop ClockMonitor thread...

stop MasterTokenLockWriter /usr/sap/HDB/SYS/global//hdb/nameserver.lck thread...

setInactive(nameserver@hana001:30201)

hana001:/usr/sap/HDB/SYS/exe/hdb>

The HDB start log is below as well:

hana001:/usr/sap/HDB/HDB02> ./HDB start

StartService

Impromptu CCC initialization by 'rscpCInit'.

  See SAP note 1266393.

OK

OK

Starting instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 02 -function StartWait 2700 2

08.06.2016 03:44:24

Start

OK

08.06.2016 03:44:46

StartWait

FAIL: process hdbdaemon HDB Daemon not running

Final part of the daemon trace:


[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545057 i Daemon       TrexDaemon.cpp(03513) : stdout = /usr/sap/HDB/HDB02/hana001/trace/xsuaaserver.out
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545062 i Daemon       TrexDaemon.cpp(03513) : stderr =
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545066 i Daemon       TrexDaemon.cpp(03513) : maxstdfiles = 2
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545071 i Daemon       TrexDaemon.cpp(03513) : runlevel = 4
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545075 i Daemon       TrexDaemon.cpp(03513) : flags =
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545080 i Daemon       TrexDaemon.cpp(03513) : window =
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545084 i Daemon       TrexDaemon.cpp(03513) : isolation = 0
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545088 i Daemon       TrexDaemon.cpp(03513) : userId = 4294967295
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545093 i Daemon       TrexDaemon.cpp(03513) : groupId = 4294967295
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545104 i Daemon       Daemon.cpp(00666) : runlevel 0 completely started
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545115 i Daemon       Program.cpp(00173) : line up of program group mdcdispatcher to instances <none>.
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545126 i Daemon       Program.cpp(00173) : line up of program group nameserver to instances 0.
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.545460 i Daemon       TrexDaemon.cpp(02078) : starting program at '/usr/sap/HDB/HDB02/exe/hdbnameserver' with args ''
[17502]{-1}[-1/-1] 2016-06-08 03:44:28.569590 i Daemon       RunningInstance.cpp(00191) : start 'hdbnameserver' as process 17518
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.313937 i Daemon       SignalsUNIX.cpp(00647) : signo 3 SIGQUIT from user errno 0 code 0
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314034 i Daemon       SignalsUNIX.cpp(00647) : sender pid 17518 real user id 1000 executable '/hana/shared/HDB/exe/linuxx86_64/HDB_1.00.112.02.1459851171_2840326/hdbnameserver'
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314167 i Daemon       TrexDaemon.cpp(03558) : got shutdown event (stop)
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314228 i Daemon       Daemon.cpp(00752) : comment file contains: nameserver: persistence initialization failed
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.314836 i Daemon       RunningInstance.cpp(00214) : stop process hdbnameserver with pid 17518
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.315143 i Daemon       TrexDaemon.cpp(02598) : stopped child with pid 17518 (17518)
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401872 i Daemon       TrexDaemon.cpp(03778) : process hdbnameserver with pid 17518 exited normally with status 1
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401939 i Daemon       TrexDaemon.cpp(03850) : all instances in runlevel 1 stopped
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.401952 i Daemon       Program.cpp(00173) : line up of program group dummyserver to instances <none>.
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.402492 i Daemon       ProgramStarter.cpp(00229) : writing started programs file /usr/sap/HDB/HDB02/hana001/lock/started_programs.txt:
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.528876 i Daemon       TrexDaemon.cpp(01616) : cleaning all
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.528931 i Daemon       TrexDaemon.cpp(02681) : found 8 segments in use with maximum id 8
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529017 i Daemon       TrexDaemon.cpp(02705) : segment 4, key 0x9f08cf81, id 3506180, sequence 107, owner 1000/1001, size 117, atime 2016-06-08 03:44:32, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 0
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529056 i Daemon       TrexDaemon.cpp(02705) : segment 5, key 0x00000000, id 3538949, sequence 108, owner 1000/1001, size 1060911, atime 2016-06-08 03:44:35, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 1 (deleted)
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529069 i Daemon       TrexDaemon.cpp(02705) : segment 6, key 0x00000000, id 3571718, sequence 109, owner 1000/113, size 1024, atime 2016-06-08 03:44:26, dtime 2016-06-08 03:44:26, ctime 2016-06-08 03:44:25, created by pid 17494, last changed by pid 17494, nattach 1
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529157 i Daemon       Main.cpp(01530) : exit hdbdaemon
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.529171 i Daemon       Main.cpp(01541) : Success: /sapmnt/ld7272/a/HDB/jenkins_prod/workspace/FA_CO_LIN64GCC48HAPPY_rel_fa~newdb100_rel/sys/src/TrexDaemon/Main.cpp:1541 in "int main(int, char**)". Leaving the "main"
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.530121 i Network      NetworkListener.cpp(00805) : closing listen socket 7 bound to 127.0.0.1
[17502]{-1}[-1/-1] 2016-06-08 03:44:35.530283 i Network      NetworkListener.cpp(00805) : closing listen socket 8 bound to 127.0.0.2
Former Member

Do you know whether the data volume was replaced with an older data volume? You should see something similar to:

[15067]{-1}[-1/-1] 2016-05-19 13:58:22.353111 i Logger           SavepointImpl.cpp(03893) : AnchorPage with SPV=37047 loaded.  Persistence created at <DATA -- TIME>.
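
If it helps, here is a rough sketch for pulling the savepoint versions out of the trace so they can be compared against the "Expected ... but found ..." error. It only assumes the two line formats quoted in this thread; the function name `savepoint_events` and the sample list are made up for illustration, and on a real system you would feed it the lines of your own nameserver trace file.

```python
import re

# Patterns derived only from the two log lines quoted in this thread.
ANCHOR_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*AnchorPage with SPV=(\d+) loaded"
)
MISMATCH_RE = re.compile(r"Wrong savepoint version: Expected (\d+) but found (\d+)")


def savepoint_events(lines):
    """Yield (kind, timestamp_or_None, versions) for savepoint-related lines."""
    for line in lines:
        m = ANCHOR_RE.search(line)
        if m:
            # Anchor page load: one savepoint version plus the trace timestamp.
            yield ("anchor", m.group(1), (int(m.group(2)),))
            continue
        m = MISMATCH_RE.search(line)
        if m:
            # Mismatch error: (expected, found); this line carries no timestamp.
            yield ("mismatch", None, (int(m.group(1)), int(m.group(2))))


# Sample input: the two lines quoted in this thread (hypothetical stand-in
# for reading an actual trace file line by line).
sample = [
    "[15067]{-1}[-1/-1] 2016-05-19 13:58:22.353111 i Logger           "
    "SavepointImpl.cpp(03893) : AnchorPage with SPV=37047 loaded.  "
    "Persistence created at <DATA -- TIME>.",
    "    Wrong savepoint version: Expected 8373 but found 8361.",
]

for event in savepoint_events(sample):
    print(event)
```

Listing the anchor-page entries in time order next to the mismatch should make it easier to see whether the savepoint version on disk went backwards at some point.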


I would also check /var/log/messages for any suspicious activity.


What is your file system? Is it XFS? If so, there is a known issue, which is described in SAP Note 2246163.
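
To answer the file system question, one option on Linux is to look up the data volume path in /proc/mounts. The sketch below is only an illustration: the function name `filesystem_type`, the path /hana/data/HDB/mnt00001 and the sample mount table are assumptions, not taken from your system. It also ignores the escaping that /proc/mounts applies to mount points containing spaces.

```python
import os


def filesystem_type(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path.

    mounts_text is expected in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with a trailing slash so /hana/data does not match /hana/database.
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            # Prefer the longest mount point; later entries win on ties.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


# Hypothetical mount table for illustration. On a real host, read the
# actual table instead: open("/proc/mounts").read()
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /hana/data xfs rw,noatime 0 0\n"
)

print(filesystem_type("/hana/data/HDB/mnt00001", SAMPLE_MOUNTS))
```

If the second element comes back as xfs for your data (and log) volume, the SAP Note mentioned above is worth reading.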