on 2016 Jun 08 7:55 AM
Hi gurus,
After unexpected server shutdown HANA is not starting. Getting a little more deeper in the logs I´ve found the error listet bellow, it happens in nameserver startup process. Any hint?
hana001:/usr/sap/HDB/SYS/exe/hdb> ./hdbnameserver
service startup...
accepting requests at 127.0.0.1:30201; 127.0.0.2:30201
searching for master nameserver hana001:30201 ...
assign as master nameserver. assign to volume 1 started
checking for recovery request ...
loading topology ...
opening persistence ...
assign failed with persistence startup error. exception 1: no.3020046 (DataAccess/PageAccess/impl/PageImpl.cpp:399)
Wrong savepoint version: Expected 8373 but found 8361.
exception throw location:
1: 0x00007fb55f11bbbc in PageAccess::Page::verifyHeader(PageAccess::SizeClass, DataAccess::SavepointVersion const&, bool) const+0x3d8 at PageImpl.cpp:399 (libhdbdataaccess.so)
2: 0x00007fb55ee80da7 in DataAccess::SavepointImpl::loadRestartPage(PageAccess::PageIO&, unsigned long, bool, bool)+0x693 at Page.hpp:277 (libhdbdataaccess.so)
3: 0x00007fb55ee0205f in DataAccess::PersistenceManagerImpl::prepareOpen(unsigned long, bool, bool)+0x4b at PersistenceManagerImpl.cpp:5385 (libhdbdataaccess.so)
4: 0x00007fb55ee0223c in DataAccess::PersistenceManager::open(ltt::refcounted_handle<DataAccess::PersistenceConfiguration> const&, bool)+0x88 at PersistenceManagerImpl.cpp:2461 (libhdbdataaccess.so)
5: 0x00007fb56c510771 in PersistenceLayer::PersistenceSystem::initialize(NameServer::ServiceStartInfo const&, bool, PersistenceLayer::PERSISTENCE_MODE)+0x4a0 at PersistenceSystem.cpp:392 (libhdbpersistence.so)
6: 0x00007fb56c5437bc in PersistenceLayer::PersistenceFactory::initPersistence(PersistenceLayer::PERSISTENCE_MODE, ltt::releasable_handle<DataAccess::LoggerFactory>&, DataAccess::TransactionCallback*, NameServer::ServiceStartInfo&, ltt::refcounted_handle<TransactionManager::TransactionControlBlockFactory>&, bool)+0x158 at PersistenceFactory.cpp:394 (libhdbpersistence.so)
7: 0x00007fb56c43375a in PersistenceController::startup(PersistenceLayer::PERSISTENCE_MODE, NameServer::ServiceStartInfo*, bool, DataAccess::TablePreloadWriteCallback*, DataAccess::TablePreloadReadCallback*, Backup::RecoverCbc_Federation*)+0x1256 at PersistenceController.cpp:559 (libhdblogger.so)
8: 0x00007fb5751eedd5 in NameServer::Topology::initPersistence(NameServer::ServiceStartInfo&, bool, bool, TREX_ERROR::TRexError*, bool, ltt_adp::basic_string<char, ltt::char_traits<char>, ltt::integral_constant<bool, true> >, NameServer::ServiceStartInfo::RequestAction)+0x421 at Topology.cpp:285 (libhdbns.so)
9: 0x00007fb5752fb0c4 in NameServer::TREXNameServer::loadTopology(NameServer::LoadTopologyMode, NameServer::ServiceStartInfo&, Backup::Backup_ExtendedRecoveryInformation*)+0x5b0 at TREXNameServer.cpp:18396 (libhdbns.so)
10: 0x00007fb575305c23 in NameServer::TREXNameServer::assign(NameServer::ServiceStartInfo&)+0x2790 at TREXNameServer.cpp:2123 (libhdbns.so)
11: 0x00007fb5762d4f17 in TRexAPI::TREXIndexServer::assign(NameServer::ServiceStartInfo&, bool, TREX_ERROR::TRexError&)+0x93 at TREXIndexServer.cpp:992 (hdbnameserver)
12: 0x00007fb57630c129 in TRexAPI::AssignThread::run(void*)+0x35 at TREXIndexServer.cpp:543 (hdbnameserver)
13: 0x00007fb56cd03db5 in TrexThreads::PoolThread::run()+0x831 at PoolThread.cpp:389 (libhdbbasement.so)
14: 0x00007fb56cd056f0 in TrexThreads::PoolThread::run(void*&)+0x10 at PoolThread.cpp:165 (libhdbbasement.so)
15: 0x00007fb55471a9f0 in Execution::Thread::staticMainImp(void**)+0x700 at Thread.cpp:461 (libhdbbasis.so)
16: 0x00007fb55471bfc8 in Execution::Thread::staticMain(void*)+0x34 at ThreadMain.cpp:26 (libhdbbasis.so)
stopping service...
persistence initialization failed -> stopping instance ...
cannot send signal. (2, No such file or directory)
prepare for shutting service down...
stop ClockMonitor thread...
stop MasterTokenLockWriter /usr/sap/HDB/SYS/global//hdb/nameserver.lck thread...
setInactive(nameserver@hana001:30201)
hana001:/usr/sap/HDB/SYS/exe/hdb>
Also bellow HDB start log:
hana001:/usr/sap/HDB/HDB02> ./HDB start
StartService
Impromptu CCC initialization by 'rscpCInit'.
See SAP note 1266393.
OK
OK
Starting instance using: /usr/sap/HDB/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 02 -function StartWait 2700 2
08.06.2016 03:44:24
Start
OK
08.06.2016 03:44:46
StartWait
FAIL: process hdbdaemon HDB Daemon not running
Final part from Daemon trace:
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545057 i Daemon | TrexDaemon.cpp(03513) : stdout = /usr/sap/HDB/HDB02/hana001/trace/xsuaaserver.out |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545062 i Daemon | TrexDaemon.cpp(03513) : stderr = |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545066 i Daemon | TrexDaemon.cpp(03513) : maxstdfiles = 2 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545071 i Daemon | TrexDaemon.cpp(03513) : runlevel = 4 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545075 i Daemon | TrexDaemon.cpp(03513) : flags = |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545080 i Daemon | TrexDaemon.cpp(03513) : window = |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545084 i Daemon | TrexDaemon.cpp(03513) : isolation = 0 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545088 i Daemon | TrexDaemon.cpp(03513) : userId = 4294967295 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545093 i Daemon | TrexDaemon.cpp(03513) : groupId = 4294967295 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545104 i Daemon | Daemon.cpp(00666) : runlevel 0 completely started |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545115 i Daemon | Program.cpp(00173) : line up of program group mdcdispatcher to instances <none>. |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545126 i Daemon | Program.cpp(00173) : line up of program group nameserver to instances 0. |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.545460 i Daemon | TrexDaemon.cpp(02078) : starting program at '/usr/sap/HDB/HDB02/exe/hdbnameserver' with args '' |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:28.569590 i Daemon | RunningInstance.cpp(00191) : start 'hdbnameserver' as process 17518 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.313937 i Daemon | SignalsUNIX.cpp(00647) : signo 3 SIGQUIT from user errno 0 code 0 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.314034 i Daemon | SignalsUNIX.cpp(00647) : sender pid 17518 real user id 1000 executable '/hana/shared/HDB/exe/linuxx86_64/HDB_1.00.112.02.1459851171_2840326/hdbnameserver' |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.314167 i Daemon | TrexDaemon.cpp(03558) : got shutdown event (stop) |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.314228 i Daemon | Daemon.cpp(00752) : comment file contains: nameserver: persistence initialization failed |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.314836 i Daemon | RunningInstance.cpp(00214) : stop process hdbnameserver with pid 17518 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.315143 i Daemon | TrexDaemon.cpp(02598) : stopped child with pid 17518 (17518) |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.401872 i Daemon | TrexDaemon.cpp(03778) : process hdbnameserver with pid 17518 exited normally with status 1 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.401939 i Daemon | TrexDaemon.cpp(03850) : all instances in runlevel 1 stopped |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.401952 i Daemon | Program.cpp(00173) : line up of program group dummyserver to instances <none>. |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.402492 i Daemon | ProgramStarter.cpp(00229) : writing started programs file /usr/sap/HDB/HDB02/hana001/lock/started_programs.txt: |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.528876 i Daemon | TrexDaemon.cpp(01616) : cleaning all |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.528931 i Daemon | TrexDaemon.cpp(02681) : found 8 segments in use with maximum id 8 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.529017 i Daemon | TrexDaemon.cpp(02705) : segment 4, key 0x9f08cf81, id 3506180, sequence 107, owner 1000/1001, size 117, atime 2016-06-08 03:44:32, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 0 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.529056 i Daemon | TrexDaemon.cpp(02705) : segment 5, key 0x00000000, id 3538949, sequence 108, owner 1000/1001, size 1060911, atime 2016-06-08 03:44:35, dtime 2016-06-08 03:44:35, ctime 2016-06-08 03:18:33, created by pid 17077, last changed by pid 17077, nattach 1 (deleted) |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.529069 i Daemon | TrexDaemon.cpp(02705) : segment 6, key 0x00000000, id 3571718, sequence 109, owner 1000/113, size 1024, atime 2016-06-08 03:44:26, dtime 2016-06-08 03:44:26, ctime 2016-06-08 03:44:25, created by pid 17494, last changed by pid 17494, nattach 1 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.529157 i Daemon | Main.cpp(01530) : exit hdbdaemon |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.529171 i Daemon | Main.cpp(01541) : Success: /sapmnt/ld7272/a/HDB/jenkins_prod/workspace/FA_CO_LIN64GCC48HAPPY_rel_fa~newdb100_rel/sys/src/TrexDaemon/Main.cpp:1541 in "int main(int, char**)". Leaving the "main" |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.530121 i Network | NetworkListener.cpp(00805) : closing listen socket 7 bound to 127.0.0.1 |
| [17502]{-1}[-1/-1] 2016-06-08 03:44:35.530283 i Network | NetworkListener.cpp(00805) : closing listen socket 8 bound to 127.0.0.2 |
Request clarification before answering.
Have you first checked if any PID's are still running? Looking at the screenshots I get the idea your system still has processes running in the background (especially on the last message). As this is not Production or is it live, you can run "HDB kill" on the OS to make sure everything is down before trying again.
The lock files was based on your earlier screenshot where it had an ID mismatch. They are generally found in /sapmnt/<SID>/HDB<Instance#>/<host>/lock
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
| User | Count |
|---|---|
| 15 | |
| 9 | |
| 6 | |
| 4 | |
| 4 | |
| 4 | |
| 3 | |
| 2 | |
| 2 | |
| 2 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.