Hi All,
My name is Man-Ted Chan and I’m from the SAP HANA product support team. Today’s blog will be about the new SAP HANA Statistics Server. We will review some background information on it, how to implement it, and what to look for to verify it was successful.
The statistics server assists customers by monitoring their SAP HANA system, collecting historical performance data and warning them of system alerts (such as resource exhaustion). The historical data is stored in the _SYS_STATISTICS schema; for more information on these tables, please view the statistical views reference page on help.sap.com/hana_appliance
The new Statistics Server is also known as the embedded Statistics Server or the Statistics Service. Prior to SP7 the Statistics Server was a separate server process, like an extra Index Server with monitoring services on top of it. The new Statistics Server is embedded in the Index Server. This simplifies the SAP HANA architecture and helps avoid out-of-memory issues in the Statistics Server, which by default was limited to 5% of the total memory.
In SP7 and SP8 the old Statistics Server is still implemented and shipped to customers, but customers can migrate to the new Statistics Service if they would like by following SAP note 1917938.
The following screen caps will show how to implement the new Statistics Server. I also make note of what your system looks like before and after you perform this implementation (the steps to perform the migration are listed in SAP note 1917938 as well).
In the SAP HANA Studio, view the landscape and performance tab of your system and you should see the following:
Prior to migrating to the new Statistics Server, please take a backup of your system. Once that is done, please do the following:
Go to the Configuration tab and expand nameserver.ini-> statisticsserver->active
Double click on the value ‘false’ and enter the new value ‘true’ into the following popup:
After pressing ‘Save’ the Configuration tab will now show the following:
Once this is done, check the ‘Landscape’ and ‘Performance’ tabs.
As we can see, the Statistics Server is now gone. Do not restart your system during this migration. To check the status of the migration, please run the following:
SELECT * FROM _SYS_STATISTICS.STATISTICS_PROPERTIES where key = 'internal.installation.state'
Key | Value
internal.installation.state | Done (okay) since 2014-09-20 02:55:34.0360000
Do not restart your SAP HANA system until the migration is completed.
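If you want to check the result of the status query from a script, the returned value can be split into its parts. The following Python sketch is only an illustration: it assumes the value always has the shape seen in the sample row ("state (detail) since timestamp"), which is inferred from that single example, and the names parse_installation_state and migration_finished are hypothetical helpers, not part of SAP HANA.

```python
import re
from typing import NamedTuple, Optional


class InstallationState(NamedTuple):
    state: str   # e.g. "Done"
    detail: str  # e.g. "okay"
    since: str   # timestamp text, kept exactly as returned by the query


# Assumed pattern, inferred only from the sample value shown in this blog.
_STATE_RE = re.compile(
    r"^(?P<state>[^(]+?)\s*\((?P<detail>[^)]*)\)\s+since\s+(?P<since>.+)$"
)


def parse_installation_state(value: str) -> Optional[InstallationState]:
    """Split the property value into its parts; return None if it does not match."""
    match = _STATE_RE.match(value.strip())
    if match is None:
        return None
    return InstallationState(
        match["state"].strip(), match["detail"], match["since"].strip()
    )


def migration_finished(value: str) -> bool:
    """True only for the 'Done (okay)' state seen in the sample output."""
    parsed = parse_installation_state(value)
    return parsed is not None and parsed.state == "Done" and parsed.detail == "okay"
```

Any value that does not match the assumed pattern is treated as "not finished", so an unexpected state string errs on the side of not restarting the system.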
If you run into issues implementing the new statistics server then we will need to look into the SAP HANA trace files.
The traces that we can check during the implementation of the new Statistics Server are the Statistics Server trace, the Name Server trace, and the Index Server trace.
If the deployment does not work, review these trace files to pinpoint where an error occurred.
Below I have examples of trace snippets of a successful deployment of the embedded Statistics Service.
In the Statistics Server trace we will see the statistics server shutting down:
[27504]{-1}[-1/-1] 2014-09-20 02:55:37.669772 i Logger BackupHandlerImpl.cpp(00321) : Shutting down log backup, 0 log backup(s) pending
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340345 i Service_Shutdown TrexService.cpp(05797) : Disabling signal handler
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340364 i Service_Shutdown TrexService.cpp(05809) : Stopping self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340460 i Service_Shutdown TrexService.cpp(05821) : Stopping request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.340466 i Service_Shutdown TrexService.cpp(05828) : Stopping responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341478 i Service_Shutdown TrexService.cpp(05835) : Stopping channel waiter
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.341500 i Service_Shutdown TrexService.cpp(05840) : Shutting service down
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.350884 i Service_Shutdown TrexService.cpp(05845) : Stopping threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.354348 i Service_Shutdown TrexService.cpp(05850) : Stopping communication
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356233 i Service_Shutdown TrexService.cpp(05857) : Deleting console
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356240 i Service_Shutdown TrexService.cpp(05865) : Deleting self watchdog
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356260 i Service_Shutdown TrexService.cpp(05873) : Deleting request dispatcher
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356278 i Service_Shutdown TrexService.cpp(05881) : Deleting responder
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356302 i Service_Shutdown TrexService.cpp(05889) : Deleting service
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356444 i Service_Shutdown TrexService.cpp(05896) : Deleting threads
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356449 i Service_Shutdown TrexService.cpp(05902) : Deleting pools
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356454 i Service_Shutdown TrexService.cpp(05912) : Deleting configuration
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356458 i Service_Shutdown TrexService.cpp(05919) : Removing pidfile
[27172]{-1}[-1/-1] 2014-09-20 02:55:38.356515 i Service_Shutdown TrexService.cpp(05954) : System down
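To confirm from a saved trace file that the shutdown ran to the end, the Service_Shutdown lines can be scanned for the final "System down" message. The Python sketch below assumes the line layout seen in the excerpt above (thread and connection IDs, timestamp, level, component, source location, message); the regular expression and the function name shutdown_summary are illustrative assumptions, not an official trace format definition.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Assumed trace line layout, taken from the excerpt in this blog.
_LINE_RE = re.compile(
    r"^\[\d+\]\{-?\d+\}\[-?\d+/-?\d+\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<level>\w)\s+(?P<component>\S+)\s+(?P<source>\S+)\s+:\s+(?P<message>.*)$"
)


def shutdown_summary(lines: Iterable[str]) -> Tuple[bool, Optional[float]]:
    """Return (saw 'System down', seconds between first and last Service_Shutdown line)."""
    stamps = []
    system_down = False
    for line in lines:
        match = _LINE_RE.match(line.strip())
        if match is None or match["component"] != "Service_Shutdown":
            continue  # ignore other components and lines in a different format
        stamps.append(datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f"))
        if match["message"].strip() == "System down":
            system_down = True
    elapsed = (stamps[-1] - stamps[0]).total_seconds() if stamps else None
    return system_down, elapsed
```

Passing the lines of the Statistics Server trace to shutdown_summary returns whether "System down" was reached and how long the logged shutdown steps took.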
In the Name Server trace you will see it being notified that the Statistics Server is shutting down and the topology is getting updated.
An error that you might encounter in the Name Server trace is the following:
STATS_CTRL NameServerControllerThread.cpp(00251) : error installing
Please review SAP note 2006652 to assist you in resolving this.
Below is a successful topology update:
[27065]{-1}[-1/-1] 2014-09-20 02:55:34.050358 i STATS_CTRL NameServerControllerThread.cpp(00489) : forcing log backup...
[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051287 i STATS_CTRL NameServerControllerThread.cpp(00494) : log backup done. Reply: [OK]
--
[OK]
--
[27065]{-1}[-1/-1] 2014-09-20 02:55:34.051292 i STATS_CTRL NameServerControllerThread.cpp(00497) : stopping hdbstatisticsserver...
[27065]{-1}[-1/-1] 2014-09-20 02:55:34.054522 i STATS_CTRL NameServerControllerThread.cpp(00522) : waiting 5 seconds for stop...
[27426]{-1}[-1/-1] 2014-09-20 02:55:34.323824 i Service_Shutdown TREXNameServer.cpp(03854) : setStopping(statisticsserver@mo-517c85da0:30005)
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054777 i STATS_CTRL NameServerControllerThread.cpp(00527) : hdbstatisticsserver stopped
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.054796 i STATS_CTRL NameServerControllerThread.cpp(00530) : remove service from topology...
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056706 i STATS_CTRL NameServerControllerThread.cpp(00534) : service removed from topology
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.056711 i STATS_CTRL NameServerControllerThread.cpp(00536) : remove volume 2 from topology...
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058031 i STATS_CTRL NameServerControllerThread.cpp(00540) : volume removed from topology
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.058038 i STATS_CTRL NameServerControllerThread.cpp(00542) : mark volume 2 as forbidden...
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059263 i STATS_CTRL NameServerControllerThread.cpp(00544) : volume marked as forbidden
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.059269 i STATS_CTRL NameServerControllerThread.cpp(00546) : old StatisticsServer successfully removed
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.060823 i STATS_CTRL NameServerControllerThread.cpp(00468) : removing old section from statisticsserver.ini: statisticsserver_general
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.072798 i STATS_CTRL NameServerControllerThread.cpp(00473) : making sure old StatisticsServer is inactive statisticsserver.ini: statisticsserver_general, active=false
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083604 i STATS_CTRL NameServerControllerThread.cpp(00251) : installation done
[27065]{-1}[-1/-1] 2014-09-20 02:55:39.083620 i STATS_CTRL NameServerControllerThread.cpp(00298) : starting controller
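When the Name Server trace is large, it can help to filter it down to the STATS_CTRL messages and look for the two outcomes mentioned above: "error installing" or "installation done". The following Python sketch does this on a list of trace lines; it relies only on the message texts quoted in this blog, and the function names stats_ctrl_messages and installation_outcome are hypothetical.

```python
from typing import Iterable, List


def stats_ctrl_messages(lines: Iterable[str]) -> List[str]:
    """Collect the message text of every STATS_CTRL trace line."""
    messages = []
    for line in lines:
        if "STATS_CTRL " not in line:
            continue
        # The message follows the first " : " separator after the source location.
        _, separator, message = line.partition(" : ")
        if separator:
            messages.append(message.strip())
    return messages


def installation_outcome(lines: Iterable[str]) -> str:
    """Return 'error', 'done', or 'unknown' based on the STATS_CTRL messages."""
    messages = stats_ctrl_messages(lines)
    if any(message.startswith("error installing") for message in messages):
        return "error"
    if "installation done" in messages:
        return "done"
    return "unknown"
```

An "error" result is the cue to review SAP note 2006652, while "unknown" simply means neither message has been written to the trace yet.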
The Statistics Service is a set of tables and SQL procedures, so if you check the Index Server trace you will see the deployment of the various SQL procedures. If an error occurs during the SQL execution, this is where it will show up.
Here is an example of a successful deployment:
upsert _SYS_STATISTICS.statistics_schedule (id, status, intervallength, minimalintervallength, retention_days_current, retention_days_default) values (6000, 'Idle', 300, 0, 0, 0) with primary key;
END;
[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802118 i TraceContext TraceContext.cpp(00718) : UserName=
[27340]{-1}[-1/-1] 2014-09-20 02:55:29.802110 i STATS_WORKER ConfigurableInstaller.cpp(00168) : creating procedure for 6000: CREATE PROCEDURE _SYS_STATISTICS.Special_Function_Email_Management (IN snapshot_id timestamp, OUT was_cancelled integer) LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS
-- snapshot_id [IN]: snapshot id
-- was_cancelled [OUT]: indicator whether the specialfunction has been cancelled
l_server string;
If you suspect that your new Statistics Service is not running, you can check under the Performance -> Threads tab.
Or you can run the following query:
select * from "PUBLIC"."M_SERVICE_THREADS" where thread_type like '%ControllerThread (StatisticsServer)%'
If for some reason you need to go back to the original Statistics Server, you will not be able to just change the value of nameserver.ini-> statisticsserver->active back to false. Instead, you will have to perform a recovery to a point in time before you performed the migration.