on 2015 Jun 12 1:43 PM
I'm trying to investigate a server suffering very occasional freeze-ups where the whole database becomes unresponsive. The customer has been resolving the issue by restarting the db engine, but with over 100 users this is hardly a long term solution. Occasionally the database frees itself up after some minutes, before a restart has been initiated. Trying to investigate at the time the freeze-up happens isn't possible as the system is, well, frozen and you can't connect by any method.
I have put on request level logging for SQL and BLOCKS in the hope that it may show if a long running process is blocking everything else. In the most recent freeze-up, the system was paused for around 13 minutes and then came back to life. This is reflected in the request log (viewed with sa_get_request_times() ) that shows a gap in activity at the relevant time (and lots of statements that all took approx 800,000 milliseconds - ie 13.5 minutes - to complete). What I can't see in that output is any data relating to blocking (or its absence) - the output from the procedure doesn't seem to have any relevant information. Nor does the only other procedure I am aware of for analysing request log data - sa_get_request_profile.
So in summary - having enabled request level logging of BLOCKS - how do I use it?
Many thanks.
v10.0.1.4310 (yes I know - but I don't see anything in v16 either)
After running the sa_get_request_times
procedure there should be a global temporary table (owned by the DBO user) called satmp_request_block
that contains the information you're looking for.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Thanks Mikel - I've checked that output and the good news is that my issue isn't being caused by blocking - the bad news is I now have no clue as to what else might be doing it 😞
I've added a note to DCX about that table as it isn't mentioned at all in the help documentation.
Are any of the statements that take 13 minutes to complete expected to take that long? If the multiprogramming level is low enough, all of the workers could have requests assigned to them (which would ‘block’ new requests until one of them completed).
What is the -gn value currently set to? When the server is running normally, what does SELECT PROPERTY('ActiveReq')
return?
If you would like to troubleshoot the issue while the server is unresponsive, having a connection open with the dedicated_task option set to ‘On’ may help.
How many logical processors are being used by SQL Anywhere? Version 10 is end of support, and there are some issues regarding intra-query parallelism that have been fixed in 12 & 16, but not 10. If this was the case, setting max_query_tasks=1 may help.
...and please note that "ActiveReq" will not count worker threads used for proxy connections (or maybe for external procedures, don't know for v10 as that does not have external environments like v11 and above) - compare that other question.
Of course that will not matter if it's not an issue with the MPL at all.
Thanks for following this up. Nothing should take anything like 13 minutes or even 13 seconds. There are 4 logical processors (it's a virtual machine), and there is no explicit -gn setting so it will presumably use the default of 20.
Causal checking of normal usage puts active tasks between 2 & 7 (average 2.4). There aren't any remote servers in use.
I've put an event in place to track the number of active tasks and I'll see if that sheds any light on the issue.
(In the meantime I'll see what I can do to accelerate the plan for moving to v16!)
What I have seen so far for SQLA to become temporarily unresponsive was a full table scan on a huge table or the growth of the database file blocking until the growing finished...So in a VM environment I would have a look at IO.
It might be a good idea to check if the VM infrastructure logged any events at the time of freeze up. If someone takes a snapshot of the VM or the VM is moved between physical hosts (vMotion in VMWare), the database would be unresponsive.
User | Count |
---|---|
66 | |
10 | |
10 | |
10 | |
10 | |
8 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.