on 2018 Mar 21 10:06 AM
I'm receiving a lot "Thread deadlocked occurred" in the messages log. I'm trying to figure out what could be the best combination of Multiprogramming level (MPL) configuration and max_query_task for my database servers. There are several DBs running on Windows servers with 8 CPU. The MPL varies up and down indicating the server is unable to correctly determine the best value.
The problem is when the "thread deadlock" arrive, the server adjust incrementing the MPL but in some points this situation turn worst, causing a bottleneck and receiving the error "Connection terminated abnormally; error code 121". I even think that some threads could be being orphaned or hanged that we have to setup the option "Disconnects inactive connections" (-ti) in 5 minutes.
These are the actual values:
SQL Anywhere Network Server Version 12.0.1.4436, Processors detected: 8, Processor limit (licensed processors): 7, Multiprogramming level: minimum:7, current:20, maximum:80, AutoMultiProgrammingLevel = 1 (Automatic tuning of multiprogramming level is enabled), max_query_tasks = 0 (I'm thinking setup in 4 following best practice with 8 processors)
Thanks for your comments.
Request clarification before answering.
There is a big difference between "not enough threads to satisfy the load" and "all threads are blocked".
For example, it is possible to have dreadfully slow performance because the multiprogramming level is so low that hundreds of connections are waiting for threads, without getting "all threads are blocked":
The reason you don't get "all threads are blocked" is because there are a few threads actually doing work... "all threads are blocked" means just that, all the threads are blocked.
Glenn Paulley's article What exactly is thread deadlock? is worth reading.
Important points to note: "All threads are blocked" is not easy to get, it isn't really "deadlock" in the sense of "deadly embrace", and chances are good that isolation level is involved.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
AFAIK there is no such message text as "Thread deadlock occurred".
Please tell us exactly what the message says, and the SQLCODE or SQLSTATE.
There are two likely culprits, both vastly different in their usual causes (the second one usually isn't a "deadlock", just a "running out of resources")...
Deadlock detected Error constant SQLE_DEADLOCK ODBC 2 state 40001 ODBC 3 state 40001 Severity 13 SQLCODE -306 SQLSTATE 40001 Sybase error code 1205 You attempted to read or write a row and it is locked by another user. Also, the other user is blocked directly or indirectly on your own transaction. This is a deadlock situation and your transaction has been chosen as the one to rollback. All threads are blocked Error constant SQLE_THREAD_DEADLOCK ODBC 2 state 40001 ODBC 3 state 40001 Severity 13 SQLCODE -307 SQLSTATE 40W06 Sybase error code 1205 You attempted to read or write a row and it is locked by another user. Also, all other threads (see the -gn server option) are blocked waiting for a lock to be released. This is a deadlock situation and your transaction has been chosen as the one to rollback.
If it's the SQLCODE -307 then first try increasing the -gn. Here's a tip from the upcoming Foxhound 5 Help:
Performance Tip: If Alert #1 and All Clear #1 messages appear repeatedly, along with up-and-down changes in the multiprogramming level (Max Req), that may indicate the server is thrashing while it tries to determine the best value of Max Req.
If the computer is dedicated to running this SQL Anywhere server and no other process of importance, consider disabling the dynamic tuning of the multiprogramming level by specifying the dbsrv -gna 0 and -gn options to permanently set a fixed value. With a dedicated computer there may be no need for SQL Anywhere to perform extra work changing the multiprogramming level up and down.
You can temporarily disable the dynamic tuning of the multiprogramming level as follows:
-- Set fixed multiprogramming level. CALL sa_server_option ( 'AutoMultiProgrammingLevel', 'NO' ); CALL sa_server_option ( 'MinMultiProgrammingLevel', '100' ); CALL sa_server_option ( 'MaxMultiProgrammingLevel', '100' ); CALL sa_server_option ( 'CurrentMultiProgrammingLevel', '100' ); SELECT @@VERSION, PROPERTY ( 'AutoMultiProgrammingLevel' ) AS "-gna", PROPERTY ( 'MultiProgrammingLevel' ) AS "-gn", PROPERTY ( 'MinMultiProgrammingLevel' ) AS "-gnl", PROPERTY ( 'MaxMultiProgrammingLevel' ) AS "-gnh", PROPERTY ( 'CurrentMultiProgrammingLevel' ) AS "current"; @@VERSION, -gna, -gn, -gnl, -gnh, current '16.0.0.2512', '0', '100', '100', '100', '100' |
-- Enable dynamic tuning of multiprogramming level. CALL sa_server_option ( 'AutoMultiProgrammingLevel', 'YES' ); CALL sa_server_option ( 'MinMultiProgrammingLevel', '8' ); CALL sa_server_option ( 'MaxMultiProgrammingLevel', '100' ); CALL sa_server_option ( 'CurrentMultiProgrammingLevel', '100' ); SELECT @@VERSION, PROPERTY ( 'AutoMultiProgrammingLevel' ) AS "-gna", PROPERTY ( 'MultiProgrammingLevel' ) AS "-gn", PROPERTY ( 'MinMultiProgrammingLevel' ) AS "-gnl", PROPERTY ( 'MaxMultiProgrammingLevel' ) AS "-gnh", PROPERTY ( 'CurrentMultiProgrammingLevel' ) AS "current"; @@VERSION, -gna, -gn, -gnl, -gnh, current '16.0.0.2512', '1', '100', '8', '100', '100' |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Breck, thanks for the information, the user did not inform me of any errors, I'm just seeing these messages in the message log and, for my opinion, it is a coincidence that error 121 occurs after a thread deadlock. I'm trying to minimize these abnormal disconnections by preventing processes from handling much intra-query parallelism and the MPL going higher than the server can handle.
Here is an extract of the messagelog (-o messagelog):
Command: findstr "parallelism error Adjusting deadlocked Multiprogramming adjusted warning caching" messagelog
I. 03/19 15:41:45. Adjusting the multiprogramming level to 7 I. 03/19 15:45:42. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/19 15:45:42. Adjusting the multiprogramming level to 18 I. 03/19 15:45:42. Thread deadlocked occurred. Adjusted the multiprogramming level to 18 I. 03/19 15:47:23. Adjusting the multiprogramming level to 7 I. 03/19 15:48:37. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/19 15:48:37. Adjusting the multiprogramming level to 18 I. 03/19 15:48:37. Thread deadlocked occurred. Adjusted the multiprogramming level to 18 I. 03/20 02:00:45. Cache size adjusted to 825212K I. 03/20 02:01:45. Cache size adjusted to 830216K I. 03/20 02:02:45. Cache size adjusted to 831700K I. 03/20 02:03:45. Cache size adjusted to 831844K I. 03/20 14:27:40. Adjusting the multiprogramming level to 7 I. 03/20 14:28:10. Multiprogramming level increased by 2 to handle intra-query parallelism I. 03/20 14:28:10. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/20 14:28:10. Adjusting the multiprogramming level to 20 I. 03/20 14:28:10. Thread deadlocked occurred. Adjusted the multiprogramming level to 20 I. 03/20 14:28:21. Connection terminated abnormally; error code 121 I. 03/20 15:13:00. Adjusting the multiprogramming level to 7 I. 03/20 15:13:04. Adjusting the multiprogramming level to 17 I. 03/20 15:13:04. Thread deadlocked occurred. Adjusted the multiprogramming level to 17 I. 03/20 15:14:11. Adjusting the multiprogramming level to 10 I. 03/20 15:14:45. Adjusting the multiprogramming level to 8 I. 03/20 15:14:46. Adjusting the multiprogramming level to 18 I. 03/20 15:14:46. Thread deadlocked occurred. Adjusted the multiprogramming level to 18 I. 03/20 15:15:22. Connection terminated abnormally; error code 121 I. 03/20 15:15:31. Connection terminated abnormally; error code 121 I. 03/21 02:04:29. Adjusting the multiprogramming level to 7 I. 03/21 07:36:43. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/21 07:36:43. Adjusting the multiprogramming level to 18 I. 03/21 07:36:43. Thread deadlocked occurred. Adjusted the multiprogramming level to 18 I. 03/21 09:28:34. Adjusting the multiprogramming level to 7 I. 03/21 09:29:57. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/21 09:29:57. Adjusting the multiprogramming level to 18 I. 03/21 09:29:57. Thread deadlocked occurred. Adjusted the multiprogramming level to 18 I. 03/21 13:56:32. Cache size adjusted to 832748K I. 03/21 13:57:32. Cache size adjusted to 832976K I. 03/21 15:42:38. Cache size adjusted to 832908K I. 03/21 16:00:39. Cache size adjusted to 832824K I. 03/21 16:10:39. Cache size adjusted to 832752K I. 03/21 17:10:42. Cache size adjusted to 832632K I. 03/21 17:12:42. Cache size adjusted to 832548K I. 03/21 17:54:44. Cache size adjusted to 832476K I. 03/22 02:04:49. Adjusting the multiprogramming level to 7 I. 03/22 08:01:45. Multiprogramming level increased by 1 to handle intra-query parallelism I. 03/22 08:01:45. Adjusting the multiprogramming level to 18 I. 03/22 08:01:45. Thread deadlocked occurred. Adjusted the multiprogramming level to 18
I didn't setup all DBs monitoring with Foxhound but in the DBs that I have didn't receive the Alert#1 "unresponsive server", some times we get samples interval with "unscheduled request" but not arrive the Alert#14.
Have you tried to find out what connections make use of intra-query parallelism and to prevent that by using max_query_tasks = 1?
Where are these messages coming from?
I suspect they are coming from MESSAGE statements in application programs or user-written stored procedures.
I am fairly certain they are NOT coming from the SQL Anywhere server, especially the one with the spelling mistake "Thread deadlocked occurred".
A Hexplorer binary search of dbserv16.exe and an empty V16 *.db file did NOT find any occurrence of "deadlocked". The closest match was "Thread deadlock detected, see -gn option".
The following SELECTs do a search on all stored procedure code...
BEGIN SELECT * FROM SYSTABLE WHERE view_def LIKE '%deadlocked%'; SELECT * FROM SYSPROCEDURE WHERE proc_defn LIKE '%deadlocked%'; SELECT * FROM SYSTRIGGER WHERE trigger_defn LIKE '%deadlocked%'; SELECT * FROM SYSEVENT WHERE source LIKE '%deadlocked%'; END;
Arrggh! I missed that string entirely... Hexplorer searches are case-sensitive (of course) and I was searching for "Thread" (stupid)...
...BUT actually the string in dbserv16.dll is this: "thread deadlock occured. Current num="
The grammer, er, grammar may be OK but the spelling isn't: occured needs two r's 🙂
Ok, Volker: I'm preparing a test environment to play with the MPL. The connections that use intra-query parallelism are randomly, it depend of the workload. I have a dedicated server to 1 DB in production, I configured the max_query_tasks option in 4 and I think that is helping to lead less thread deadlock but at the end everything depends on the concurrence and throughput.
Breck: I have setup this parameters in the start up (-zp -zl -zt -gns -cs -o d:\\PATH\\messagelog) so I get details in the messagelog file, so them coming from the SQL Anywhere (Ver 12.0.1). I don't know why the spelling is wrong but definitely are coming from SQL Anywhere, did you check the dbsrv12.exe?
I searched dbserv12.dll for "hread dead" in both the Bin32 and Bin64 folders, with only the following hits (in different order in the two binaries)...
Bin32 Thread deadlock detected, see -gn option thread deadlock occured. Current num= Bin64 thread deadlock occured. Current num= Thread deadlock detected, see -gn option
Google gives zero hits for "Thread deadlocked occurred"... which would be impossible IMO, if SQL Anywhere produces this message.
Did you search the SYSPROCEDURE and other tables?
User | Count |
---|---|
33 | |
21 | |
16 | |
8 | |
7 | |
6 | |
5 | |
4 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.