on 2017 Apr 25 9:12 AM
Update: The title was originally "Should TRUNCATE TABLE be avoided on HA 16.0.0.2419?" but it's not just TRUNCATE.
The first SELECT on the secondary or read-only scale-out copy seems to get "Rollback occurred due to deadlock during prefetch" but the second select works.
It doesn't seem to matter if the table is already empty when the TRUNCATE TABLE runs.
...or is this a dbisql issue?
-------------------------------------- -- on primary... TRUNCATE TABLE t; -------------------------------------- -- on secondary or copy... FAILS... SELECT MAX ( inserted_at ) FROM t; Could not execute statement. Rollback occurred due to deadlock during prefetch SQLCODE=-684, ODBC 3 State="40001" Line 1, column 1 (Continuing after error) -------------------------------------- -- on secondary or copy, again... WORKS... SELECT MAX ( inserted_at ) FROM t; MAX(t.inserted_at) ----------------------- (NULL) (1 rows) Execution time: 0.005 seconds
The "Rollback occurred due to deadlock during prefetch" error is occurring when the mirror or copy node must do a rollback on a user connection since that user connection is blocking applying an operation that was applied to the primary and needs to be applied to the mirror or copy node. Re-executing the query should succeed after the TRUNCATE has been applied since the user connection is no longer blocking the TRUNCATE.
I believe DELETE gets per-row write locks as well as a shared schema lock, while TRUNCATE TABLE gets an exclusive table lock. If the isolation_level is 0, I think scanning the table would only require a shared schema lock, and thus would not block on a DELETE, but would block on a TRUNCATE TABLE.
Note that you can get "Rollback occurred due to deadlock during prefetch" error on the mirror or copy node anytime a user connection blocks applying operations already applied on the primary for a few seconds. There are cases where this can occur with a DELETE on the primary and a SELECT on the mirror or copy node. My guess is that TRUNCATE TABLE just makes this significantly more likely.
I hope this helps, Ian
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I think this sequence could occur with:
ConnM: user connection to mirror
ConnP: user connection to primary
ConnM: starts doing: SELECT MAX ( inserted_at ) FROM t; -- this will get a shared schema lock on t
ConnP: TRUNCATE TABLE t; -- succeeds immediately
Mirror now attempts to apply "TRUNCATE TABLE t" - and blocks on shared schema lock. If it remains blocked for a few seconds, the Mirror will rollback ConnM, resulting in your error
If the "SELECT MAX ( inserted_at ) FROM t" always takes less than a second on the mirror, then off the top of my head, I can't think of the exact sequence that can cause this error.
It may not be (just) TRUNCATE, here's a simple repro...
-------------------------------------- -- dbisql on primary... SELECT @@VERSION; -- 16.0.0.2419 BEGIN DROP TABLE t; EXCEPTION WHEN OTHERS THEN END; CREATE TABLE t ( pkey BIGINT NOT NULL DEFAULT AUTOINCREMENT PRIMARY KEY, inserted_at TIMESTAMP NOT NULL DEFAULT CURRENT TIMESTAMP, padding VARCHAR ( 1000 ) NOT NULL ); CHECKPOINT; WAITFOR DELAY '00:00:10'; SELECT CURRENT TIMESTAMP, ' Step 1 done'; -------------------------------------- -- dbisql on secondary... -- user waits for "Step 1" to appear on primary WAITFOR DELAY '00:00:10'; SELECT CURRENT TIMESTAMP, ' Step 2 started'; SELECT MAX ( inserted_at ) FROM t; -- everything looks ok -------------------------------------- -- dbisql on primary... repeat -------------------------------------- -- dbisql on secondary... repeat -- result may be funky, with this in Messages tab: Could not execute statement. Rollback occurred due to deadlock during prefetch SQLCODE=-684, ODBC 3 State="40001" Line 4, column 1 (Continuing after error) Execution time: 0.002 seconds
Why does a read-only connection EVER get involved in a deadlock situation? Read-only connections might get blocked, but they don't acquire locks, so BY DEFINITION they can't be deadlocked?
It's not the read-only dbisql connection that is applying the HA stuff coming from the primary, AFAIK it is done by background recovery connections... so, maybe the read-only dbisql connection gets blocked, fine... but why doesn't it just wait?
If SQL Anywhere wants to fix a deadlock, fine, do it in the background, don't KILL MY REPORT THAT'S DUE IN SUPERIOR COURT TOMORROW MORNING! (ok, exaggerating for effect, but...)
It's starting to look like a mirror/copy database is not a reliable place to run reports... just sayin' 🙂
Read-only connections might get blocked, but they don't acquire locks
What do you mean by read-only connections? Those that just query but don't modify data? - If so, I don't think this is true. Depending on isolation level, they might acquire several kinds of locks (short-term read locks, long-term read locks, phantom locks).
Even at isolation level 0, schema locks are acquired (what would at least conflict with the exclusive lock needed for a TRUNCATE TABLE...), and those are held until the transaction ends. (Therefore I asked for the transaction boundaries in your sample.)
Volker is correct. Read only connections can and do acquire locks on a mirror or copy node (they must). They will always acquire at least shared schema locks when they reference any table.
Note that there doesn't need to be a deadlock to get the "Rollback occurred due to deadlock during prefetch" error. Operations that were committed to the primary need to be applied to the mirror and copy node, and if attempting to apply these operations block for a few seconds (not necessarily deadlock), you'll see the error. Unfortunately the error message isn't very accurate in this case.
While the mirror and copy node may be read-only to user connections, it is a read-write database and the operations from the primary need to be applied on a timely basis. Operations not getting applied for hours after they were committed on the primary would be very poor behavior. Requesting a higher isolation level on the mirror or copy node and getting inconsistent results would also be wrong. So locks must be respected, and to avoid applying operations blocking indefinitely, user connections can get rolled back.
@Ian: I accidentally deleted my earlier comment praising you, so here's Tiffany Doggett to replace it :)...
User | Count |
---|---|
69 | |
9 | |
8 | |
7 | |
7 | |
6 | |
5 | |
4 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.