on 2015 Jun 26 6:29 AM
SAP SQL Anywhere 16.0.0.2043 DB Auditing is OFF mode=synchronous
I am using High Availability as follows:- partner db on local, partner db and arbitor on remote.
local started first runs as primary remote mirrors
I have all three servers talking and all is well for a while (A few hours at most). Within a few hours the local partner (primary) disconnects and the error message "logs do not match" is logged on restart (see below).
Copying the .db and .log from remote to local means a restart will work (until the next mismatch).
Any ideas
TIA
Ephraim
Below is the log from the server
I. 06/26 10:13:43. TCPIP link started successfully I. 06/26 10:13:43. Now accepting requests I. 06/26 10:13:43. Transaction log: ks.log... I. 06/26 10:13:43. Database "ks" transaction log starts at offset 177760696 I. 06/26 10:13:43. Database "ks" mirroring: local status for this server: role=mirror, state=synchronizing, sequence=45, yielding=N I. 06/26 10:13:43. Database "ks" mirroring: determining mirror role ... I. 06/26 10:13:44. Database "ks" mirroring: connected to partner "mirror_ks2" I. 06/26 10:13:44. Database "ks" mirroring: partner status for this server: role=mirror, state=synchronizing, sequence=46, yielding=N I. 06/26 10:13:44. Database "ks" mirroring: connected to arbiter "ks_arbiter" I. 06/26 10:13:44. Database "ks" mirroring: arbiter status for this server: role=mirror, state=synchronizing, sequence=46, yielding=N I. 06/26 10:13:44. Database "ks" mirroring: synchronizing ... I. 06/26 10:13:44. Database "ks" transaction log current offset is 181865872 I. 06/26 10:13:44. Database "ks" (ks.db) started as mirror at Fri Jun 26 2015 10:13 I. 06/26 10:13:44. Database "ks" mirroring: failed getting log pages E. 06/26 10:13:44. Database "ks" mirroring: database is not compatible with primary; files must be replaced E. 06/26 10:13:45. Transaction log does not match log on mirror partner E. 06/26 10:13:45. Transaction log does not match log on mirror partner E. 06/26 10:13:45. Transaction log does not match log on mirror partner I. 06/26 10:13:45. Database "ks" (ks.db) stopped at Fri Jun 26 2015 10:13 I. 06/26 10:13:45. Database "ks" mirroring: disconnected from partner "mirror_ks2" I. 06/26 10:13:45. Database "ks" mirroring: disconnected from arbiter "ks_arbiter"
The mismatch is being caused by this checkpoint:
I. 06/29 08:59:11. Starting checkpoint of "ks" (ks.db) at Mon Jun 29 2015 08:59
When you suspend the server like this, and then wake it up, it takes a moment for the connections to notice they've been severed. However, the queued checkpoint kicks in immediately. Thus the server is doing a checkpoint before it realizes it has lost quorum, and so it has a checkpoint that the mirror doesn't. This is an unrecoverable log mismatch.
You can work around this problem by shutting down the server before suspending the machine, using a "ALTER DATABASE SET PARTNER FAILOVER" to force a failover, or moving the primary to a machine that isn't suspended regularly.
Unfortunately, the setup you're using isn't recommended for high availability. You might want to consider using a replication or synchronization solution, such as Mobilink.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I am uncertain if this will be changed in the future. It is odd that an HA system would contain a server which suspends (or hibernates). The HA system is meant for "High Availability" after all and IMHO suspending one of the computers goes directly against the idea of maintaining the system up 24x7.
We will add to the backlog a task to investigate a solution. Until such time that a solution is implemented (and if one is ever implemented), a work-around (manual fail-over before suspending) has been provided, and we can add a note to the documentation.
It is odd that an HA system would contain a server which suspends (or hibernates).
Yes, I certainly agree on that point and your further arguments. It's just that I was not sure whether that "checkpoint problem" could also happen when a server that acts as primary goes down unexpectedly (say, the server is shutdown accidentally or the like) - then it would be bad if a log mismatch would happen.
This checkpoint problem cannot happen if the server goes down due to machine shutdown, process kill, or server shutdown. It is unique to the process being suspended then resumed. Or at least it should be :).
If the local machine Windows 7 "goes to sleep" as well as the Router/Network - could this cause a log file mismatch?
When I stopped working at 10:20 approx. last night I was still connected to the local primary. I see a switchover last night after working the whole day on the system without an issue. Logging in this morning at 08:59 I am connected to the remote and the local server cannot start requests a new copy of the files Surely a High Availability system must be able to recover from such an outage!!
The local log (annotated) is attached below
S1 log (local)
S1 log (local)
I. 06/28 22:13:30. 117
I. 06/28 22:13:32. BPT
I. 06/28 22:13:32. 648.0000
I. 06/28 22:13:44. BPT
I. 06/28 22:13:44. 648.0000
--------------------------------->> end of days work local primary
Local/Router power down over night so a switchover make sense
---------------------------------->>wake up thr machine
I. 06/29 08:59:11. Starting checkpoint of "ks" (ks.db) at Mon Jun 29 2015 08:59
I. 06/29 08:59:16. Database "ks" mirroring: disconnected from arbiter "ks_arbiter"
I. 06/29 08:59:32. Database "ks" mirroring: disconnected from partner "mirror_ks2"
E. 06/29 08:59:32. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
----------------------------------------->.event handler tries to notify me of switchover
I. 06/29 08:59:32. dbo.xp_startsmtp() failed: (return code = 102)
I. 06/29 08:59:32. dbo.xp_startsmtp() failed: (return code = 102)
I. 06/29 08:59:32. dbo.xp_startsmtp() failed: (return code = 102)
E. 06/29 08:59:32. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
I. 06/29 08:59:32. Database "ks" mirroring: restarting to determine mirror role
I. 06/29 08:59:33. Finished checkpoint of "ks" (ks.db) at Mon Jun 29 2015 08:59
I. 06/29 08:59:33. Database "ks" (ks.db) stopped at Mon Jun 29 2015 08:59
I. 06/29 08:59:33. Starting database "ks" (c:\\server1\\ks.db) at Mon Jun 29 2015 08:59
I. 06/29 08:59:33. Performance warning: Database file "c:\\server1\\ks.db" consists of 3 disk fragments
I. 06/29 08:59:33. Database recovery in progress
I. 06/29 08:59:33. Last checkpoint at Mon Jun 29 2015 08:59
I. 06/29 08:59:33. Database "ks" transaction log offset after last checkpoint is 182175412
I. 06/29 08:59:33. Checkpoint log...
I. 06/29 08:59:34. Transaction log: ks.log...
I. 06/29 08:59:34. Database "ks" transaction log starts at offset 177760696
I. 06/29 08:59:34. Database "ks" mirroring: local status for this server: role=primary, state=synchronized, sequence=6, yielding=N
I. 06/29 08:59:34. Database "ks" mirroring: determining mirror role ...
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:34. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:35. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:36. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:37. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:37. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
E. 06/29 08:59:37. Database "ks" mirroring: mirror connection to server "mirror_ks2" failed
E. 06/29 08:59:37. Database "ks" mirroring: mirror connection to server "ks_arbiter" failed
I. 06/29 08:59:38. Database "ks" mirroring: connected to partner "mirror_ks2"
I. 06/29 08:59:38. Database "ks" mirroring: partner status for this server: role=mirror, state=synchronizing, sequence=7, yielding=N
I. 06/29 08:59:38. Database "ks" mirroring: connected to arbiter "ks_arbiter"
I. 06/29 08:59:38. Database "ks" mirroring: arbiter status for this server: role=mirror, state=synchronizing, sequence=7, yielding=N
I. 06/29 08:59:38. Database "ks" mirroring: synchronizing ...
I. 06/29 08:59:38. Database "ks" transaction log current offset is 182175412
I. 06/29 08:59:38. Database "ks" (ks.db) started as mirror at Mon Jun 29 2015 08:59
I. 06/29 08:59:38. Database "ks" mirroring: failed getting log pages
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
E. 06/29 08:59:38. Database "ks" mirroring: database is not compatible with primary; files must be replaced
E. 06/29 08:59:39. Transaction log does not match log on mirror partner
E. 06/29 08:59:39. Transaction log does not match log on mirror partner
E. 06/29 08:59:39. Transaction log does not match log on mirror partner
I. 06/29 08:59:39. Database "ks" (ks.db) stopped at Mon Jun 29 2015 08:59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I. 06/29 08:59:39. Database "ks" mirroring: disconnected from partner "mirror_ks2"
I. 06/29 08:59:39. Database "ks" mirroring: disconnected from arbiter "ks_arbiter"
I. 06/29 08:59:39. Database server shutdown due to incompatible files for database mirroring
W. 06/29 08:59:42. TCPIP listener on IP address (::):7771 is exiting
W. 06/29 08:59:42. TCPIP listener on IP address 0.0.0.0:7771 is exiting
I. 06/29 08:59:42. Database server stopped at Mon Jun 29 2015 08:59
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Just to understand:
If the local server should act as preferred primary server but will go down regularly (during a planned power-down at night), then wouldn't it be appropriate to force a FAILOVER (via "ALTER DATABASE SET PARTNER FAILOVER") and then to stop the local database server (via the dbstop utility) just before the power-down? (Note: I don't claim that this should be necessary for a successful switchover, it would just be clearer IMHO if the power-down is a planned one...).
Does the console log of the local server show that the server has been stopped deterministically (i.e. had a clean shutdown)?
As others have already asked: What do the console logs of the partner and arbiter server show around the time the local server goes down (they should state that a switchover has taken place)? And what do they show around the time the local server tries to re-connect?
I process during the day (to the local primary) and use the remote (mirror) for web access, so the web db gets updated automatically (it behaves as a backup db as well). The web server is on a vps (lower performance machine and is not yet live) and making it the primary has performance implications (my next issue once this one is resolved is a degradation in performance one). I was not aware of the FAILOVER command which I will try tonight. Console logs for Arbiter/Remote indicate a normal switchover in the evening and an attempted reconnect by the local in the morning (which fails due to log mismatch).
> it behaves as a backup db as well
A High Availability secondary server should not be relied upon to act as a backup. This is discussed in several Help topics; e.g. Regular back ups are still required.
User | Count |
---|---|
60 | |
10 | |
7 | |
7 | |
6 | |
6 | |
5 | |
5 | |
5 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.