on 2021 Feb 10 3:53 PM
We noticed during some testing that SQL Remote will send data that has not yet been checkpointed.
We ran a disaster recovery scenario in which we forced a crash of a database involved in replication while it was applying replication messages. We recovered the database by unloading and reloading it, then setting the log offsets appropriately. At that point the offsets pointed to the end of the log file that was in service when the disaster occurred. Many transactions in that log file had not been applied to the crashed database, since it was not shut down cleanly.
When we ran SQL Remote, it sent those unapplied transactions to the remotes. The end result, of course, was that the remotes had data the consolidated did not have, which is of course a problem.
How do you recover from this? I've asked a related question regarding applying log files to a rebuilt database.
Thanks
Jim
Well, IMHO, given the steps you took, it seems quite clear that there's a discrepancy between the data published by SQL Remote (based on the offline log) and the database contents itself, and that the database itself cannot notice this discrepancy...
FWIW, even using "DBREMOTE -u" (i.e. only processing transactions from offline logs) would not, in my understanding, prevent that situation given your steps. However, in that case you might have gone back to the last backup and then used the current log to bring the database up to date (unless that would fail again due to the file size limitations). With v17, you might use point-in-time recovery here.
Thanks for your comments.
You got that right: the -u option plus the last backup most likely would have avoided this issue.
In this case we would have started with the last backup, applied all log files from before the crash, unloaded and reloaded, set the offsets to the log file before the crash, and placed the database back in operation. Theoretically this would have caused all remotes to resend, because no fresh/new replication messages would apply.
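A rough sketch of that recovery path, under heavy assumptions (file names, paths, and a v16 engine are all my own illustration; the engine's -a switch applies one named offline transaction log and then shuts down):

```shell
# Hypothetical recovery sketch -- all names and paths are assumptions.
# 1. Restore the last full backup of the consolidated database.
cp /backup/full/cons.db /recover/cons.db

# 2. Apply each offline transaction log from BEFORE the crash, in order.
#    "dbeng16 db-file -a log-file" applies one log and then stops the engine.
for log in /backup/logs/210201AA.log /backup/logs/210205AA.log; do
    dbeng16 /recover/cons.db -a "$log"
done

# 3. Set the log offsets to match the pre-crash log (e.g. with the dblog
#    utility; exact switches omitted here) and place the database back
#    in operation.
```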
We have changed our dbremote process to: incremental backup, receive, incremental backup, send, incremental backup, with the -u option on dbremote. It is important to note that our consolidated does nothing but support replication, including a couple of post-receive hooks to process some incoming data.
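As a minimal sketch, that revised cycle might look like the following batch steps. The connection string and directories are assumptions, and only -u is taken from this thread; the poster splits receive and send into separate dbremote passes, so check the dbremote option list for the send-only/receive-only switches before copying:

```shell
# Hypothetical sketch of the revised consolidated-side replication cycle.
CONN="UID=dba;PWD=secret;SERVER=cons;DBN=cons"   # assumed connection string

dbbackup -c "$CONN" -t -r /backup/logs   # incremental: back up log, rename/restart it
dbremote -c "$CONN" -u                   # receive pass (backed-up logs only)
dbbackup -c "$CONN" -t -r /backup/logs   # incremental backup again
dbremote -c "$CONN" -u                   # send pass (backed-up logs only)
dbbackup -c "$CONN" -t -r /backup/logs   # final incremental backup
```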
I do recommend that SAP consider a couple of changes:
1) Don't grow a dbspace beyond its limits.
2) Don't allow dbremote to process data that has not been through a checkpoint.
As stated here and in your related older question, I still suspect a GrowDB event handler could help prevent a database halt once a critical size limit is hit. Would that not help here?
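For reference, SQL Anywhere has a system event type GrowDB that fires when the database file grows. A handler along these lines (the event name and the action taken are my own illustration, not from the thread) could alert an operator before a dbspace reaches its maximum size:

```sql
-- Illustrative sketch only: event name and action are assumptions.
CREATE EVENT check_db_growth TYPE GrowDB
HANDLER
BEGIN
    -- Warn on each growth of the database file, so action can be taken
    -- before a dbspace hits its size limit and halts the server.
    MESSAGE 'Database file grew -- check remaining dbspace headroom'
        TO CONSOLE;
END;
```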
Note: if the engine's behaviour can be improved here, that's fine, but I would not expect changes for version 16, so I don't think you should wait for that...