on 2014 Mar 06 4:36 AM
I know that in the past the recommendation for SQLA was, that backups should be created using the internal mechanisms. Anyway I am wondering if with the arrival of new storage technologies this rule might have changed.
So my question is, would it be possible and if yes, what disadvantages would I encounter if the underlying storage system will perform snapshot copies of the database every now and then. Will these snapshots be usable and if yes, how much data might I loose.
Request clarification before answering.
Here is my understanding of things in a nutshell:
If the underlying OS/file system supports a 'point-in-time' type of snapshot of the filesystem (ie. they guarantee that the snapshot backup contains the copies of all files as of the exact same point in time), then they should work fine with SQL Anywhere. The only caveat is that if you recover from the snapshot, your require both the database and the log file because the database file will almost certainly have to go through recovery when you start it up.
For example, the Windows VSS service will work with SQLA even if the SQLA VSS service is not running. The SQLA VSS service attempts to provide a 'clean' copy of the database file for the snapshot, so that the database file doesn't need to go through recovery when it is restarted from the backup. If the SQLA VSS service isn't running, or takes to long to clean up, it simply means that if you recover from the snapshot, you will need both the database and the log file and the database will have to go through recovery before it comes back online.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Does a backup technique that relies on snapshots requires a full backup (matching backup copies of the database and log) every time? I.e. a snapshot of just the log is useless.
Is a full database-and-log backup created via snapshot incompatible for recovery purposes with subsequent log-only backups taken via dbbackup?
Are snapshot backups just plain incompatible with dbbackup backups? I.e., if you're in for a penny you're in for a pound.
If the SQLA VSS service is not running, or it was unable to provide a "clean" copy, when a VSS snapshot backup was taken, is it possible that lots and lots and lots of data will be lost because (a) the engine hadn't finished writing data to the log for recent COMMIT operations and (b) the engine hadn't done (or finished) a CHECKPOINT?
It's one thing to say "recovery does an pretty good job of not losing data when rolling forward after a crash", it's a question of "how trustworthy is a snapshot backup at actually containing a snapshot?"... the snapshot taken by a dbbackup is flawless as a snapshot, is the same true of VSS?
To put it another way: A backup is not the same as a crash. One expects a certain amount of data loss after a crash, but one does not expect the same to be true of a backup.
I would say that the best practice would be to get both the db and log every time you take a snapshot.
dbbackup and snapshot backups are not compatible.
If a transaction was successfully committed (ie. the statement completed and returned to the client), then the committed data is there in the database file and/or the log file, regardless of anything that happens after that. The only thing that would (should) cause data loss after a successful commit is a disk failure.
If a commit has not yet completed when the snapshot occurs, and you need to recover from that snapshot, then you will lose that uncommitted transaction, the same as if the server had crashed, or if you had taken a backup of the log file at that point in time.
I agree, a backup is not the same as taking a snapshot copy of the database and log. It is, however, better than not taking a backup at all.
Thanks!
For the record, here is worthwhile article on the subject: The Basics of the Volume Shadow Copy Service (VSS)
That would be a very valuable statement to add to the VSS doc page, methinks...
I would also agree that it is best practice to get both the db and log every time; however, backups of all kinds are compatible. The only thing a database cares about for recovery (or manually applying logs) is essentially that the log must contain the log offset as of the previous checkpoint of the database file.
Note that taking a snapshot copy of a database & log is semantically the same as using the backup utility except as follows:
a snapshot backup creates backup images of all of the dbspaces and the transaction as of a single instant in time (ie, the start of the backup).
the backup utility essentially creates a backup image up the dbspaces as of the time the backup started plus a backup image of the transaction log as of the time the backup ended.
Suppose, for example, backing up an enormous database takes 1hr for either method and you start it at 1am. With snapshot, the backup contains all transactions committed as of 1am. With the backup utility, the backup completed at 2am will have all transactions committed as of 2am (ie, the copy of the log in the backup will contain any operations applied between 1am and 2am).
In most cases, that difference doesn't matter -- it is usually just desirable to have one backup per day (for example) plus any logs created afterwards.
There is no way to truncate/rename the transaction log when using VSS. You might consider backing up just the database file with VSS, then rename the transaction log then backup the renamed log.
@John: So Jason's statement "dbbackup and snapshot backups are not compatible." is not true?
Honestly, I share Breck's point of view that the usability of the backup files (and the possible pitfalls) should be as clear as possible, including the documentation, and that's still not the case in that respect, obviously...
"The only thing a database cares about for recovery (or manually applying logs) is essentially that the log must contain the log offset as of the previous checkpoint of the database file."
That's a powerful, interesting and possibly valuable statement! I use the word "possibly" because you used the word "essentially"... words like those allow for an infinite variety of failure 🙂
At the moment, I trust dbbackup with my life. Can I trust VSS?
"You might consider backing up just the database file with VSS, then rename the transaction log then backup the renamed log."
I'm having a hard time getting my head around that statement. Cynical Breck says "I consider a lot of things, like hang gliding", then Paranoid Breck asks "If something goes wrong, will I be able to recover?" 🙂
Seriously, folks are sensitive about the safety of backup and recovery, especially recovery, so please excuse our nit-picking.
Now I guess us customers would really like need a clear point of view from you, dear company - you are invited to form a statement about the compatibility of normal and snapshot backups internally and publish it here (or elsewhere).
I do understand that there are differences between what should work technically and what ought to be recommended to work generally. However, I think that a backup method should be as reliable as possible, and I'm ready to follow the official guidelines. - If, by chance, I would not follow them, and would then get to know (in a recovery situation) that two backup files are still usable for recovery although that is not officially "granted", well, that's a nice-to-have... Nevertheless, we would need a method that is designed to work generally.
This should be a comment because that's where rhetorical questions go, but I wanted to share a video...
Let's talk about a specific "snapshot" implementation: Windows Shadow Copy, also known as Volume Snapshot Service (VSS), and in particular, that version of VSS delivered with all editions of Windows 7: It allows "any file to be retrieved as it existed at the time any of the snapshots was made".
Presumably, that means snapshots of the SQL Anywhere database.db and database.log are made at the same instant in time, and (again presumably) the snapshot is perfect as far as the bytes on disk that Windows knows about are concerned.
The question arises: From a freshly-started SQL Anywhere engine's perspective, during restore/recovery, are "the bytes on disk that Windows knew about" sufficient to launch a new executing instance of the database?
What about all those "dirty" and "cleanable" pages that were in the RAM cache and nowhere else, at the time the snapshot was taken? (with the difference between "dirty" and "cleanable" discussed recently).
FWIW SQL Anywhere dbbackup.exe takes a checkpoint before the backup starts. Plus, if the database is large enough, even snapshot copies of both files at the same time, every time, might be too slow, and multiple incremental log backups in between full copies of the database.db might be necessary... how does that fit with shadow copy behavior?
That's as far as I can go down the rabbit hole, I had to give back the red key :)...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
... and then there's this: SQL Anywhere Volume Shadow Copy Service (VSS)
To cite the docs:
VSS issues a freeze command to checkpoint and then suspend all activity on all databases on all database servers. Each SQL Anywhere database server waits a maximum of 60 seconds for all databases to suspend all activity. Typically, this process takes a few seconds.
That would make me think in case the checkpoint(s) can be executed fast enough, the copied files should be as good as a normal backup for recovery (however without the normal "goodies" of already applying the checkpoint log and the like). But that's just my understanding, surely not a reliable source here:)
Reading that doc page is like reading a HIPAA legal agreement... when you're done, you're tired, you're head is buzzing, and you feel less-well-informed than when you began.... which means you STILL have no idea what you're getting into 🙂
User | Count |
---|---|
76 | |
30 | |
8 | |
8 | |
7 | |
7 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.