on 2011 Jun 27 8:46 AM
On one test system with 12.0.1.3356 on Win32, I cannot start the database after a regular machine shutdown. When starting the database server, it issues error SQLCODE -1006 (SQLE_FILE_BAD_DB, "Unable to start specified database: '%1' is not a valid database file"). Rather surprising, as the shutdown was a regular one.
Well, of course I have a backup, and the file has been validated successfully just before the backup was done (yep, I'm aware that validating the backup is even smarter...). Unfortunately, the backup file just issues the same problem.
FWIW: DBLOG doesn't work against the database file (it issues the same error), whereas the log can be translated successfully.
Are there any hints what might be the cause for the bad file? I do not have hints for disk problems, and the SQL Anywhere 12 installation seems o.k. - at least I can start other databases...
As said, it's a test system so there's no problem with lost data here, but I'd like to get to know what could have happened.
Sorry folks, I have not been very active on the forum lately and am therefore very late with my reply.
As Volker has already indicated in a different thread, the documentation states very clearly that a database should be backed up prior to upgrading and should be restarted immediately upon a successful upgrade. What the documentation does not state very clearly is that a failed upgrade WILL ALWAYS lead to an invalid database file. The reason is quite simple. At the start of the upgrade process, the server hammers the set of bytes in the database file that indicate whether or not this database is valid. These bytes are then restored at the end of the upgrade process provided the upgrade was successful. The whole idea is that if the upgrade is unsuccessful for whatever reason, then since we cannot be certain what state the database is in with respect to the partially updated catalog tables, procedures and options, we make it impossible for the user to continue using the database. Instead, we expect that the user will be able to revert to their backup copy.
Now, with respect to your failed upgrade Volker, I have reproduced the problem that you are seeing and have opened a support case to have the problem fixed.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Well, as I pointed out in the other question, a failed upgrade should primarily make it more clear that it had failed at all...
As stated, a second attempt to upgrade that database seemed to be successful as it gave no error message, and the expected SYSHISTORY upgrade entry was added. (Yes, I did not restart the database in-between. I should.)
Karim, I'm glad the issue is reproducable and is about to be fixed. Could you supply the CR number as soon as it is available?
Run a test to check for bad RAM.
Bad RAM can mimic, and then permanently cause, disk errors, so it is a huge deal. If you suspect bad RAM, and you care at all about the contents of your hard drives, shut your computer off immediately because ANYTHING you do can result in permanent damage.
Then get/build a bootable RAM checker like http://www.memtest86.com/
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
My old laptop had bad RAM and I didn't know it. For two years I used this machine, which sometimes worked fine and other times was unbearably slow. Eventually I found the Windows Memory Diagnostics utility (mdsched.exe) which found the problem. Got a new laptop, identical to the old one but with good RAM, and it's like a brand new machine.
Mark's idea is a good path to go down: if you type: "dbinit new.db" and compare the first 4096 bytes to the original database's 4096 bytes, are there any major areas on the original page that look like they might be missing? Opening a technical support case might be another good way to go, so that we can bring the file in and take a look.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Jeff, another good starting point, thanks:)
Comparing with a new db was not that helpful, as the invalid db states 12.0.1.3324 as its version number whereas new ones are created with 12.0.1.3356. Obviously, this version is written several times in the header page. However, I noted that the header page for strongly encrypted databases seems quite similar to those of not unencrypted ones. (Note, the db in question is strongly encrypted.)
So I compared the header page of a production database and the test database, and both name the same version (though the latter has been upgraded to 12.0.1.3356 recently).
Diff'ing both headers revealed several occurences of 0x3F in the failing db where the working db has 0xAD - typically in the "Sybase Inc., Copyright" strings. Besides that, both headers seem identical, except the last 4 bytes (0xB5B3EFD0 vs. 0x8362BF7A).
Could the upgrade lead to this problem? The failing start was the first database start after I had issued
ALTER DATABASE UPGRADE PROCEDURE ON
two days before...
Page 0 is not encrypted in encrypted DBs' headers (after all we still need to determine that the file is a database and that it is encrypted) so it is not surprising that you have seen much of a difference.
If you are seeing 0x3F or 0xAD bytes within the "Sybase Inc., Copyright..." area (i.e. trailing bytes) of page 0 then I question how did you view the page? ... because this area of the page should not have any such bytes. (at least in my testing I do not see any).
So, how did you extract/view/compare the bytes within page 0 of each database file?
I used VisualStudio's binary editor to look at the headers.
However, as the database file is to big to be view directly, I used a small SQL block (what else?!) to read the file in a long varbinary variable and write the first 4096 bytes with
unload select ... format text escapes off quotes off hexadecimal off.
Tomorrow I'm gonna check whether a direct opening of smaller databases will reveal the same bytes...
fc /b file1 file2 will show you bytes that have changed. The important parts are in the area prior to the copyright strings which are just there as informational filler. Usually, "not a database file" means that the signature is wrong. What are the bytes at offset 0x14? They should be 5E BA 7A DA (or little endian for "DA7ABA5E" (DATABASE)).
Winhex is a good tool to look at files and also at harddisks directly. If you use the evaluation version you can view but not change, anyway not too expensive...
Ah, WinDiff hasn't revealed differences in those leading bytes, but there are: The corrupt database has 75 BD 9A DA at offset 0x14.
Comparing a valid production db file with the corrupt one reveals 35 different bytes in the header page. I do not know if they should be identical, as the test db is an older "fork" of the production db.
I can present a WinHex diff report, if that's helpful for anyone interested...
Would it be of any worth to try to "fix" these differences and check whether the database is running again? (I guess the answer will be "no", as the chance will be very high that other pages are somewhat invalid, too...)
Ah, so we now have the answer: The signature that you see (75 BD 9A DA) indicates that an UPGRADE of the database has failed, so the database is in a "bad" (unknown) state and cannot be used. Is it possible that an upgrade was in progress when you shutdown the database server/computer?
No, but (as stated in my first comment on this answer) I did an upgrade when I had started the database, and I remember that the first run issued a warning (which I forget to write down). A further attempt seemed to work, and so I thought everything went fine. In syshistory, the UPGRADE from 12.0.1.3324 to 3356 was noted as expected.
(FWIW, I tried to study the new SQL Remote HTTP stuff...).
Then I used the database to do some tests with OLAP and spatial data (which led to further questions here...). Finally, the server was restarted, and after that the database became invalid...
So it seems the UPGRADE failed somewhat. I csn supply the translated log for that part, if that would help...
Not exactly - it's a test database, and the automated backup event has backed up the upgraded database as well (as it was validated successfully).
But I've just taken a copy of a production database and tried to upgrade that - and it returns the same error (translated from German):
"Error in line 1 - procedure 'sa_oledb_catalogs' not found."
In the dbisqlc message pane there are several errors shown:
Error in file sybprocs.sql in line 5379 with SQLCODE -110
...
Error in file oleschem.sql in line 64 with SQLCODE -265
Now, syshistory doesn't have an UPGRADE entry and ... after the database is shutdown, it won't start.
So I have a reproducable:)
FYI, there are many Hex editors. I like frhed.
Its hard to tell why/how your database file has become corrupted without looking at the file itself. Since you are getting the "... is not a valid database file" error message it would imply that page zero has become corrupted.
You may want to read the SQL Anywhere I/O Requirements for Windows and Linux whitepaper since a misconfigured computer can lead to this type of corruption.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Well, I could give the default "but I don't have changed anything recently" response...
It's a low-cost machine that has run SQL Anywhere test systems (v8-v12) for years on the same hardware with no issues at all, with a classic HD drive used with NTFS. CHKDSK /F doesn't reveal any errors, nor does the event viewer.
The one thing I do have changed is a second HD drive that has been added a few days ago but that has not been used by the database system. I would not expect that this has influence on the flushing/write through behaviour.
Besides that, the backup has been written to the local disk and then copied to a net share - both copies do show the error.
What else might I have to check?
By trying to upgrade from 12.0.1.3324 to 12.0.1.3356 - at least in my case:(
Cf. the bunch of comments on Jeff's answer...
FWIW, I've attached the translated log from the
ALTER DATABASE UPGRADE PROCEDURE ON
statement.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
FWIW, I tested again, now with Sybase Central's Upgrade wizard. That way I could only issue an
ALTER DATABASE UPGRADE JCONNECT OFF
statement, i.e. the PROCEDURE ON clause was not available.
Interestingly enough, this upgrade has worked, the database can be restarted, SYSHISTORY claims an 3356 upgrade, and the SQL Remote HTTP stuff seems to be added, too (at least there are these sr_xxx procedures and functions).
So I'm about to ask:
Is the note in the sqlremote1201.pdf doc file (p. 109) wrong that I have to apply the upgrade with PROCEDURE ON to get the desired functionality?
User | Count |
---|---|
71 | |
10 | |
8 | |
7 | |
7 | |
6 | |
6 | |
6 | |
6 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.