on 2013 Jul 11 11:00 AM
In our product, we run a database on SQL Anywhere 12.0.1.3895. This database receives changes from another database every 30 seconds through replication with SQL Remote, so the data is changing more or less permanently. Another process deletes data that is detected to be ‘old’, so the database is not growing in size. The database under test is about 20 GB.
In our test environment we continuously execute various queries against the database. After several days we found that the queries get slower and slower: the same queries that took a few seconds at the beginning take several minutes or even hours!
After we perform a CREATE STATISTICS command on all our tables, performance is back again: a few seconds for most of those queries. After several days the same behavior occurs again.
Even a DROP STATISTICS command on all our tables brings the performance back.
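For reference, a blanket CREATE STATISTICS pass over all tables can be scripted roughly like this minimal Watcom SQL sketch (not our exact script; filtering SYS.SYSTABLE on base tables owned by the current user is an assumption about the schema):

    BEGIN
        -- Rebuild statistics for every base table owned by the current user.
        FOR tabs AS tab_curs CURSOR FOR
            SELECT table_name AS tname
              FROM SYS.SYSTABLE
             WHERE table_type = 'BASE'
               AND creator = USER_ID()
        DO
            EXECUTE IMMEDIATE 'CREATE STATISTICS "' || tname || '"';
        END FOR;
    END;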
What’s wrong? We assumed that the statistics are maintained by the engine itself, since self-healing statistics management is described as a new feature of SQL Anywhere version 12 (http://download.sybase.com/presentation/TW2011/DATSQL1.pdf).
How can we keep the performance up in the long term? Are there any special settings we need to configure? There were similar threads in this forum, but without a final solution:
http://sqlanywhere-forum.sap.com/questions/13709/low-performance-with-wrong-statistics and
http://sqlanywhere-forum.sap.com/questions/14029/query-plan-changes-radically
We have saved a database with such corrupt statistics (slow queries), so we can provide query plans, dumps of SYSCOLSTATS (taken after calling sa_flush_statistics), or histograms (taken with sa_get_histogram), before and after creating/dropping the statistics.
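This is roughly how such snapshots can be taken (a minimal sketch; the table and column names 'orders' and 'order_date' are placeholders):

    -- Write the in-memory statistics out to the catalog first.
    CALL sa_flush_statistics();

    -- Dump the stored column statistics.
    SELECT * FROM SYS.SYSCOLSTATS;

    -- Readable histogram for one column (placeholder names).
    CALL sa_get_histogram( 'order_date', 'orders' );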
But the main question is: How do the statistics become corrupt?
Thanks for any ideas.
Have you thought about fiddling around with the following options:
collect_statistics_on_dml_updates option: Controls the gathering of statistics during the execution of data-altering DML statements such as INSERT, DELETE, and UPDATE.
update_statistics option: Controls whether connections can send query feedback to the statistics governor.
According to the docs, both options are ON by default and should only be set to OFF in particular cases. Please note that, personally, I'm not at all pretending to understand this documented difference:
The difference between the collect_statistics_on_dml_updates option and the update_statistics option is that the update_statistics option compares the actual number of rows that satisfy a predicate with the number of rows that are estimated to satisfy the predicate, and then updates the estimates. The collect_statistics_on_dml_updates option modifies the column statistics based on the values of the specific rows that are inserted, updated, or deleted.
However, these options might be a chance to prevent the system from undesired statistics updates, and might be helpful if you find out that your SQL Remote generated "DML flood" (which will consist of many, many single-row DML operations, AFAIK) is an inappropriate basis for collecting stats.
AFAIK, both options can be set for individual users and/or connections, to allow further fine-tuning, for example as sketched below.
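A minimal sketch of the different scopes (the user name "remote_user" is just a placeholder):

    -- Database-wide default:
    SET OPTION PUBLIC.collect_statistics_on_dml_updates = 'Off';

    -- For one particular user (placeholder name):
    SET OPTION "remote_user".collect_statistics_on_dml_updates = 'Off';

    -- For the current connection only:
    SET TEMPORARY OPTION update_statistics = 'Off';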
Note: I'm much more wild-guessing than Breck claims to be...