User estimates - the smallest estimate to force scanning from the desired table

We have a complex query with OPTION(user_estimates='on') at the end, and two estimates in that query are set to 0 percent. It worked perfectly for a few months (although the docs say not to use 0 percent), but today the optimizer suddenly started to choose the wrong execution plan (i.e., it ignored these estimates). Then I changed these estimates to 0.00001 percent, and everything works perfectly again. So the question is: what is the smallest value for a user estimate that is safe to use for forcing the optimizer to start scanning from a specific index? How differently does the optimizer interpret 0 and 0.00001 (or even smaller) percent? Thanks.

Version: 11.0.1.2913
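
To illustrate (with made-up table and column names; the real query is much larger), the pattern looks roughly like this, with the estimate attached to a single predicate as described under "Explicit selectivity estimates" in the docs:

-- hypothetical names; 0.00001 is the value now used instead of 0
SELECT o.order_id, o.order_date
  FROM orders o
 WHERE ( o.status = 'PENDING', 0.00001 )
OPTION ( user_estimates = 'on' );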

VolkerBarth
Contributor

If you want to force a table-scan, wouldn't it be clearer to use the WITH (index-hint) clause, say something like:

... FROM MyTable WITH (NO INDEX)


EDIT: And if you want to force the use of a particular index (instead of a table scan), a corresponding "WITH (INDEX MyTable.MyIndex)" clause might help as well.
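
Roughly, with hypothetical names (the exact index-hint syntax can differ between versions, so please check the docs for yours):

-- force a table scan
SELECT * FROM MyTable WITH ( NO INDEX ) WHERE MyColumn = 42;

-- force a particular index
SELECT * FROM MyTable WITH ( INDEX ( MyIndex ) ) WHERE MyColumn = 42;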

My goal is to force the SQL Anywhere engine to start scanning from a specific index in a complex query. I know about the WITH (index-hint) clause, but it did not perform well in similar situations before, and this time is no exception: it gives a different (undesired) and unstable plan. So in such situations I prefer to use user estimates instead (but only as a last resort).

Besides, returning to my question: before changing the estimates from 0 to 0.00001, I tried getting the execution plan with the optimization_level option set to 15, and it gave me the plan I was expecting, the same one I get after changing 0 to 0.00001 percent with optimization_level set to 9 (the default).

Breck_Carter
Participant

> optimization_level option set to 15

That's a valuable tip... an application often contains a few Queries From Hell among hundreds of other queries with acceptable performance. For the slow queries, it doesn't matter how long it takes for SQL Anywhere to build the plan; what matters is getting the fastest plan possible.

Did you use OPTION ( optimization_level = 15 ) to do it?
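For example, per statement rather than as a connection-wide setting (hypothetical query):

SELECT o.order_id
  FROM orders o
 WHERE o.status = 'PENDING'
OPTION ( optimization_level = 15 );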

Breck_Carter
Participant

> 0 and 0.00001

WAG: In theory the difference is huge; zero says there will probably be no rows at all that satisfy the predicate, and if there is a simple way to prove that there are no rows without actually having to find any such row, the optimizer might use that technique. If the technique fails ("sorry, there is at least one row") then the optimizer might use a slow technique to find those rows... remember, the estimate of 0 says that's OK because you don't think there will ever be any rows.

On the other hand, 0.00001 says a tiny number of rows probably exist, and if there's an index that helps this predicate it should be used.

Once again, this is JUST A GUESS about whether the V11 optimizer has any "simple way to prove that there are no rows without actually having to find any such row" other than using the same index it uses in the 0.00001 case.

Personally, if the Help says "don't do it", and when you did do it it didn't work out well, then I'd stop doing it 🙂

I think this is the right answer; I can't seem to get the optimizer to generate a 0% estimate for anything. Even if the statistics show no rows for a range, the estimate is always a very small number (one that gets smaller as the table size grows; I've seen it as low as 1.0e-8). If you add a predicate that is never true (1=0), you still don't get a 0% estimate (instead you seem to get a full table scan with no predicate and a filter that prevents it from actually running). I imagine we just don't accept 0% estimates, for the reasons Breck mentions.


I did that only to get a plan to compare with my desired plan. As I wrote above, the plan became the same.


if the Help says "don't do it", and when you did do it it didn't work out well, then I'd stop doing it

I did stop in this case, and I will fix a few other cases even though there are no negative symptoms yet. So I need to know what to change that zero to. Please correct me if I misunderstand something: I must use a number (estimate) that is less than the value returned by a query like this one:

select 100.0 / (select count(*) from the_biggest_table_used_in_the_query)
Am I right?


It seems like we use (1 / number of rows in the table at the last checkpoint) as our estimate when we think there are no rows (based on stats), though this is based on experimentation. That doesn't necessarily guarantee an index scan, however. Is there a reason the statistics are giving such bad results? Would you be able to post both plans (with and without the estimate) along with the stats for the column? Also make sure all business rules (uniqueness, foreign key relationships) are expressed in the schema.
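
For the stats, something along these lines might help show what the optimizer currently believes about the column (assuming the ESTIMATE/ESTIMATE_SOURCE functions and the sa_get_histogram procedure are available in your version; table and column names are placeholders, and the argument order should be checked against the docs):

-- selectivity the optimizer assigns to the value, and where that number comes from
SELECT FIRST ESTIMATE( o.status, 'PENDING' )        AS estimated_selectivity,
             ESTIMATE_SOURCE( o.status, 'PENDING' ) AS estimate_source
  FROM orders o;

-- the stored histogram for the column
CALL sa_get_histogram( 'status', 'orders' );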

Breck_Carter
Participant

> select 100.0 / (select count(*) from the_biggest_table_used_in_the_query)

Each selectivity estimate specifies the success rate of an individual predicate, not the query as a whole, as described in Explicit selectivity estimates... so, the "biggest table" is really moot.

The "100.0" seems to imply you are thinking in percentages, and user estimates are probabilities <= 1 rather than percentages.
