on 2013 May 17 9:11 AM
Hello,
We are running a 3rd party application for many years that had some performance problems.
Some queries do not use the correct index.
An example :
- table A column ref1 varchar(35) null. non unique index on this column.
select * from A where ref1 = '00000000123'
uses the index.
create table B (ref1 varchar(35) null)
insert B values ( '00000000123')
select * from A, B where A.ref1 = B.ref1
does a table scan on A :
|ROOT:EMIT Operator (VA = 3)
|
| |NESTED LOOP JOIN Operator (VA = 2) (Join Type: Inner Join)
| |
| | |SCAN Operator (VA = 0)
| | | FROM TABLE
| | | B
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 4 Kbytes for data pages.
| | | With LRU Buffer Replacement Strategy for data pages.
| |
| | |SCAN Operator (VA = 1)
| | | FROM TABLE
| | | A
| | | Table Scan.
| | | Forward Scan.
| | | Positioning at start of table.
| | | Using I/O Size 8 Kbytes for data pages.
| | | With MRU Buffer Replacement Strategy for data pages.
There are no NULL values in the column, but most of the values are " ", as shown by the histogram in the optdiag result :
1 0x00000000 < 0x20
# 1 0.00000000 < " "
2 0xf8ee213f = 0x20
# 2 0.63255262 = " "
3 0x819da33c <= 0x303039313439323132373738
# 3 0.01997256 <= "009149212778"
4 0x3837ad3c <= 0x303130313032313733383231
# 4 0.02114449 <= "010102173821"
5 0x1dafa53c <= 0x303130313233313834363237
# 5 0.02022510 <= "010123184627"
.... etc
What I used to do in 12.5 and before :
- create a table X with column ref1 varchar(35),
- select non-blank values of A.ref1 into X and create an index on the column ref1
- run optdiag on A and X
- replace the statistics of A.ref1 by those of X.ref1 in the optdiag file of A
- reload the optdiag file into the database.
I did this for about 10 columns in the database.
I had hope that the optimizer in 15.7 would better understand these queries, but it doesn't.
Now I wonder if there isn't a more elegant way to make these queries choosing the right index.
Any ideas ?
Thanks,
Luc.
Hello Simon and Mark,
Thanks for your suggestion. sp_modifystats seems to help and solves the problem.
Thanks,
Luc.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hi Luc,
It does sound like a typical data skew problem.
You have 63% of the column = " ", this skews the total density so we reckon that for a single row from the left a <large> number of rows will be returned from the inner table.
You could argue that the positioned lookup might still be quicker, but my guess would be the data row\page cluster ratio for the index is fairly low. Not a fragmentation issue though, as the drcr for this index is a measure of how correlated the index is to the data with respect to access via that index. I.e. for each row it needs to fetch is it likely the next row resides on the same page etc. You are doing a select * so it needs data level access, assuming your small example is representative.
On 125x you should have been able to work around the data skew by running this:-
sp_modifystats 'A', 'ref1', 'REMOVE_SKEW_FROM_DENSITY'
This will replace the total density with the range cell density (which will be much, much more selective).
It will be lost when you next run update stats though so you would need to add it to the stats scripts.
On 15.7 we have the ability to perform a histogram merge between the attributes on either side of the join, we can only do this when we have histograms in place though. Without them it should fall back on the total density so the above sp_modifystats should do the trick.
The other alternative is to generate a histogram for that single row in B so we can then generate a join histogram - update statistics B (ref1)
HTH,
Simon
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
66 | |
10 | |
10 | |
10 | |
10 | |
8 | |
6 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.