cancel
Showing results for 
Search instead for 
Did you mean: 

Interesting performance when index is turned on

Breck_Carter
Participant
1,306

(The following conversation was posted on the NNTP server in 2008, and is worth reposting here... the original question and two replies have been rearranged in chronological order for ease of reading.)

 
Date: Fri Oct 17 16:26:58 2008
Subject: Re: Interesting performance when index is turned on

> Julian B. wrote:
>> Hi all,
>>
>> I'm evaluating. Though I need an embedded DB, I'm currently
>> running some benchmarks on a server (not embedded in my
>> app).
>>
>> QUESTION:
>> Anyway, when I index over an attribute results are showing
>> that the writing of the data is actually FASTER than when I
>> don't index. Equally interesting, reading the data is
>> SLOWER when I've got an index and FASTER when there's no
>> index. How can this be?
>>
>> The only thing I can think of is if SQL Anywhere's native
>> indexing scheme doesn't help me on ranged selects for
>> timestamps (i've tried both clustered and non clustered).
>>
>> Does anyone have any insight on the situation?
>>
>> SETUP:
>> I write 10M records and then read 1000 of the most recent
>> ones. I do this with indexing off, then I drop the table
>> and then recreate the table and create the index and do the
>> experiment over. Again, reads are faster w/o indexing and
>> writes are faster with indexing - both are not what I would
>> expect.
>>
>> MY QUERIES:
>> table creation: "CREATE table TABLE_NAME(aaa SMALLINT, bbb
>> BIGINT, ccc SMALLINT, timeStamp DOUBLE, eee CHAR(15));"
>>
>> index creation: "CREATE CLUSTERED INDEX IDX_TIME on
>> CAN_MESSAGES (timeStamp ASC);"
>>
>> selects: "SELECT ALL TOP 1000 timeStamp from TABLE_NAME" +
>> " WHERE aaa= " + aaa +
>> " AND bbb= " + bbb +
>> " AND timeStamp >= " + (writeEnd -
>> time_range) +
>> " AND timeStamp <= " + writeEnd + " ORDER BY
>> timeStamp DESC"
>>
>> Any help would be greatly appreciated.
>>
>> Thanks,
>> Julian

"Glenn Paulley [Sybase iAnywhere]" <paulley@ianywhere.com> wrote in message 
news:48f893d6$1@forums-1-dub...
> 1) Why are you storing timestamps as DOUBLEs?
>
> 2) Did you reorganize the table after creating the clustered index?
>
> 3) Does the access plan for the SELECT query actually utilize the 
> clustered index after it is created?
>
> 4) Does the table have a primary key to optimize writes to the transaction 
> log?
>
> 5) Are you testing systematically with both a cold and hot buffer pool?
>
> Glenn
>
> Glenn Paulley
> Director, Engineering (Query Processing)
> Sybase iAnywhere

Also,

In addition to Glenn's excellent points, I have some comments.

Be sure when comparing inserts that the database file is in a similar state. 
The simplest way is to copy the dataase before the insert and restart from a 
clean copy for each insert. You should ensure that the database file is 
pre-grown (alter database add) to hold the newly added rows. The instest 
tool can be used to test insert performance.

How selective are your predicates, and how much of the selectivity is 
provided by the timestamp column? You have a TOP 1000 as well as the search 
conditions. How many rows are returned without the TOP 1000?

If a significant amount of the selectivity is provided by the columns aaa 
and bbb, you may prefer an index on (aaa,bbb,timeStamp); however, that might 
affect your clustering decision.

The special "sort top N" operator is used for top values of 1000 or lower. 
If SA uses a sequential scan, it can satisfy the ORDER BY clause by 
retaining the 1000 smallest values using a priority queue structure. If the 
TOP value is higher than 1000, a sequential scan would require a full sort 
before applying the rowlimit.

You may wish to consider Optimization_goal to choose whether the optimizer 
biases towards returning the first row quickly or all rows quickly.

It is rather expensive to scan an index in reverse compared to going 
forward; you may wish to create an index that is ordered DESC on timeStamp. 
There may be an issue that prevents the optimizer from costing an ordered 
scan of such an index properly (depending on your server version). During 
benchmarking you may wish to force the index to be used to compare execution 
plans.

SELECT ...
FROM CAN_MESSAGES with(index(IDX_TIME ))

-- 
Ivan T. Bowman
SQL Anywhere Research and Development
justin_willey
Participant
0 Kudos

This is a good point to note: It is rather expensive to scan an index in reverse compared to going forward....

Accepted Solutions (0)

Answers (0)