on 2011 Apr 28 1:18 AM
We have a composite index of three columns (date, time, pk) where pk is the primary key.
The pk column has its own index so why include pk in the composite index?
Request clarification before answering.
Including PK in the index is unlikely to be of much use. To be used as a matching predicate to facilitate indexed retrieval, your query must contain a sargable predicate against the PK column. If the predicate is equality, eg PK = <value>, then the PK index alone will serve you well. If an inequality, eg PK > <value> or a range, eg. PK BETWEEN <value> AND <value> then the index may be useful, but of course only if the WHERE clause also contains strict equality conditions with the date and time values. In my experience, the latter (strict equality on date/time) is fairly rare, and the former (range comparisons on PK values) is also rare, particularly if the PK values are surrogate values.
I would expect, in the main, for applications to reference rows using PK equality conditions, or ranges of date/time values, neither of which would be helped by this composite index; an index on the date/times alone would be sufficient, in addition to the PK index that is already extant.
Two further points:
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Probably, but because the date/time is two separate columns (rather than a TIMESTAMP) only the date column will be used if the query contains an inequality condition that references the date field.
Glenn, thank you. Including the pk seems redundant considering your comments and the fact that we don't narrow queries further by pk if we're querying by (equality or range of) date and time. In our table other columns now seem much better candidates, particularly when considering index-only retrieval.
Any disadvantages to leaving pk in? Does it affect physical size or layout of index?
The index consultant was silent on the queries I tried.
Disadvantages? Sure - increased index key size means fewer entries per page, and the reduced fanout may require the index to be an extra level, which would mean a (potential) additional extra I/O for every index lookup.
The index would also have to be maintained if the PK value changes, though this is unlikely since it is rarely if ever good practice to change primary keys.
As for the index consultant - the index consultant does not make recommendations about PK or FK indexes. Were any indexes listed as "unused" after you ran your workload?
Well, I would think that particular index might make sense if the following sentence (taken from the doc page you refered to) is true:
A composite index is useful if the first column alone does not provide high selectivity.
In your case, it would make sense if there are lots of entries with identical date and time values, and you would frequently select those and have to filter (or order) them further based on the pk. But then the index order would be important, too - i.e. if you sort by date/time chronologically and would use the pk in ascending order, too, the ASC/DESC order in the index columns should be set accordingly.
However, I would expect that an index without the pk column would be nearly as good for that particular situation, as the engine could additionally use the default pk index.
In case you don't use that particular filtering/sorting, I would recommend an index without the pk column (and would usually recommend to use a datetime/timestamp field instead of separate date/time columns).
As always: Some according tests with typical workloads (including looking at the plans) should show the advantages/disadvantages of the different index implementations. You might also use the Index Consultant.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
76 | |
30 | |
10 | |
8 | |
8 | |
7 | |
7 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.