cancel
Showing results for 
Search instead for 
Did you mean: 

Select Distinct vs Group By

14,420

In general, will you see faster performance selecting distinct or selecting and then grouping by each column to achieve the same results?

Probably depends on the data I know, but just looking for general pointers and factors to consider one way or the other.

Accepted Solutions (1)

Accepted Solutions (1)

Former Member

I would not rewrite SELECT DISTINCT as a grouped query, for two main reasons.

In releases of SQL Anywhere older than Version 8, distinct and grouped queries had precisely the same implementation and their execution characteristics were identical. Starting with Version 8, the server constructs different DFO operators for the operations, however much of the underlying implementation remains the same (using shared code) and hence I would be surprised if there were substantial performance differences between the two.

The reasons I would refrain from rewriting SELECT DISTINCT as a grouped query are:

  1. From an application programming standpoint, SELECT DISTINCT is more straightforward and reflects the intent of the query. A grouped version may cause subsequent confusion at a later time.
  2. Perhaps more importantly, there may be differences in how complex queries involving grouping are rewritten during semantic rewrite optimization. Rewrite optimization is based largely on the discovery of patterns, and it would not be surprising to find that a complex query using SELECT DISTINCT is rewritten differently than its equivalent that uses GROUP BY.

The latter item can be substantially important with respect to how complex queries are simplified by the query optimizer, and for that reason I would retain the syntax of the intended semantics (of SELECT DISTINCT) rather than try to do anything fancier.

Answers (0)