on 2012 Jan 27 3:20 PM
The help gives some examples of what constitutes an inappropriate use of an aggregate function :
Probable cause
You attempted to use an aggregate function inappropriately. For example, a SELECT statement cannot contain an aggregate function within a predicate in the WHERE clause unless the aggregate function constitutes an outer reference. Other invalid uses include specifying an aggregate function in the SET clause of an UPDATE statement, or using an aggregate function in a CHECK constraint.
I can certainly think of situations where you would get undesirable results, but I'm hoping someone could provide a better explanation as to the "Why" these uses are inappropriate.
The most straightforward example is to use an aggregate function in a WHERE clause, ie
SELECT FROM
WHERE COUNT() > 5 GROUP BY <table.column>In SQL, a SELECT statement's semantics are (roughly) as follows:
I said "roughly" because grouping, semantically-wise, is actually a fairly complicated thing to describe completely and precisely, particularly due to the difference in results when the query has a GROUP BY clause, and when it does not. Chris Date, in his book An Introduction to Database Systems, has a number of really lovely examples where he takes those semantics to task, but SQL is what it is...
Anyway - from those semantics you can see that including COUNT() in a WHERE clause is semantically invalid - the value of COUNT(*) is computed during the grouping step, so trying to apply the predicate beforehand during the selection (filtering) process is incorrect. Instead, one should move the predicate to a HAVING clause, so that the condition is tested after the groups are formed:
SELECT FROM
When the language includes WINDOW functions (and SQL Anywhere does support WINDOW queries) then the semantics are even more difficult to define and explain, but they do make sense - if you don't try to think about it too hard - even though I've yet to see a formal definition (in the mathematical sense) of WINDOW.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
UPDATE t SET SUM ( c ) = 0;
...how would you like that done? Set c to zero in every row? Set alternating rows to random positive/negative numbers so the sum remains zero?
CREATE TABLE t ( c INTEGER CHECK ( AVG ( c ) = 0 ) ...
...what is the aggregation over which the AVG is calculated? Is it recalculated and rechecked with each row insertion/update/deletion?
It is likely that each case where that error message appears is somewhat different. Some may fall into the category "Engineering hasn't figured out how to support this usage" but most are probably "nonsensical impossible to calculate".
Perhaps you have run into a situation where the message doesn't seem right; if so, please share it with us.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
I run into the situation infrequently and most of the time the message seems right and I understand the logic, but in some situations I think it would still work.
One example: I have a one to many relationship between two tables, such as a bill and its line items.
I might try to write a procedure that totals the line items for a field on the bill with a statement that says:
update Bill left outer join BillItems on BillItems.BillID = Bill.BillID set Bill.Total = sum(BillItems.Total) where BillID = X
The way I read the above, it would update the Bill.Total with the total of the list from the join, doing it each time for each item in the list, but I would not care because it would be the same total and I would not need a second statement to pull the total into a variable and then update.
There is certainly the argument that two statements are better than N updates without knowing how many are in the list, but it seems like the above is writing what I want to happen.
Probably more of an exception that would behave correctly but I thought I would get some more information.
The way that statement reads, it is understandable what the goal may be.
As far as why it doesn't work (is prohibited)...
First, consider what would happen if you executed...
SELECT SUM(BillItems.Total) FROM Bill LEFT OUTER JOIN BillItems ON Bill.BillID = BillItems.BillID
...you'd get the total of all BillItems including those not on the current bill. Without a GROUP BY, the aggregate will hit all rows. There's no easily defined way to specify GROUP BY and integrate it into the UPDATE statement, so you need to get creative:
UPDATE Bill b LEFT OUTER JOIN (SELECT BillID, SUM(Total) FROM BillItems GROUP BY BillID) bi ON b.BillID = bi.BillID SET b.Total = bi.Total WHERE b.BillID = X
An easier (though possibly less performant) way would be to use a correlated subquery, such as (not tested!):
UPDATE Bill b SET Total = (SELECT SUM(Total) FROM BillItems bi WHERE bi.BillID = b.BillID) WHERE BillID = X
User | Count |
---|---|
57 | |
10 | |
7 | |
7 | |
6 | |
6 | |
5 | |
5 | |
5 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.