on 2011 May 30 8:14 AM
It's monday morning, and I'm somewhat puzzled with a LEFT JOIN with a combined join condition...
Say, I would use the SQL Anywhere 12 demo database to list all customer pairs with identical postal code, and would also list those customers who do not have a "paired" customer. (Yes, this is a very constructed sample but I'm trying to do something similar but more useful in my real database...):
So, as a first step I would query the pairs (here additionally restricting C1.ID to limit the result set):
select C1.PostalCode, C1.ID, C1.Surname, C2.ID, C2.Surname from Customers C1 inner join Customers C2 on C1.PostalCode = C2.PostalCode and C1.ID < C2.ID and C1.ID < 130 order by C1.ID, C2.ID
This lists
PostalCode,ID,Surname,ID,Surname 46032,105,McCarthy,121,Elkins 27695-7209,107,Colburn,118,Sanford R3C 3V6,123,Lin,143,Piper
Now, as second step, I simply turn the INNER JOIN into a LEFT JOIN to get those customers who don't have a "pair":
select C1.PostalCode, C1.ID, C1.Surname, C2.ID, C2.Surname from Customers C1 left outer join Customers C2 on C1.PostalCode = C2.PostalCode and C1.ID < C2.ID and C1.ID < 130 order by C1.ID, C2.ID
This lists
PostalCode,ID,Surname,ID,Surname 07070,101,Devlin,, 10154,102,Reiser,, 19301,103,Niedringhaus,, 37919,104,Mason,, 46032,105,McCarthy,121,Elkins 64579,106,Phillips,, 27695-7209,107,Colburn,118,Sanford 37421,108,Goforth,, ... R3C 3V6,123,Lin,143,Piper ... 53141,130,Monella,, 11716,131,Sinnot,, ... 11700,665,Thompson,,
So this does add customers who do not have a pair w.r.t. postal code, but it also adds those with ID >= 130, although the join condition should prevent that in my understanding.
I have tested both with v12.0.1.3324 and 8.0.3 with identical results.
So it seems in order to filter on C1.ID, I have to put the condition C1.ID < 130 in the WHERE clause:
select C1.PostalCode, C1.ID, C1.Surname, C2.ID, C2.Surname from Customers C1 left outer join Customers C2 on C1.PostalCode = C2.PostalCode and C1.ID < C2.ID where C1.ID < 130 -- re-organized filter on C1 order by C1.ID, C2.ID
However, in order to filter on C2.ID, I still have to put the according condition C2.ID < 130 in the ON clause in order to retain the OUTER JOIN semantics (placing that in the WHERE clause is a common error that turns the join into an INNER JOIN automatically...):
select C1.PostalCode, C1.ID, C1.Surname, C2.ID, C2.Surname from Customers C1 left outer join Customers C2 on C1.PostalCode = C2.PostalCode and C1.ID < C2.ID and C2.ID < 130 order by C1.ID, C2.ID
Question:
Is this asymmetrical behaviour between further conditions for the preserved and the null-supplying side by design?
Well, after asking my own question the answer seems straight-forward (possibly because it's noon in the meantime...):
Yes, there's this asymmetrical behaviour: When doing a OUTER JOIN with C1 as preserved table, you're asking
So, if you want to filter C1 generally, you will have to apply that filter in the WHERE clause.
You may or may have to apply the filter in the ON condition, too, in case it has influence on whether C2 has matching rows or not. In the samples above, that's unnecessary.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
[This was originally a comment to Breck's other answer.]
What do you mean exactly, "asymmetrical behavior"? IMO you should only put join conditions involving both tables in the ON clause, never selection conditions involving only one table... put those in a derived table if you want them evaluated before the join, in the outer WHERE if you want them evaluated after the join.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Breck, I guess you have given a very useful rule of thumbs with the recommendation to use derived tables or WHERE clauses for the preserved table. I'm gonna try to remember this...
The reason for my original wrong expectation has to do with the particular left join being a self-join and the impression that I wanted to join particular "types of rows" from instance A with their "partners" from instance B - or without them if they had no "partner". As it is a self-join, I felt (and still feel) that it seems somewhat asymmetrical to put the "type filter" for A in a WHERE clause (or derived table) and the type filter for B in the JOIN clause.
But I do understand that this is the way left joins work:)
select C1.PostalCode, C1.ID, C1.Surname, C2.ID, C2.Surname from ( select * from Customers where Customers.ID < 130 ) C1 left outer join Customers C2 on C1.PostalCode = C2.PostalCode and C1.ID < C2.ID order by C1.ID, C2.ID
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
User | Count |
---|---|
71 | |
10 | |
8 | |
7 | |
7 | |
6 | |
6 | |
6 | |
6 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.