on 2014 Mar 25 12:19 PM
i posted a question about JOINS awhile back.
i thought i understood the answer, but i'm still struggling.
results returned are not what is expected
FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num join customer c on oh.customer_id = c.customer_id join customer_control cc on oh.customer_id = cc.customer_id join salesman s on cc_salesman_code = s.salesman_code left outer join inventory_master im on oli.inventory_seq_num = im.seq_num WHERE oh.location_number = 0 and order_type <> 'fuel' and oli.mf_tax = 'n' and oli.bol_required = 'n' and oli.drum_deposit in ( 'y', 'n', 'z' ) and im.part_number not in ( '10098', '10099', '10032' ) and inventory_seq_num <> 1000099 and oh.status = '4' and payment_date >= :start_date AND payment_date < :before_date
results returned are what is expected
FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num and oh.location_number = 0 and order_type <> 'fuel' join customer c on oh.customer_id = c.customer_id join customer_control cc on oh.customer_id = cc.customer_id join salesman s on cc_salesman_code = s.salesman_code left outer join inventory_master im on oli.inventory_seq_num = im.seq_num and oli.mf_tax = 'n' and oli.bol_required = 'n' and oli.drum_deposit in ( 'y', 'n', 'z' ) and im.part_number not in ( '10098', '10099', '10032' ) and inventory_seq_num <> 1000099 WHERE invoiced_on_date >= :start_date AND invoiced_on_date < :before_date
i thought the previous answer was to JOIN the tables based on indexes or keys
and to put the qualifiers in the where clause (results returned are not what is expected)
the LEFT OUTER JOIN is not returning what i expect.
when i put the qualifiers in the LEFT OUTER JOIN, i get the results i'm expecting.
(results returned are what is expected)
i only have 6 hairs left, please hurry with an explanation before they're all gone.
See also: Outer Join Not Behaving As Expected
Caveat: The following code has not been tested.
Step 1: Recode the query so the patterns become clear...
FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num join customer c on oh.customer_id = c.customer_id join customer_control cc on oh.customer_id = cc.customer_id join salesman s on cc_salesman_code = s.salesman_code left outer join inventory_master im on oli.inventory_seq_num = im.seq_num WHERE oh.location_number = 0 and order_type <> 'fuel' and oli.mf_tax = 'n' and oli.bol_required = 'n' and oli.drum_deposit in ( 'y', 'n', 'z' ) and im.part_number not in ( '10098', '10099', '10032' ) and inventory_seq_num <> 1000099 and oh.status = '4' and payment_date >= :start_date AND payment_date < :before_date
Step 2: Recognize that predicate "im.part_number not in ( '10098', '10099', '10032' )" is in the WHERE clause, and as such is logically applied after the FROM clause has finished its work.
If the FROM clause resulted in one or more "candidate rows" that have NULL in im.part_number, those rows will be eliminated by the WHERE clause because "im.part_number not in ( '10098', '10099', '10032' )" evaluates to UNKNOWN (neither TRUE nor FALSE).
...and UNKNOWN is pretty much the same as FALSE when it comes to a WHERE clause. Specifically stated, the predicate "im.part_number not in ( '10098', '10099', '10032' ) IS TRUE" evaluates as FALSE when im.part_number is NULL, and because of that the row is eliminated.
...ah, the joys of three-value logic, the Hepatitis C of relational databases 🙂
Step 3: Move the predicate "im.part_number not in ( '10098', '10099', '10032' )" somewhere it will be evaluated before the "left outer join" operator is finished its work.
One place that sometimes works is the ON clause, but that is dangerous because sometimes it does not work properly... better to save the ON clause for "join predicates" that involve columns in both tables, because that's what a join is, an operator involving two tables.
A better place for a predicate involving one table is in a derived table expression that is logically evaluated before the "left outer join" operator is evaluated.
Thus...
FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num join customer c on oh.customer_id = c.customer_id join customer_control cc on oh.customer_id = cc.customer_id join salesman s on cc_salesman_code = s.salesman_code left outer join ( SELECT * FROM inventory_master im WHERE im.part_number not in ( '10098', '10099', '10032' ) ) im on oli.inventory_seq_num = im.seq_num WHERE oh.location_number = 0 and order_type <> 'fuel' and oli.mf_tax = 'n' and oli.bol_required = 'n' and oli.drum_deposit in ( 'y', 'n', 'z' ) /* and im.part_number not in ( '10098', '10099', '10032' ) */ and inventory_seq_num <> 1000099 and oh.status = '4' and payment_date >= :start_date AND payment_date < :before_date
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Something to consider, yes. However, I guess the effect will depend on the statistics/estimates, i.e. I hope the optimizer will try to evaluate OR'ed conditions "later" in the process...
Or you would explicitly make use of three-valued logic via
WHERE ... and im.part_number in ( '10098', '10099', '10032' ) is false ...
Unfortunately, not that easy to understand, methinks:)
Breck, thanks for the derived table answer, it workded.
Testing showed no major performance impact. The inventory table has approx 6000 rows and 100+ columns. I thought there should be a hit on how long it took for the query to run. Does ASA do something in the background to only return the column I'm joining?
Now how do i get my questions to look as pretty as yours?
> Does ASA do something in the background
Indeed it does... that's why I carefully inserted the word "logical" in my answer. What the query engine actually does bears little relationship to the way the query is actually coded, other than producing the expected end result. Anyone (other than perhaps a handful of engineers on the query engine team) who claims to be able to look at a query and predict whether it will be fast or slow is a ... wait for it ... complete fool, surpassed in foolishness only by someone who believes the claim 🙂
> not that easy to understand
Not to mention being wrong. Suppose im.part_number = '10098'. The IN will be TRUE, so the IS FALSE will be FALSE, and the row will be eliminated 🙂
The OR works because TRUE OR UNKNOWN is TRUE.
Three-value logic is an incurable disease, a chronic sickness that can be mitigated but never cured 🙂
Hm, are you sure the condition is wrong?
In my understanding, a row with that part_number should be omitted - my condition does use "in (...) is false" whereas the original condition is "is null or not in (...)". I would consider this a logically equivalent expression.
Besides that: I should possibly not post about three-valued logic around midnight, as I've done yesterday...
> are you sure the condition is wrong?
Pretty sure, although my explanation is also wrong 🙂
Let's try again...
The original predicate was "im.part_number not in ( '10098', '10099', '10032' )"
(I forgot it was NOT IN rather than IN)
Anyway...
Suppose im.part_number is NULL because of the LEFT OUTER JOIN... the obvious desire is to still include the row.
(Never mind the case where im.part_number is NULL because, well, the column contains a NULL... nobody uses NULL as actual column values, do they? 🙂
The proposed predicate is "im.part_number in ( '10098', '10099', '10032' ) is false".
The result of the IN will be UNKNOWN, which means the result of the IS FALSE will be FALSE.
If you want it to be TRUE, then maybe the predicate should be "im.part_number in ( '10098', '10099', '10032' ) is not false"... in that case, the IN is UNKNOWN, which means the result of IS NOT FALSE is TRUE.
Sadly, that ONLY makes it work when im.part_number is NULL, but you already had "im.part_number IS NULL" in the original predicate with the OR.
The IS NOT FALSE solution does NOT work when im.part_number is not NULL... it is the opposite of what you want (FALSE when im.part_number = '99999', and TRUE when it is '10098').
> not that easy to understand
Correct. Absolutely, positively, one hundred percent correct.
User | Count |
---|---|
71 | |
10 | |
8 | |
7 | |
7 | |
6 | |
6 | |
6 | |
6 | |
5 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.