cancel
Showing results for 
Search instead for 
Did you mean: 

still struggling with outer JOINS

t1950
Participant
2,145

i posted a question about JOINS awhile back.
i thought i understood the answer, but i'm still struggling.

results returned are not what is expected

  
FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num  
    join customer c on oh.customer_id = c.customer_id  
    join customer_control cc on oh.customer_id = cc.customer_id  
    join salesman s on cc_salesman_code = s.salesman_code  
    left outer join inventory_master im on oli.inventory_seq_num = im.seq_num   
WHERE oh.location_number = 0 and order_type <> 'fuel'  
    and oli.mf_tax = 'n' and oli.bol_required = 'n'  
    and oli.drum_deposit in ( 'y', 'n', 'z' )  
    and im.part_number not in ( '10098', '10099', '10032' )  
    and inventory_seq_num <> 1000099  
    and oh.status = '4'  
    and payment_date >= :start_date AND payment_date < :before_date  

results returned are what is expected

FROM order_header oh join order_line_items oli on oh.order_seq_num = oli.order_seq_num  
            and oh.location_number = 0 and order_type <> 'fuel'  
    join customer c on oh.customer_id = c.customer_id  
    join customer_control cc on oh.customer_id = cc.customer_id  
    join salesman s on cc_salesman_code = s.salesman_code  
    left outer join inventory_master im on oli.inventory_seq_num = im.seq_num  
            and oli.mf_tax = 'n' and oli.bol_required = 'n'  
            and oli.drum_deposit in ( 'y', 'n', 'z' )  
            and im.part_number not in ( '10098', '10099', '10032' )  
            and inventory_seq_num <> 1000099  
WHERE invoiced_on_date >= :start_date AND invoiced_on_date < :before_date 

i thought the previous answer was to JOIN the tables based on indexes or keys
and to put the qualifiers in the where clause (results returned are not what is expected)
the LEFT OUTER JOIN is not returning what i expect.

when i put the qualifiers in the LEFT OUTER JOIN, i get the results i'm expecting.
(results returned are what is expected)

i only have 6 hairs left, please hurry with an explanation before they're all gone.

Accepted Solutions (1)

Accepted Solutions (1)

Breck_Carter
Participant

See also: Outer Join Not Behaving As Expected

Caveat: The following code has not been tested.

Step 1: Recode the query so the patterns become clear...

FROM order_header oh 
        join            order_line_items oli on oh.order_seq_num = oli.order_seq_num
        join            customer         c   on oh.customer_id = c.customer_id
        join            customer_control cc  on oh.customer_id = cc.customer_id
        join            salesman         s   on cc_salesman_code = s.salesman_code
        left outer join inventory_master im  on oli.inventory_seq_num = im.seq_num 
 WHERE oh.location_number = 0 
   and order_type <> 'fuel'
   and oli.mf_tax = 'n' and oli.bol_required = 'n'
   and oli.drum_deposit in ( 'y', 'n', 'z' )
   and im.part_number not in ( '10098', '10099', '10032' )
   and inventory_seq_num <> 1000099
   and oh.status = '4'
   and payment_date >= :start_date 
   AND payment_date < :before_date

Step 2: Recognize that predicate "im.part_number not in ( '10098', '10099', '10032' )" is in the WHERE clause, and as such is logically applied after the FROM clause has finished its work.

If the FROM clause resulted in one or more "candidate rows" that have NULL in im.part_number, those rows will be eliminated by the WHERE clause because "im.part_number not in ( '10098', '10099', '10032' )" evaluates to UNKNOWN (neither TRUE nor FALSE).

...and UNKNOWN is pretty much the same as FALSE when it comes to a WHERE clause. Specifically stated, the predicate "im.part_number not in ( '10098', '10099', '10032' ) IS TRUE" evaluates as FALSE when im.part_number is NULL, and because of that the row is eliminated.

...ah, the joys of three-value logic, the Hepatitis C of relational databases 🙂

Step 3: Move the predicate "im.part_number not in ( '10098', '10099', '10032' )" somewhere it will be evaluated before the "left outer join" operator is finished its work.

One place that sometimes works is the ON clause, but that is dangerous because sometimes it does not work properly... better to save the ON clause for "join predicates" that involve columns in both tables, because that's what a join is, an operator involving two tables.

A better place for a predicate involving one table is in a derived table expression that is logically evaluated before the "left outer join" operator is evaluated.

Thus...

FROM order_header oh 
        join            order_line_items oli on oh.order_seq_num = oli.order_seq_num
        join            customer         c   on oh.customer_id = c.customer_id
        join            customer_control cc  on oh.customer_id = cc.customer_id
        join            salesman         s   on cc_salesman_code = s.salesman_code
        left outer join ( 
                          SELECT *
                            FROM inventory_master im  
                           WHERE im.part_number not in ( '10098', '10099', '10032' )
                        ) im on oli.inventory_seq_num = im.seq_num 
 WHERE oh.location_number = 0 
   and order_type <> 'fuel'
   and oli.mf_tax = 'n' and oli.bol_required = 'n'
   and oli.drum_deposit in ( 'y', 'n', 'z' )
/*   and im.part_number not in ( '10098', '10099', '10032' )   */
   and inventory_seq_num <> 1000099
   and oh.status = '4'
   and payment_date >= :start_date 
   AND payment_date < :before_date
VolkerBarth
Contributor
0 Kudos

Another method to prevent a WHERE condition from silently turning a LEFT JOIN into an INNER JOIN is to allow NULL in the condition on the NULL-supplying side, something like

WHERE ...
  and (im.part_number is null or im.part_number not in ( '10098', '10099', '10032' ))
  ...
Breck_Carter
Participant
0 Kudos

Yes, indeed... but my fingers shake uncontrollably when I try to type O... O... the Evil Twin operator, the slowwww operator, the one that isn't AND 🙂

VolkerBarth
Contributor
0 Kudos

Something to consider, yes. However, I guess the effect will depend on the statistics/estimates, i.e. I hope the optimizer will try to evaluate OR'ed conditions "later" in the process...

Or you would explicitly make use of three-valued logic via

WHERE ...
  and im.part_number in ( '10098', '10099', '10032' ) is false
  ...

Unfortunately, not that easy to understand, methinks:)

t1950
Participant
0 Kudos

Breck, thanks for the derived table answer, it workded.

Testing showed no major performance impact. The inventory table has approx 6000 rows and 100+ columns. I thought there should be a hit on how long it took for the query to run. Does ASA do something in the background to only return the column I'm joining?

Now how do i get my questions to look as pretty as yours?

Breck_Carter
Participant
0 Kudos

> Does ASA do something in the background

Indeed it does... that's why I carefully inserted the word "logical" in my answer. What the query engine actually does bears little relationship to the way the query is actually coded, other than producing the expected end result. Anyone (other than perhaps a handful of engineers on the query engine team) who claims to be able to look at a query and predict whether it will be fast or slow is a ... wait for it ... complete fool, surpassed in foolishness only by someone who believes the claim 🙂

Breck_Carter
Participant
0 Kudos

> not that easy to understand

Not to mention being wrong. Suppose im.part_number = '10098'. The IN will be TRUE, so the IS FALSE will be FALSE, and the row will be eliminated 🙂

The OR works because TRUE OR UNKNOWN is TRUE.

Three-value logic is an incurable disease, a chronic sickness that can be mitigated but never cured 🙂

Breck_Carter
Participant
0 Kudos

> how do i get my questions to look as pretty

"Pretty" is a stretch 🙂

Old-school HTML, perhaps. A blank line then a PRE tag before code, a /PRE tag followed by a blank line after the code, works wonders... look at your question again in EDIT mode to see what I did to it.

VolkerBarth
Contributor
0 Kudos

Hm, are you sure the condition is wrong?

In my understanding, a row with that part_number should be omitted - my condition does use "in (...) is false" whereas the original condition is "is null or not in (...)". I would consider this a logically equivalent expression.

Besides that: I should possibly not post about three-valued logic around midnight, as I've done yesterday...

Breck_Carter
Participant
0 Kudos

> are you sure the condition is wrong?

Pretty sure, although my explanation is also wrong 🙂

Let's try again...

The original predicate was "im.part_number not in ( '10098', '10099', '10032' )"

(I forgot it was NOT IN rather than IN)

Anyway...

Suppose im.part_number is NULL because of the LEFT OUTER JOIN... the obvious desire is to still include the row.

(Never mind the case where im.part_number is NULL because, well, the column contains a NULL... nobody uses NULL as actual column values, do they? 🙂

The proposed predicate is "im.part_number in ( '10098', '10099', '10032' ) is false".

The result of the IN will be UNKNOWN, which means the result of the IS FALSE will be FALSE.

If you want it to be TRUE, then maybe the predicate should be "im.part_number in ( '10098', '10099', '10032' ) is not false"... in that case, the IN is UNKNOWN, which means the result of IS NOT FALSE is TRUE.

Sadly, that ONLY makes it work when im.part_number is NULL, but you already had "im.part_number IS NULL" in the original predicate with the OR.

The IS NOT FALSE solution does NOT work when im.part_number is not NULL... it is the opposite of what you want (FALSE when im.part_number = '99999', and TRUE when it is '10098').

> not that easy to understand

Correct. Absolutely, positively, one hundred percent correct.

Answers (0)