Solved: Re: Select statement Table Join sequence (Performa...

Former Member · ‎2006 Feb 16

Hi,

I have 3 tables Header (Hdr), Item (Itm) and Schedule Line (Sdl) table. The number of entries is max in Sdl table (Eg: Hdr = 1000; Item = 20000 and Sdl = 200000 records). I need to join these 3 tables in my SELECT statement to fetch the required records.

Case A:

-

Header AS HDR

JOIN Item AS ITM

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

JOIN ScheduleLine AS SDL

ON ITM_{ITMID = SDL}ITMID

Case B:

-

ScheduleLine AS SDL

JOIN Item AS ITM

ON ITM_{ITMID = SDL}ITMID

JOIN Header AS HDR

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

Questions:

-

1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

Appreciate your help.

Thanks,

Karthik

Former Member · ‎2006 Feb 16

1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

Yes. Since the matching records are less in case of join of Hdr and Item.

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

It may also depend on a variety of factors like the keys used, buffering etc on the database used.

Former Member · ‎2006 Feb 16

1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

Yes. Since the matching records are less in case of join of Hdr and Item.

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

It may also depend on a variety of factors like the keys used, buffering etc on the database used.

hymavathi_oruganti · ‎2006 Feb 16

yes, what suggested above is right i feel.

former_member186741 · ‎2006 Feb 16

this isn't addressing your question and I don't know the structure of your tables so I could be wrong, but I would have expected the join to have been coded like this:

Header AS HDR

JOIN Item AS ITM

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

JOIN ScheduleLine AS SDL

ON HDR_{ORDID = SDL}ORDID

AND HDR_{VRSIOID = SDL}VRSIOID

AND ITM_{ITMID = SDL}ITMID

......

surely the scheule join needs to specify the header key fields too (unless the item id is absolutely unique)??

Former Member · ‎2006 Feb 16

Karthik,

Aren't we missing the where clause?

The sequence of join is usually dictated by the table you want to clamp records on.

As to your second point, if the same indexes have been created, then the same behavior can be expected of different databases. Over a period of time they level out.

santosh_patil6 · ‎2006 Feb 16

One more point to add is

the order of field comparisons in the WHERE clause does add to the query efficiency.

former_member186741 · ‎2006 Feb 16

I believe that the performance discrepancy between the two versions is caused by you not specifying the joins correctly (i.e. not qualifying the header and item at schedule level as I mentioned previously).

CAse A.

-

Header AS HDR

JOIN Item AS ITM

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

JOIN ScheduleLine AS SDL

ON ITM_{ITMID = SDL}ITMID

reads all 1000 hDRs, joins all 20000 items correctly to these BUT then joins schedule lines only by item id so there will be a very large number of matches. i.e. any item with itemid 10 will match to EVERY SDL with itemid 10. as there are 20000 items and 200000 schedule lines this will be massive!

Case B:

-

ScheduleLine AS SDL

JOIN Item AS ITM

ON ITM_{ITMID = SDL}ITMID

JOIN Header AS HDR

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

This will not have the same amount of DB traffic as the SDL is now the driving force. It will still match more than you intend but will only be reading the items redundantly trather than the more voluminous schedules.

................................

Basically, your sql is wrong and should qualify the matches right down to schedule line level as per my earlier post. When you have changed the sql, try the two alternatives again and see if there is any performance differences. I suspect that there will not be, but it will be interesting to find out for sure.

Former Member · ‎2006 Feb 17

Hi Neil,

Actually I had not specified all the where conditions. Below is the complete query.

SELECT SDL_{ORDID SDL}ITMID FROM

Header AS HDR JOIN Item AS ITM

ON HDR_{ORDID = ITM}ORDID

AND HDR_{VRSIOID = ITM}VRSIOID

JOIN ScheduleLine AS SDL

ON ITM_{ITMID = SDL}ITMID

APPENDING CORRESPONDING FIELDS OF TABLE et_ordid

FOR ALL ENTRIES IN it_itmmatlocto

WHERE

itm~vrsioid = iv_vrsioid

AND itm~ortype = iv_ortype

AND itm~matid = it_itmmatlocto-matid

AND itm~locidto = it_itmmatlocto-locidto

AND HDR~PRTIDFR IN IT_HDRPRTFR

AND HDR~PRTIDTO IN IT_HDRPRTTO

AND SDL~ORDNO IN IT_ORDNO0

Here all the internal tables (it_itmmatlocto, IT_HDRPRTFR, IT_HDRPRTTO and IT_ORDNO0) has proper values which will reduce the selected records. Even though th selection is reduced WHERE condition, there is significant difference between Case A and Case B. As said earlier, Case A times out and Case B gets executed immediately.

Former Member · ‎2006 Feb 17

Hi Karthik,

Is your question answered.

If yes, reward points and close the thread.

former_member186741 · ‎2006 Feb 17

yes, but I think the performance will improve if you specify the JOIN of the schedule lines in more detail.

ON ITM_{ITMID = SDL}ITMID

should be

ON SDL_{ORDID = ITM}ORDID

AND SDL_{VRSIOID = ITM}VRSIOID

and ITM_{ITMID = SDL}ITMID

...

joining only on itmid means there will be thousands of inncorrect matches.

Former Member · ‎2006 Feb 17

If that is the case why is the performance Case B really fast. That is changing the Join Order sequence improves the performance.

If what you say is the reason then in both cases the performance should be really slow.

former_member186741 · ‎2006 Feb 17

it makes sense that case b is faster than case a.

In case A the db system has to do multiple passes of the most voluminous table (SDL) to find the matches. In case b it is still wrong but will have to do less db passes because SDL is only passed once and the smaller tables are matched to it.

Change your joins and watch the improvement.

Imagine we were not talking about sql but abap loops instead. If we had:

case A.

loop at hdr.

loop at itm.

loop at sdl.

endloop.

case B.

loop at sdl.

loop at itm.

loop at hdr.

endloop.

Case A will be a lot slower than Case B and for similar reasons to the sql. CAse B only has to loop through the biggest table once.

By Category

Related Content

Activity Groups

Industry Groups

Influence and Feedback Groups

Interest Groups

Location Groups

Customer Only Groups

Forums

Related Resources

Products

Learning and Support

About

My SAP Profile

My SAP Profile

Select statement Table Join sequence (Performance)