Application Development and Automation Discussions
Join the discussions or start your own on all things application development, including tools and APIs, programming models, and keeping your skills sharp.
cancel
Showing results for 
Search instead for 
Did you mean: 
Read only

Select statement Table Join sequence (Performance)

Former Member
0 Likes
1,357

Hi,

I have 3 tables Header (Hdr), Item (Itm) and Schedule Line (Sdl) table. The number of entries is max in Sdl table (Eg: Hdr = 1000; Item = 20000 and Sdl = 200000 records). I need to join these 3 tables in my SELECT statement to fetch the required records.

Case A:

-


Header AS HDR

JOIN Item AS ITM

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

JOIN ScheduleLine AS SDL

ON ITMITMID = SDLITMID

Case B:

-


ScheduleLine AS SDL

JOIN Item AS ITM

ON ITMITMID = SDLITMID

JOIN Header AS HDR

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

Questions:

-


1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

Appreciate your help.

Thanks,

Karthik

1 ACCEPTED SOLUTION
Read only

Former Member
0 Likes
1,271

1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

Yes. Since the matching records are less in case of join of Hdr and Item.

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

It may also depend on a variety of factors like the keys used, buffering etc on the database used.

11 REPLIES 11
Read only

Former Member
0 Likes
1,272

1) The performance of the Select query is really bad with Case A and very good (really fast) with Case B. Will the number of records in the tables used determine the sequence of join in the Select statement ?

Yes. Since the matching records are less in case of join of Hdr and Item.

2) If yes, is this independent of database used ? That is, will the behaviour be the same in all databases (Eg: Oracle, DB2, SQL Server, etc) ?

It may also depend on a variety of factors like the keys used, buffering etc on the database used.

Read only

hymavathi_oruganti
Active Contributor
0 Likes
1,271

yes, what suggested above is right i feel.

Read only

former_member186741
Active Contributor
0 Likes
1,271

this isn't addressing your question and I don't know the structure of your tables so I could be wrong, but I would have expected the join to have been coded like this:

Header AS HDR

JOIN Item AS ITM

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

JOIN ScheduleLine AS SDL

ON HDRORDID = SDLORDID

AND HDRVRSIOID = SDLVRSIOID

AND ITMITMID = SDLITMID

......

surely the scheule join needs to specify the header key fields too (unless the item id is absolutely unique)??

Read only

Former Member
0 Likes
1,271

Karthik,

Aren't we missing the where clause?

The sequence of join is usually dictated by the table you want to clamp records on.

As to your second point, if the same indexes have been created, then the same behavior can be expected of different databases. Over a period of time they level out.

Read only

0 Likes
1,271

One more point to add is

the order of field comparisons in the WHERE clause does add to the query efficiency.

Read only

former_member186741
Active Contributor
0 Likes
1,271

I believe that the performance discrepancy between the two versions is caused by you not specifying the joins correctly (i.e. not qualifying the header and item at schedule level as I mentioned previously).

CAse A.

-


Header AS HDR

JOIN Item AS ITM

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

JOIN ScheduleLine AS SDL

ON ITMITMID = SDLITMID

reads all 1000 hDRs, joins all 20000 items correctly to these BUT then joins schedule lines only by item id so there will be a very large number of matches. i.e. any item with itemid 10 will match to EVERY SDL with itemid 10. as there are 20000 items and 200000 schedule lines this will be massive!

Case B:

-


ScheduleLine AS SDL

JOIN Item AS ITM

ON ITMITMID = SDLITMID

JOIN Header AS HDR

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

This will not have the same amount of DB traffic as the SDL is now the driving force. It will still match more than you intend but will only be reading the items redundantly trather than the more voluminous schedules.

................................

Basically, your sql is wrong and should qualify the matches right down to schedule line level as per my earlier post. When you have changed the sql, try the two alternatives again and see if there is any performance differences. I suspect that there will not be, but it will be interesting to find out for sure.

Read only

0 Likes
1,271

Hi Neil,

Actually I had not specified all the where conditions. Below is the complete query.

SELECT SDLORDID SDLITMID FROM

Header AS HDR JOIN Item AS ITM

ON HDRORDID = ITMORDID

AND HDRVRSIOID = ITMVRSIOID

JOIN ScheduleLine AS SDL

ON ITMITMID = SDLITMID

APPENDING CORRESPONDING FIELDS OF TABLE et_ordid

FOR ALL ENTRIES IN it_itmmatlocto

WHERE

itm~vrsioid = iv_vrsioid

AND itm~ortype = iv_ortype

AND itm~matid = it_itmmatlocto-matid

AND itm~locidto = it_itmmatlocto-locidto

AND HDR~PRTIDFR IN IT_HDRPRTFR

AND HDR~PRTIDTO IN IT_HDRPRTTO

AND SDL~ORDNO IN IT_ORDNO0

Here all the internal tables (it_itmmatlocto, IT_HDRPRTFR, IT_HDRPRTTO and IT_ORDNO0) has proper values which will reduce the selected records. Even though th selection is reduced WHERE condition, there is significant difference between Case A and Case B. As said earlier, Case A times out and Case B gets executed immediately.

Read only

0 Likes
1,271

Hi Karthik,

Is your question answered.

If yes, reward points and close the thread.

Read only

0 Likes
1,271

yes, but I think the performance will improve if you specify the JOIN of the schedule lines in more detail.

ON ITMITMID = SDLITMID

should be

ON SDLORDID = ITMORDID

AND SDLVRSIOID = ITMVRSIOID

and ITMITMID = SDLITMID

...

joining only on itmid means there will be thousands of inncorrect matches.

Read only

0 Likes
1,271

If that is the case why is the performance Case B really fast. That is changing the Join Order sequence improves the performance.

If what you say is the reason then in both cases the performance should be really slow.

Read only

0 Likes
1,271

it makes sense that case b is faster than case a.

In case A the db system has to do multiple passes of the most voluminous table (SDL) to find the matches. In case b it is still wrong but will have to do less db passes because SDL is only passed once and the smaller tables are matched to it.

Change your joins and watch the improvement.

Imagine we were not talking about sql but abap loops instead. If we had:

case A.

loop at hdr.

loop at itm.

loop at sdl.

endloop.

endloop.

endloop.

case B.

loop at sdl.

loop at itm.

loop at hdr.

endloop.

endloop.

endloop.

Case A will be a lot slower than Case B and for similar reasons to the sql. CAse B only has to loop through the biggest table once.