Solved: Select Join Performance

Former Member · ‎2011 Feb 08

Hi,

I have a performance problem with this select statement: Takes 30 to 50 secs.

    SELECT k~ebeln  k~lifnr  k~ekgrp  k~ekorg  k~bukrs  k~bsart
           k~bstyp  k~bedat  k~reswk  k~spras  k~adrnr
           k~waers p~ebelp  p~pstyp  p~werks  p~matnr  p~matkl  p~txz01
           p~idnlf  p~labnr
           p~elikz  p~erekz  p~knttp  p~vrtkz  p~meins  p~mfrpn
            p~effwr p~netpr p~peinh p~bprme
           t~etenr  t~eindt  t~menge  t~wemng  t~slfdt  t~mahnz
           INTO CORRESPONDING FIELDS OF TABLE g_i_tab1
           FROM  ekko    AS k
           JOIN  ekpo    AS p
           ON    p~ebeln =  k~ebeln
           JOIN  eket    AS t
           ON    t~ebeln =  k~ebeln  AND
                 t~ebelp =  p~ebelp
           WHERE k~lifnr IN s_lifnr  AND
                 k~ekorg IN s_ekorg  AND
                 k~ekgrp IN s_ekgrp  AND
                 k~bedat IN s_bedat  AND
                 k~ebeln IN s_ebeln  AND
                 k~bstyp IN s_bstyp  AND
                 k~bsart IN s_bsart  AND
                 k~loekz =  space    AND
                 p~matnr IN s_matnr  AND
                 p~werks IN s_werks  AND
                 p~bstyp IN s_bstyp  AND
                 p~loekz =  space    AND
                 p~elikz IN r_elikz  AND
                 p~matkl IN s_matkl  AND
                 p~pstyp IN s_pstyp  AND
                 p~knttp IN s_knttp  AND
                 p~kanba <> 'Y'      AND
                 t~eindt IN s_eindt  AND
                 t~menge >  0        AND
                 t~menge >  t~wemng.

The SQL-Trace (ST05) shows two nested loops, the selection starts always with EKKO

I wonder whether I can force the System to start the selection with Table EKPO, using a certain Index?

(When I count the number of entries with SE16N it takes only 1 sec and I should have the Unique Index to read the other two tables.)

What can I do?

Best regards

Jens

Former Member · ‎2011 Feb 08

Can you indicate your mandatory selection parameters? What is the index that it is currently hitting?

If Vendor or BEDAT is filled, I would assume that your query always go after the respective indexes... If you can always ensure that Vendor, EKORG, EKGRP and BEDAT is populated, your query should be faster or BSTYP and BEDAT would give you good results?

Former Member · ‎2011 Feb 08

Can you indicate your mandatory selection parameters? What is the index that it is currently hitting?

If Vendor or BEDAT is filled, I would assume that your query always go after the respective indexes... If you can always ensure that Vendor, EKORG, EKGRP and BEDAT is populated, your query should be faster or BSTYP and BEDAT would give you good results?

volker_borowski2 · ‎2011 Feb 08

Hi Jens,

Hints are DB specific. What is your DB type ?

For Oracle, I can give some suggestions:

You need to do an SQL trace to catch the exact statement.

Put the statement in ST05 and use the "explain with hint" function to check the functionality.

Try to use as less as possible hints in a statement

For a join, you might want to use "leading(table-alias)" first to check if this is sufficiant.

That might look like "leading(k)" in your case.

If that is not good enough, you can give more hints like


leading(k) index(k,"EKKO~Z01")

suggest a nested loop of table EKPO utilizing the PK:


leading(k) index(k,"EKKO~Z01") use_nl(k,p) index(p,"EKPO~0")

For more hints use google to find what is possible.

To bring it in Open SQL, you need to append

HINTS ORACLE "Hint-Text"

to your ABAP statement. Validate in SQL Trace, that it is executed correctly.

Hope this helps

Volker

Former Member · ‎2011 Feb 08


 WHERE k~lifnr IN s_lifnr  AND
                 k~ekorg IN s_ekorg  AND
                 k~ekgrp IN s_ekgrp  AND
                 k~bedat IN s_bedat  AND
                 k~ebeln IN s_ebeln  AND
                 k~bstyp IN s_bstyp  AND
                 k~bsart IN s_bsart  AND
                 k~loekz =  space    AND
                 p~matnr IN s_matnr  AND
                 p~werks IN s_werks  AND
                 p~bstyp IN s_bstyp  AND
                 p~loekz =  space    AND
                 p~elikz IN r_elikz  AND
                 p~matkl IN s_matkl  AND
                 p~pstyp IN s_pstyp  AND
                 p~knttp IN s_knttp  AND
                 p~kanba  'Y'      AND
                 t~eindt IN s_eindt  AND
                 t~menge >  0        AND
                 t~menge >  t~wemng.

This is never understood in this forum, a SELECT like yours is dynamic ... the ranges are dynamic and usually only a few are filled.

All conditions for k = ekko can be empty or all conditions for p = ekpo can be empty. So the starting table is not fixed.

Of course any other combination of conditions is possible. And not all combinations can be supported.

Usually the combinations, which make sense, are supported by indexes.

So your question should give the DB-statement which shows only the non-empty ranges!

Hints can not help here!

Siegfried

HermannGahm · ‎2011 Feb 08

Hi Jens,

>


> What can I do?

as the others said we need the data from ST05. How does your statement look like in the DB.

Additionally the execution plan and indexes might help as well.

On some databases (e.g. ORACLE and DB6) you can control very detailed how the statement

should be executed. Other databases do not offer such a good control but more generic hints.

On the DB6 plattform we have conditional hints (check out SAP note 1484907). Having said that

i disagree with Siegfried that hints will not help here in general. With the conditional hints on DB6

it would be ok. I hope the conditional hints will follow for other DB plattforms. Without them you

would have to do this "conditional thing" yourself in ABAP which is dynamically possible as well

since a hint is just a string.

However, hinting will help you only on some plattforms. And then only for specific cases and

you have to be very precise and make sure you know what you do.

Gernerally you should find the root cause why your query is slow? Is the table with the smallest

result set processed as the first tabel? If not, why not? If better execution plans are available but

not picked up, get in contact with a DBA. They usually know the possible workarounds. Hinting

is one of them but there are others... .

If you want more help from this forum please post more details... execution plan, indexes, ... .

Kind regards,

Hermann

Former Member · ‎2011 Feb 09

Hi,

thanks for your ansers so far. Hope this aditional information will help to find a good solution:

Mandatory are only the following values:

EKPO-WERKS EKPO-ELIKZ EKET-EINDT

This values are fixed in the select statement:

EKPO-LOEKZ EKPO-KANBA EKET-MENGE EKET-WEMNG

Usually the selection is executed at lease with some of these valuels:

EKKO-BSTYP EKKO-LIFNR EKKO-EKGRP EKKO-EKORG EKPO-MATNR

Database: ORACLE 10.2.0.4.0

ST05 looks like this: (with EKKO-LIFNR -the mainly used selection)

SQL Statement

SELECT

T_00 . "EBELN" , T_00 . "LIFNR" , T_00 . "EKGRP" , T_00 . "EKORG" , T_00 . "BUKRS" ,

T_00 . "BSART" , T_00 . "BSTYP" , T_00 . "BEDAT" , T_00 . "RESWK" , T_00 . "SPRAS" ,

T_00 . "ADRNR" , T_00 . "WAERS" , T_01 . "EBELP" , T_01 . "PSTYP" , T_01 . "WERKS" ,

T_01 . "MATNR" , T_01 . "MATKL" , T_01 . "TXZ01" , T_01 . "IDNLF" , T_01 . "LABNR" ,

T_01 . "ELIKZ" , T_01 . "EREKZ" , T_01 . "KNTTP" , T_01 . "VRTKZ" , T_01 . "MEINS" ,

T_01 . "MFRPN" , T_01 . "EFFWR" , T_01 . "NETPR" , T_01 . "PEINH" , T_01 . "BPRME" ,

T_02 . "ETENR" , T_02 . "EINDT" , T_02 . "MENGE" , T_02 . "WEMNG" , T_02 . "SLFDT" ,

T_02 . "MAHNZ"

FROM

"EKKO" T_00 , "EKPO" T_01 , "EKET" T_02

WHERE

( T_01 . "MANDT" = :A0 AND T_01 . "EBELN" = T_00 . "EBELN" ) AND ( T_02 . "MANDT" = :A1 AND T_02

. "EBELN" = T_00 . "EBELN" AND T_02 . "EBELP" = T_01 . "EBELP" ) AND T_00 . "MANDT" = :A2 AND

T_00 . "LIFNR" = :A3 AND T_00 . "EKORG" = :A4 AND T_00 . "BSTYP" IN ( :A5 , :A6 ) AND T_00 .

"LOEKZ" = :A7 AND T_01 . "WERKS" = :A8 AND T_01 . "BSTYP" IN ( :A9 , :A10 ) AND T_01 . "LOEKZ" =

:A11 AND T_01 . "ELIKZ" = :A12 AND T_01 . "KANBA" <> :A13 AND T_02 . "EINDT" BETWEEN :A14 AND

:A15 AND T_02 . "MENGE" > :A16 AND T_02 . "MENGE" > T_02 . "WEMNG"

Execution Plan

SELECT STATEMENT ( Estimated Costs = 19 , Estimated #Rows = 1 )

9 FILTER

8 NESTED LOOPS

( Estim. Costs = 19 , Estim. #Rows = 1 )

Estim. CPU-Costs = 171.394 Estim. IO-Costs = 19

5 NESTED LOOPS

( Estim. Costs = 15 , Estim. #Rows = 9 )

Estim. CPU-Costs = 135.184 Estim. IO-Costs = 15

2 TABLE ACCESS BY INDEX ROWID EKKO

( Estim. Costs = 8 , Estim. #Rows = 9 )

Estim. CPU-Costs = 71.777 Estim. IO-Costs = 8

1 INDEX RANGE SCAN EKKO~1

( Estim. Costs = 1 , Estim. #Rows = 45 )

Search Columns: 3

Estim. CPU-Costs = 6.113 Estim. IO-Costs = 1

4 TABLE ACCESS BY INDEX ROWID EKET

( Estim. Costs = 1 , Estim. #Rows = 1 )

Estim. CPU-Costs = 7.045 Estim. IO-Costs = 1

3 INDEX RANGE SCAN EKET~0

Search Columns: 2

Estim. CPU-Costs = 3.339 Estim. IO-Costs = 0

7 TABLE ACCESS BY INDEX ROWID EKPO

Estim. CPU-Costs = 4.023 Estim. IO-Costs = 0

6 INDEX UNIQUE SCAN EKPO~0

Search Columns: 3

Estim. CPU-Costs = 1.804 Estim. IO-Costs = 0

Used index: EKKO~1

MANDT

LIFNR

EKORG

EKGRP

BEDAT

Best regards

Jens

Edited by: jeman69 on Feb 9, 2011 10:22 AM

HermannGahm · ‎2011 Feb 09

Hi Jens,

and this selection

WHERE

( T_01 . "MANDT" = :A0 AND T_01 . "EBELN" = T_00 . "EBELN" )

AND ( T_02 . "MANDT" = :A1 AND T_02 . "EBELN" = T_00 . "EBELN" AND T_02 . "EBELP" = T_01 . "EBELP" )

AND T_00 . "MANDT" = :A2

AND T_00 . "LIFNR" = :A3

AND T_00 . "EKORG" = :A4

AND T_00 . "BSTYP" IN ( :A5 , :A6 )

AND T_00 ."LOEKZ" = :A7

AND T_01 . "WERKS" = :A8

AND T_01 . "BSTYP" IN ( :A9 , :A10 )

AND T_01 . "LOEKZ" = :A11

AND T_01 . "ELIKZ" = :A12

AND T_01 . "KANBA" :A13

AND T_02 . "EINDT" BETWEEN :A14 AND :A15

AND T_02 . "MENGE" > :A16

AND T_02 . "MENGE" > T_02 . "WEMNG"

runs 50 seconds?

at first glance the execution plan (starting with lifnr) looks nice for me.

Have you checked if other options? it does not look like ekpo (T_01) would

be a better choice as a starting point in this case.

You can check the result sets for all 3 tables in SE16 with the given selection

criteria. The optimizer should start on the smallest result set. This should be supported

by an index. If LIFNR provides the smalles result set and we have index suport here

the entry looks fine for me.

Kind regards,

Hermann

Former Member · ‎2011 Feb 09

Hi Hermann,

Yes this runs depending on the Number of Purchase Orders and Sheduling Agreements too long.

LIFNR with 12.000 EKKO-Entries: 90s

LIFNR with 4.500 EKKO-Entries: 20s

If I count the number of Entries of EKKO und EKPO using the selection criteria, I get the following runtimes in SE16N:

EKKO with 4500 Entries: 8s (number differs from 0 to 12.000 depending on LIFNR)

EKPO with 6200 Entries: 4s (number is constant per plant)

Total number of entries in the System:

EKKO: 450.000

EKPO: 1.100.000

Best Regards

Jens

volker_borowski2 · ‎2011 Feb 09

Hi,

how does the plan change when you hint it with "leading(T_01)" in ST05 as your first suggestion was?

Are the stats up to date? The initial guess for the "~1" index entry is 35 rows,

which appears quite diffrent to the 4500 you mention as the reality.

How many rows do you get for a typical

AND T_02 . "EINDT" BETWEEN :A14 AND :A15

on table EKET in SE16, and how long does that take?

Volker

HermannGahm · ‎2011 Feb 10

Hi Jens,

> Yes this runs depending on the Number of Purchase Orders and Sheduling Agreements too long.

> LIFNR with 12.000 EKKO-Entries: 90s

> LIFNR with 4.500 EKKO-Entries: 20s

uneqal data distribution

> If I count the number of Entries of EKKO und EKPO using the selection criteria, I get the following runtimes in SE16N:

> EKKO with 4500 Entries: 8s (number differs from 0 to 12.000 depending on LIFNR)

> EKPO with 6200 Entries: 4s (number is constant per plant)

combined with a join.

Challenging. For some LIFNRs the EKKO is the best entry (for those that have less rows than the other table result sets) and

for some EKKO is a suboptimal entry (for those that have more rows than the other table result sets). Theoretically histograms should help here but in this scenario i don't think it is a good option (depends of course).

You can of course choose EKPO (or maybe EKET, see Volkers questions) as the default entry as a tradeoff. If they are rather constant it would be an good option for LIFNRs with a big result set. Only for LIFNRs with small result sets there would be better options (starting with EKKO available). You can make some tests with hints and LIFNRs with big and small volumes choosing EKKO or EKPO as the starting point and compare the run times.

Kind regards,

Hermann

Former Member · ‎2011 Feb 10

Hi Volker,

Last statistiks date is 27.11.2010. (Should be OK)

Hint in ST05 seams to work:

SQL Statement

SELECT

/*+

leading(T_01)

*/

T_00 . "EBELN" , T_00 . "LIFNR" , T_00 . "EKGRP" , T_00 . "EKORG" , T_00 . "BUKRS" ,

T_00 . "BSART" , T_00 . "BSTYP" , T_00 . "BEDAT" , T_00 . "RESWK" , T_00 . "SPRAS" ,

T_00 . "ADRNR" , T_00 . "WAERS" , T_01 . "EBELP" , T_01 . "PSTYP" , T_01 . "WERKS" ,

T_01 . "MATNR" , T_01 . "MATKL" , T_01 . "TXZ01" , T_01 . "IDNLF" , T_01 . "LABNR" ,

T_01 . "ELIKZ" , T_01 . "EREKZ" , T_01 . "KNTTP" , T_01 . "VRTKZ" , T_01 . "MEINS" ,

T_01 . "MFRPN" , T_01 . "EFFWR" , T_01 . "NETPR" , T_01 . "PEINH" , T_01 . "BPRME" ,

T_02 . "ETENR" , T_02 . "EINDT" , T_02 . "MENGE" , T_02 . "WEMNG" , T_02 . "SLFDT" ,

T_02 . "MAHNZ"

FROM

"EKKO" T_00 , "EKPO" T_01 , "EKET" T_02

WHERE

( T_01 . "MANDT" = :A0 AND T_01 . "EBELN" = T_00 . "EBELN" ) AND ( T_02 . "MANDT" = :A1 AND T_02

. "EBELN" = T_00 . "EBELN" AND T_02 . "EBELP" = T_01 . "EBELP" ) AND T_00 . "MANDT" = :A2 AND

T_00 . "LIFNR" = :A3 AND T_00 . "EKORG" = :A4 AND T_00 . "BSTYP" IN ( :A5 , :A6 ) AND T_00 .

"LOEKZ" = :A7 AND T_01 . "WERKS" = :A8 AND T_01 . "BSTYP" IN ( :A9 , :A10 ) AND T_01 . "LOEKZ" =

:A11 AND T_01 . "ELIKZ" = :A12 AND T_01 . "KANBA" <> :A13 AND T_02 . "EINDT" BETWEEN :A14 AND

:A15 AND T_02 . "MENGE" > :A16 AND T_02 . "MENGE" > T_02 . "WEMNG"

Execution Plan

SELECT STATEMENT ( Estimated Costs = 4.445 , Estimated #Rows = 1 )

9 FILTER

8 TABLE ACCESS BY INDEX ROWID EKET

( Estim. Costs = 1 , Estim. #Rows = 1 )

Estim. CPU-Costs = 4.591 Estim. IO-Costs = 1

7 NESTED LOOPS

( Estim. Costs = 4.444 , Estim. #Rows = 1 )

Estim. CPU-Costs = 88.575.088 Estim. IO-Costs = 4.432

5 HASH JOIN

( Estim. Costs = 4.439 , Estim. #Rows = 9 )

Estim. CPU-Costs = 88.533.765 Estim. IO-Costs = 4.427

2 TABLE ACCESS BY INDEX ROWID EKPO

( Estim. Costs = 4.430 , Estim. #Rows = 4.672 )

Estim. CPU-Costs = 83.959.017 Estim. IO-Costs = 4.419

1 INDEX SKIP SCAN EKPO~1

( Estim. Costs = 1.983 , Estim. #Rows = 14.015 )

Search Columns: 4

Estim. CPU-Costs = 55.287.075 Estim. IO-Costs = 1.975

4 TABLE ACCESS BY INDEX ROWID EKKO

( Estim. Costs = 8 , Estim. #Rows = 9 )

Estim. CPU-Costs = 71.777 Estim. IO-Costs = 8

3 INDEX RANGE SCAN EKKO~1

( Estim. Costs = 1 , Estim. #Rows = 45 )

Search Columns: 3

Estim. CPU-Costs = 6.113 Estim. IO-Costs = 1

6 INDEX RANGE SCAN EKET~0

Search Columns: 3

Estim. CPU-Costs = 3.059 Estim. IO-Costs = 0

Although the estimates cost are very high I would like to test it.

How is the ABAP-Syntax for the Hint of the Select-Statement?

How do I get comparable results, since a second execution is always a lot faster ?

(Do I have to invalidate the Buffer for the tables? If so, how do I do this?)

Sorry for having so many questions.

typical T_02 . "EINDT" BETWEEN :A14 AND :A15

About 50.000 Entries, takes 14secs

Best regards

Jens

HermannGahm · ‎2011 Feb 10

Hi Jens,

> Although the estimates cost are very high I would like to test it.
> How is the ABAP-Syntax for the Hint of the Select-Statement?
> How do I get comparable results, since a second execution is always a lot faster ? 
> (Do I have to invalidate the Buffer for the tables? If so, how do I do this?)
> Sorry for having so many questions.

in ABAP you need:

SELECT k_{ebeln k}lifnr k_{ekgrp k}ekorg k_{bukrs k}bsart

... t_{etenr t}eindt t_{menge t}wemng t_{slfdt t}mahnz

INTO CORRESPONDING FIELDS OF TABLE g_i_tab1

FROM ekko AS k

JOIN ekpo AS p

ON p_{ebeln = k}ebeln

JOIN eket AS t

ON t_{ebeln = k}ebeln AND

t_{ebelp = p}ebelp

WHERE k~lifnr IN s_lifnr AND

k~ekorg IN s_ekorg AND

... t~menge > 0 AND

t_{menge > t}wemng

%_hints ORACLE 'LEADING "EKPO"'.

(or %_hints ORACLE 'LEADING "T_01"'.)

execute the queries multiple times and compare the buffergets per execution in ST04. In ORACLE you can reset the sql cache kpis after each test set (multiple executions) and analyze the delta since reset.

You can try with and without hint for big and small LIFNRs.

Kind regards,

Hemrann

volker_borowski2 · ‎2011 Feb 10

Hi,

I do not think the skip scan entry for this plan will lead to anything.

So you have

EKKO with 4500 Entries: 8s (number differs from 0 to 12.000 depending on LIFNR)

EKPO with 6200 Entries: 4s (number is constant per plant)

EKET About 50.000 Entries, takes 14secs

this is 26 for counting in total

... and a join delivering real fields takes 30 to 50 Secs

This does not sound all too bad taking the above 3 values into account

You did not say yet, how many rows the result has.

So to see, how many rows are eliminated in the join

Depending on that, one could try to change the join strategy.

One could try to hint the join from "nested loop" to a "sort merge" or a "hash", just

to measure, if this gets any results, but I would do this in sqlplus with autotrace on

to get the additonal block statistic for the fetch.

But I do not think, this will give you better results as well.

Volker

Former Member · ‎2011 Feb 11

Hi,

wrote a test-programm containing only the select-statement with and without hint.

 %_hints ORACLE 'leading(T_00) index(T_00,"EKPO~1")'.

Hint is working, as I can see in ST05.

Performance without hint is better than with hint.(2-3 times)

Executed the the Report a number of times (with hint):

First execution: 60sec

Following executions: 0.6sec

Executed the the Report a number of times (without hint):

First execution: 30sec

Following executions: 0.2sec

ST04 shows, that in the following executions data is read from database data buffer. (No physical reads)

If the data were alway be in the database buffer the runtime would be perfect.

How can I do this, or is this against all principals?

I do not have direct acess to the ORACLE database. (Only via SAP)

Can I delete the database cache so that I can retest the excution time for the first execution?

Side effects?

Thypically the select results in 10 to 100 rows.

One more Question:

In the development system ST05 - Execution plan contains a HASH JOIN.

In the integration system ST05 - Execution plan contains a NESTED LOOPS instead.

What does the ABAP-Syntax look like to hint the Database to perform NESTED LOOPS.

Best regards

Jens Grimmelmann

HermannGahm · ‎2011 Feb 11

Hi Jens,

> Performance without hint is better than with hint.(2-3 times)

as Volker said the original plan starting with LIFNR is probably a quite good one

> If the data were alway be in the database buffer the runtime would be perfect.

> How can I do this, or is this against all principals?

i would not do it. We see that customers do this from time to time (on ORACLE you

have to create a so called KEEP POOL and put your table there, in case you are

interested just google with key words ORACLE KEEP POOL). It would mean

to place all table/index blocks in a dedicated pool with enough space so that

there is a high chance that no displacement will take place. It's a rather drastic

measure. You put away a siginificant part of the db cahe to support a certain

set of sql statements that could mean all other statements may suffer because

they have less cache available. Our recommendation typically is to stic to the

default pool and let ORACLE do the discplacement based on the touch count

in order to achieve a good performance for the whole system.

> I do not have direct acess to the ORACLE database. (Only via SAP)

> Can I delete the database cache so that I can retest the excution time for the first execution?

alter system flush db cache or something like that.... i would use it only

for test purposes...

> Side effects?

nothing is in the cache... like after restarting the database... performance will be

bad for the whole system till the cache is filled again...

> One more Question:

> In the development system ST05 - Execution plan contains a HASH JOIN.

> In the integration system ST05 - Execution plan contains a NESTED LOOPS instead.

> What does the ABAP-Syntax look like to hint the Database to perform NESTED LOOPS.

The appropriate hint would be e.g.


%_hints ORACLE 'leading(T_00) index(T_00,"EKPO~1") use_nl(T_01)'.

in this example you specified to start with t_00 on index ekpo~1 and then use a

nested loop join (use_nl) to approach table t_01 (oracle decides the access path for

this table). you could continue with use_nl(T_02) to specify how the third table

should be joined. use_hash forces oracle to use a hash join if technically possible.

How much I/O is your query doing (buffer gets per exec)? If nothing is in cache

you can expect that each buffer get will be a disk read lasting for 6 - 15 ms (depending

on the i/O system) so you can calculate the expected time. Minimizing the i/O is the

only solution... if the i/O is already at a minimum... well...

Kind regards,

Hermann

volker_borowski2 · ‎2011 Feb 11


> The appropriate hint would be e.g.
> 
> 
> %_hints ORACLE 'leading(T_00) index(T_00,"EKPO~1") use_nl(T_01)'. 
> 
>

Well the problem is, that you have only "leading" and "ordered" as hints for join order.

so even with that hint, it only suggests to start with EKKO and to use a nested loop

to EKPO, but it will not say WHEN, so it might be that with this hint you'll get right the original plan,

joining EKET at second position.

If you want to make sure, that EKPO is joined on position 2, you need to specify the sequence

in the FROM clause and use the "ordered" hint.


%_hints ORACLE 'ordered'.

(With "ordered" meaning to access the tables in the sequence they follow in FROM)

I'd not specify more at first (crosscheck in ST05), right because of the difference you notice

in hash vs. nested_loop. It might be, that for one system hash is better. But if you want to do it anyway:


2. try: %_hints ORACLE 'ordered use_nl(T_00,T_01)'. 
3. try: %_hints ORACLE 'ordered index(T_00,"EKKO~1") use_nl(T_00,T_01)'. 
4. try: %_hints ORACLE 'ordered index(T_00,"EKKO~1") use_nl(T_00,T_01) index(T_01,"EKPO~0")'.

But I think you will not get away without indexing EKET on EINDAT in addition to the join fields.

Just out of curiosity: how many rows full fill this?

T_02 . "MENGE" > T_02 . "WEMNG"

vs.

T_02 . "MENGE" = T_02 . "WEMNG"

vs.

T_02 . "MENGE" < T_02 . "WEMNG"

If this is the condition, that cuts down the rows, one can possibly try to stick that into a functionbased index that

allows it to query the already executed comparison instead of retrieving the rows nd do the comparrison every time

it is called. (Allthough I can not say how this access would have to be translated to ABAP for access).

Is that your code in a customer program or is that a part of standard coding?

Volker

Former Member · ‎2011 Feb 11

Hi,

Data buffer reads per execution (ST04):

First Example:(High number of Entries for EKKO-LIFNR)

26.000 (with hint)(BD is doing HASH JOIN)

35.000 (with hint incl. Nested Loops)

50.000 (without hint)

Second Example: (Medium number of Entries for EKKO-LIFNR)

21.000 (with hint) (BD is doing HASH JOIN)

34.000 (with hint incl. Nested Loops)

26.000 (without hint)

Third Example: (Very low number of Entries for EKKO-LIFNR)

13.000 (with hint) (BD is doing HASH JOIN)

31.000 (with hint incl. Nested Loops)

500 (without hint)

Conclusion:

To force the Database to do a Nested Loop insead of a Hash join is a bad Idea.

To force the database to start the selection with EKPO only in certaim cases sensible.

Seams that there is no solution.

Best regards

Jens

HermannGahm · ‎2011 Feb 11

Hi Volker,

> Well the problem is, that you have only "leading" and "ordered" as hints for join order.

> so even with that hint, it only suggests to start with EKKO and to use a nested loop

> to EKPO, but it will not say WHEN, so it might be that with this hint you'll get right the original plan,

> joining EKET at second position.

are you sure about that?

My understanding so far was that it should work. But I'm not too sure i must admit.

Looking at things like this:

http://jonathanlewis.wordpress.com/2007/01/16/full-hinting/

i got the impression it should work this way.

> If you want to make sure, that EKPO is joined on position 2, you need to specify the sequence

> in the FROM clause and use the "ordered" hint.

agreed. "ordered" nails down the execution order for sure.

Kind regards,

Hermann

p.s.: i really like your engagement here, it's always interesting and enriching reading from you.

HermannGahm · ‎2011 Feb 11

Hi Jens,

and the runtimes do colerate with the buffer gets?

Well the main problem is the unequal data distribution. Histograms were designed to help here.

But histograms in OLTP systems are... 'difficult' since you could produce many different sql

queries easily (one for each LIFNR)... . So i doubt it would be a good idea.

I think you have to find out a (one) plan that fits all more or less and hint the optimizer to execute it.

Looking at your figures: Hash join + hint is in all cases the 2nd best option regarding I/O.

Kind regards,

Hermann

volker_borowski2 · ‎2011 Feb 11

Hi Hermann,

no, I am not sure. I never did it diffrent becasue of this

http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/hintsref.htm

16.1.2.4 Hints for Join Orders

16.1.2.5 Hints for Join Operations

JL is pretty good, so I guess he did a good research on that.

Volker

Former Member · ‎2011 Feb 11

Hi,


Just out of curiosity: how many rows full fill this?
T_02 . "MENGE" > T_02 . "WEMNG"
vs.
T_02 . "MENGE" = T_02 . "WEMNG"
vs.
T_02 . "MENGE" < T_02 . "WEMNG"

Total: 719.000 ( filterd a old not relvant entries)

MENGE > WEMNG : 102.000

MENGE = WEMNG: 566.000

MENGE < WEMNG : 51.000


Is that your code in a customer program or is that a part of standard coding?

The code is part of a customer program .

For Clarification:

I changed the order of From to EKPO, EKKO, EKET within slect statement.

The displayed HINT has forced the database to start the selection with EKPO~1.

The hint ist working as intended but not improving the performance.


         FROM  ekpo    AS p
         JOIN  ekko    AS k
         ON    p~ebeln =  k~ebeln
         JOIN  eket    AS t
         ON    t~ebeln =  p~ebeln  AND
               t~ebelp =  p~ebelp

Best regards

Jens

volker_borowski2 · ‎2011 Feb 11


> Total: 719.000 ( filterd a old not relvant entries)
> MENGE > WEMNG : 102.000
> MENGE = WEMNG: 566.000
> MENGE < WEMNG : 51.000

So this brings your selection down to 14% of the selected EKET rows..

Wow, that is something to play with.

I will try to puzzle on that a bit on the weekend.

Are you allowed to create additional indexes?

t~menge > 0

??? can "menge" be negative at all in this case ?

Volker

Former Member · ‎2011 Feb 11

Hi Volker,

Why do you think to start with EKET is a goog idea?

When EKKO-LIFNR ist reducing the number ofrelevant rows in EKKO from 450.000 to 4.500. (1%)

And when EKPO-ELIKZ and EKPO-WERKS are reducing the number of relvant rows in EKPO from 1.100.000 to 6.200 (0,5%)


Are you allowed to create additional indexes?

For very good reasons yes.


t~menge > 0
??? can "menge" be negative at all in this case ?

MENGE can never be negative. But MENGE can be < WEMNG. (e.g. In case a supplier delivers more than ordered)

Best regards

Jens

Former Member · ‎2011 Feb 11

Do not know your exact requirement, Is it possible to make 'BEDAT' required along with Vendor number? may be it will get you better results as long as they are giving a small range?

volker_borowski2 · ‎2011 Feb 11


> Hi Volker,
> 
> Why do you think to start with EKET is a goog idea?

HI Jens,

I did not say that that I'd like to start with EKET.

I'd like to save the get of quite a lot datablocks from EKET in the join, just to find out they are usesles

because of this condition. To handle that condition already from indexfields would help,

but even better would be if one could create some type of function that already retrieves a

precalculated value from a function based index.

I never did that with boolean stuff jet and standard syntax does not allow that, but there might be

some way to work around that. I like thes typ of puzzles, but they are time-consuming,

so I can not do that now (sorry).

Volker

Former Member · ‎2011 Feb 11

Hi Kris,

you are right BEDAT helps a lot.

If it were only für purchase orders it would work.

But I need to consider Scheduling Agreements too, old BEDAT are still relevant.

Best regards

Jens Grimmelmann

Former Member · ‎2011 Feb 11


> you are right BEDAT helps a lot.
> If it were only für purchase orders it would work.
> But I need to consider Scheduling Agreements too, old BEDAT are still relevant.

Hi Jen,

How many scheduling agreements do you have in your system? I would guess probably not a whole lot...

Is there any scope to run this report by individual document category(BSTYP), so that you can execute it separately for scheduling agreements with a bigger range of bedat? If the users are willing to enter a smaller range for Purchase orders (F), there are different ways (including the option to have two dates on the selection screen, one for purchase orders/scheduling agreements?) Not neat, but will improve your performance?

volker_borowski2 · ‎2011 Feb 11

Hi,

could not resist ...

Bringing the " > " comparison into a difference, putting the difference into a FBI and doing a query where difference > 0.


Using SCOTT schema
SQL> create table test as 
     ( select rownum as ID, a.sal as menge, b.sal as m2  
       from emp a, emp b);
Table created.

SQL> analyze table test compute statistics;
Table analyzed.

SQL> select ID, MENGE, M2, MENGE-M2 from TEST;
:
:


SQL> create index test_ix on test ( MENGE - M2 );
Index created.

SQL> select ID, MENGE, M2 , ( MENGE - M2 ) from TEST where (MENGE-M2) > 0;

        ID      MENGE         M2 (MENGE-M2)
---------- ---------- ---------- ----------
       102       3000       2975         25
       172       3000       2975         25
       185       1300       1250         50
       187       1300       1250         50

---------------------------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |    10 |   130 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST    |    10 |   130 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | TEST_IX |     2 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

So at least it can be done. But I do not know how in ABAP

So instead of getting each datablock for real, access both fields to compare them, compute the compare

and discard if no match, you can ACCESS only those, that already fullfill the condition.

May be you need to switch to EXEC SQL

Volker

Former Member · ‎2011 Feb 14

Hi Herman,

I don't realy know whether runtimes correlate with buffer gets.

But the data are strongly indicating, that runtime is defined almost solely by the number of physical reads.

Whether some, many or all data are in database cache available, seems to be of my control.

First test: EKPO_{1 -> EKKO}1 (HASH JOIN) -> EKET~0 (NESTED LOOPS)

1. run: 58 sec (physical reads ???)

2. run: 1.1 sec (physical reads ???)

3. run 0.5 sec (physical reads ???)

4. run 0.5 sec (physical reads ???)

5. run 27 sec (physical reads 10,000) (some time later)

Second test: EKPO_{1 -> EKKO}1 (NESTED LOOPS) -> EKET~0 (NESTED LOOPS)

1. run: 65 sec (physical reads 16,000)

2. run: 0.5 sec (physical reads 0)

3. run 0.5 sec (phyiscal reads 0)

Third test: EKKO_{1 -> EKPO}0 (NESTED LOOPS) -> EKET~0 (NESTED LOOPS)

1. run 69 sec (physical reads 19,000)

2 .run 0.7 sec (physical reads 0)

3. run 0.7 sec (physical reads 0)

Best regards

Jens

Former Member · ‎2011 Feb 14

Neil wrote:


I don't know if anyone has already suggested something like this but it might be worth trying to split the select
like the following to get the key fields.

Is not improving the performance.

Test:

1. run 56 sec (physical reads 18,000) (Selection EKPO: 33 sec / Selection EKPO-EKKO-EKET 23 sec)

2 .run 0.8 sec (physical reads 0)

Volker wrote:


Bringing the " > " comparison into a difference, putting the difference into a FBI and doing a query where difference > 0.
Using SCOTT schema
...

Sorry that is out of my capabilities. Can't verify it.

Kris wrote:


How many scheduling agreements do you have in your system? I would guess probably not a whole lot...
Is there any scope to run this report by individual document category(BSTYP), so that you can execute it separately for scheduling agreements with a bigger range of bedat? If the users are willing to enter a smaller range for Purchase orders (F), there are different ways (including the option to have two dates on the selection screen, one for purchase orders/scheduling agreements?) Not neat, but will improve your performance?

There are 6,000 Schedulling agreemt positions in the System.

Since there seems to be no real solution to reduce the runtime to an aceptable level ( <= 5 sec)

I will follow your idea to select schedulling agreements and purchase orders separately. (training for 30 users required)

For the selection of purchase orders the runtime is considerably reduced through EKKO-BEDAT. (e.g. 9 month out of 7 years)

The selection of schedulling agreementes with or without purchase orders will take some time. (can't help it)

I will close my question tomorrow.

Thanks to all participantes for your help.

Best regards

Jens

volker_borowski2 · ‎2011 Feb 14

Hi Jens,

when you have the EXPLAIN in ST05, you can click on each table and then on the button

"index statistics" and you will get some information about how big your objects are.

If the tables are not all to big overall, you might consider setting the cache attribute

on some objects. (alter table ... / alter index ... )

For tables the relevant keyfield is "blocks", for indexes it is "leaf blocks".

If the tables are very big, you should not do it this way, because your system might suffer,

because the buffer cache will be smaller for the rest of the system.

You can then check, if it gives you benefit to set up a keep pool as Hermann suggested.

Volker

Former Member · ‎2011 Feb 14

Hi Volker,

I don't get any clues out of the data:

If the corresponding data of EKKO and EKET are of any help, I am happy to provide them.

Table EKPO

Last statistics date 09.11.2010

Analyze Method Sample 100.910 Rows

Number of rows 1.009.100

Number of blocks allocated 100.206

Number of empty blocks 484

Average space 1.091

Chain count 0

Average row length 686

Partitioned NO

UNIQUE Index EKPO~0

Column Name #Distinct

MANDT 1

EBELN 161.003

EBELP 728

Last statistics date 09.11.2010

Analyze Method Sample 223.623 Rows

Levels of B-Tree 2

Number of leaf blocks 5.172

Number of distinct keys 1.006.814

Average leaf blocks per key 1

Average data blocks per key 1

Clustering factor 195.332

NONUNIQUE Index EKPO~1

Column Name #Distinct

MANDT 1

MATNR 55.811

WERKS 9

BSTYP 4

LOEKZ 4

ELIKZ 2

MATKL 636

Last statistics date 09.11.2010

Analyze Method Sample 119.150 Rows

Levels of B-Tree 2

Number of leaf blocks 9.876

Number of distinct keys 217.916

Average leaf blocks per key 1

Average data blocks per key 4

Clustering factor 879.714

I didn't get the idea with the keep pool, could you comment on it in more detail?

Best regards

Jens

volker_borowski2 · ‎2011 Feb 14


Table   EKPO
> Number of rows                         1.009.100
> Number of blocks allocated               100.206
UNIQUE     Index   EKPO~0
> Levels of B-Tree                               2
> Number of leaf blocks                      5.172
NONUNIQUE  Index   EKPO~1
> Levels of B-Tree                               2
> Number of leaf blocks                      9.876
> I didn't get the idea with the keep pool, could you comment on it in more detail?

Hi Jens,

this simply means, you EKPO has a size of 100206 Block so it is about ~ 800MB in size.

As you observed significant better performance, when the data is in the buffer-cache, there are two ways to achieve this.

a) You can set a cache attribute for the object. This is usefull when the object is small (say 100-200 blocks).

Oracle will keep those objects longer in the cache as they would normally pass out from the

cache on some kind of LRU method. Now with a 800MB Object, this is not a really good option and even the

indexes are to big for my feeling (I was hoping for a few hundred on the indexes).

b) You can set up additional buffer caches. There is the standard cache, which is configured with db_cache_size.

But there are two more, the can be configured, allthough they are not used in general.

The KEEP and the RECYCLE pool.

Once setting up db_keep_size, you will create a second buffer cache of the desired size.

The keep pool will contain no objects, but those that are assigned to ist.

So you can execute

alter table sapsr3.ekpo storage ( buffer_pool KEEP );

and this will instruct the DB to use the keep pool for this table. (Same assignment for indexes needed)

So it is pretty obvious, if this is the only table assigned to this cache, no other object will remove it from the KEEP pool.

The problem is, it is difficult to size it.

Lets see, all the three EKPO objects have 115000 Blocks.

We need EKET and EKKO in addition plus the indexes involved.

As you stated, you merely work on most recent data, you would say once you summed it all up, that you would need

10% (or 20% or 30%) of the overall size for the keep pool to achieve a good performance.

It depends on DB RAM what you can give to this pool.

If the result is ok, you will no longer have the problem that OTHER transactions will throw your valuable blocks

out of the keep pool. You can only do so with objects that do belong to the keep pool. i.E a call of SE16 wich will

read old data will read this data through the KEEP pool as well. This might throw out your interesting blocks and cause

a re-read upon next execution.

Hope this describes the assets.

If your want a bit more background, please do a google with "oracle keep pool".

Volker

former_member186741 · ‎2011 Feb 14

Hi Jens,

out of interest when you tried my approach did you sort the table entries before using them in the FAE clause? I believe that peformance can be slowed if the source FAE table is not in the same order as the select.

Cheers

Neil

Former Member · ‎2011 Feb 15

Neil wrote:


out of interest when you tried my approach did you sort the table entries before using them in the FAE clause? I believe that peformance can be slowed if the source FAE table is not in the same order as the select.

This is what I did: (I Think it's what you suggested)


  TYPES:
  BEGIN OF ekpo_keys,
   ebeln TYPE ekpo-ebeln,
   ebelp TYPE ekpo-ebelp,
  END OF ekpo_keys.

  DATA: g_ekpo_keys TYPE STANDARD TABLE OF  ekpo_keys.

* First selection (EKPO)

SORT g_ekpo_keys BY ebeln ebelp.

* Second selection (EKPO-EKKO-EKET)

Best regards

Jens

Former Member · ‎2011 Feb 15

Selects with inner join are messy and usually not reccomended beyon 2 tables. Break it up into 3 hashed or sorted tables. you do not necessarily have to combine them into 1 table unless you need it for ALV report or file output. You will probably have a main look say at EKKO and then reads at other tables to get the item data. Everything else will work in some cases and won't in other cases. Inner join will cause nested loops as this is the only way it could gather the information you are asking for.

former_member186741 · ‎2011 Feb 15

Hi Jens, could you please post your latest two selects in full. It will be interesting to see the current version.

Cheers, Neil

Former Member · ‎2011 Feb 16

Neil wrote:


Hi Jens, could you please post your latest two selects in full. It will be interesting to see the current version.
Cheers, Neil

Copied your suggested coding.

first select:


  SELECT ebeln ebelp
  INTO CORRESPONDING FIELDS OF TABLE g_ekpo_keys "sorted on ebeln ebelp
  FROM ekpo
  WHERE matnr IN s_matnr AND
  werks IN s_werks AND
  bstyp IN s_bstyp AND
  loekz = space AND
  elikz IN r_elikz AND
  matkl IN s_matkl AND
  pstyp IN s_pstyp AND
  knttp IN s_knttp AND
  kanba <> 'Y' .

second select:


    SELECT k~ebeln k~lifnr k~ekgrp k~ekorg k~bukrs k~bsart
    k~bstyp k~bedat k~reswk k~spras k~adrnr
    k~waers p~ebelp p~pstyp p~werks p~matnr p~matkl p~txz01
    p~idnlf p~labnr
    p~elikz p~erekz p~knttp p~vrtkz p~meins p~mfrpn
    p~effwr p~netpr p~peinh p~bprme
    t~etenr t~eindt t~menge t~wemng t~slfdt t~mahnz
    INTO CORRESPONDING FIELDS OF TABLE g_i_tab1
    FROM ekpo AS p
    JOIN ekko AS k
    ON k~ebeln = p~ebeln
    JOIN eket AS t
    ON    t~ebeln = p~ebeln
      AND t~ebelp = p~ebelp
    FOR ALL ENTRIES IN g_ekpo_keys
    WHERE
    p~ebeln = g_ekpo_keys-ebeln AND
    p~ebelp = g_ekpo_keys-ebelp AND
    k~lifnr IN s_lifnr AND
    k~ekorg IN s_ekorg AND
    k~ekgrp IN s_ekgrp AND
    k~bedat IN s_bedat AND
    k~ebeln IN s_ebeln AND
    k~bstyp IN s_bstyp AND
    k~bsart IN s_bsart AND
    k~loekz = space AND
    t~eindt IN s_eindt AND
    t~menge > 0 AND
    t~menge > t~wemng.

Best regards

Jens

Former Member · ‎2011 Feb 12

.... much too long ...

can someone summarize

HermannGahm · ‎2011 Feb 13

Hi Siegfried,

the data is not equally distributed. Based on the concrete value of LIFNR for some LIFNRs (small resultset) EKKO is the best entry in the join for other LIFNRs (big resultset) EKPO would be a better entry in the join. Histograms are probably no good idea (too many different satements).

Caching (keeping the required data in the cache) was dicussed as well. This would require a signinifcant part of the cach to support queries for these tables.

The pyhsical I/O generated in the various scenarios on EKKO or EKPO depending on LIFNR values is in my opinion not easy to control / optmize. We discussed possible solutions like caching or histograms.

Volker nicely demonstrated that there is as well tuning potential on eket it we

"Bringing the " > " comparison into a difference, putting the difference into a function based indexes and doing a query where difference > 0." But since selects on function based indexes ("select ID, MENGE, M2 , ( MENGE - M2 ) " are not supported on OPEN SQL it would only work for EXEC SQL or ADBC. ORACLE 11 offers virtual columns ... that could work with open sql (never tried so far).

Hope this summarizes the main topics. Any comments or corrections are appreciated.

Kind regards,

Hermann

By Category

Related Content

Activity Groups

Industry Groups

Influence and Feedback Groups

Interest Groups

Location Groups

Customer Only Groups

Forums

Related Resources

Products

Learning and Support

About

My SAP Profile

My SAP Profile

Select Join Performance