ST05

Former Member · ‎2009 Apr 30

Hi Expert,

Got a few questions here.

1. What does access predicates and filter predicates in ST05/Performance trace means? Does it influence anything when creating an index?

2. When you create an index, does the order of the column/field matters?

Thanks in advance.

HermannGahm · ‎2009 Apr 30

Hi Eida,

> What does access predicates and filter predicates in ST05/Performance trace means?

this information in an (ORACLE, since db specific) execution plan tells you what columns

of your where condition could be used to restrict the number of index pages that need to be

read (access predicates) and what columns of your where condition could be used for filtering

after reading index or table pages (filter predicates). Since the access predicates restrict the

number of pages to be read they minimize i/o. The filter pages do not restict the number of

pages to be read (not minimizing i/o) but they do restrict the amount of data that is sent

from the db to the application server where your ABAP program runs.

> Does it influence anything when creating an index?

Hm... what do you mean by this question? The access and filter predicates just show you

how your where condition is applied on index and/or table blocks... only for existing indexes

and tables only, of course.

> When you create an index, does the order of the column/field matters?

Yes it does. In general put these cloumns that are used with EQUAL (=) or IN ( which could

be seen as a concatenated or iterated equal / = ) operator, at the beginning of an index.

The classic phone book example:

An index is created on LASTNAME, FIRSTNAME, STREET.

This indexes can support the following where conditions:

where lastname = ? and firstname = ?

where lastname = ?

where lastname = ? and street = ?

where lastname = ? and firstname =? and street = ?

(and others with ranges, between, like, ......)

If you would have created the index in a different order:

e.g. FIRSTNAME, LASTNAME, STREET

these 2 where conditions:

where lastname = ?

where lastname = ? and street = ?

would NOT be supported by this index since they are missing

the fistname which is the first field in the street.

So for your frequently executed sql statements put those fields in

the beginning of the index which are used regular with equal (=) or IN condition

(and which are slective of course).

Hope this helps,

Hermann

MarcinPciak · ‎2009 Apr 30

Hi,

As for the second question: yes the order of columns does matter both during creation and usage.

Regards

Marcin

Former Member · ‎2009 May 05

Order does matter and that's why we maintain order in WHERE clause.

ThomasZloch · ‎2009 May 05

>


> Order does matter and that's why we maintain order in WHERE clause.

Please don't mix things up. Order of fields does matter when an index is designed and created (see Hermann's answer), order of the fields in the WHERE-clause does not matter reg. index usage, it might matter reg. program readability, but that's a different story.

Thomas

HermannGahm · ‎2009 May 05

Hi Thomas,

readability is another reason why we should use one order in where clause.

as well as different cursors in statement caches (for different sql strings)

as well as behaviour in st05 regarding identical selects ... all of them are reasons

for trying to use one specific order (e.g. like defined in ddic) of fields in where clause.

cursor cache:

The attached 3 sql statements are 3 different statements regarding their strings.

3 times prepare in st05. So they will be processed in 3 different cursors.

The db optimizer will come up with the SAME execution plan for all 3 statements

and use CPU time for 3 times parsing.

These execution plans will allocate space for 3 statements in the cursor cache

and so on.

st05:

Since the attached 3 sql statements are technically different ST05 shows no

identical selects.

If we would have used only one order for the field list we would have saved some

ressources on db and we would have seen that the selects are identical in ST05...

and yes... the ABAP program would be nicer regarding readability.

Kind regards,

Hermann



  select msgnr arbgb sprsl
  from t100 BYPASSING BUFFER
  into CORRESPONDING FIELDS OF t100
  where
  msgnr = '010' and
  arbgb = 'something' and
  sprsl = 'DE'.

  ENDSELECT.

  select msgnr arbgb sprsl
  from t100 BYPASSING BUFFER
  into CORRESPONDING FIELDS OF t100
  where
  arbgb = 'something' and
  msgnr = '010' and
  sprsl = 'DE'.

  ENDSELECT.

    select msgnr arbgb sprsl
  from t100 BYPASSING BUFFER
  into CORRESPONDING FIELDS OF t100
  where
  sprsl = 'DE' and
  msgnr = '010' and
  arbgb = 'something'.

  ENDSELECT.


      499 T100       PREPARE            0 SELECT WHERE "MSGNR" = :A0 AND "ARBGB" = :A1 AND "SPRSL" = :A2
        5 T100       OPEN               0 SELECT WHERE "MSGNR" = '010' AND "ARBGB" = 'something' AND "SPRSL" = 'D'
   10.638 T100       FETCH       0   1403
      448 T100       PREPARE            0 SELECT WHERE "ARBGB" = :A0 AND "MSGNR" = :A1 AND "SPRSL" = :A2
        5 T100       OPEN               0 SELECT WHERE "ARBGB" = 'something' AND "MSGNR" = '010' AND "SPRSL" = 'D'
    1.594 T100       FETCH       0   1403
      462 T100       PREPARE            0 SELECT WHERE "SPRSL" = :A0 AND "MSGNR" = :A1 AND "ARBGB" = :A2
        5 T100       OPEN               0 SELECT WHERE "SPRSL" = 'D' AND "MSGNR" = '010' AND "ARBGB" = 'something'
    1.793 T100       FETCH       0   1403

ThomasZloch · ‎2009 May 05

I understand the argument about the unnecessary parsing of SQL statements, but I would like to know, what effect does this have on the overall performance of a typical SAP installation? I assume much less than 1%? (comparing systems where this coding rule is being followed and those where it's not)

Thomas

HermannGahm · ‎2009 May 05

Hi Thomas,

i think it it could be neglected, since repeated executions will reuse the cursors.

Todays SQL statement caches are rather big as well. But it is not so long ago

when they were in the range of Megabytes (not Gigabytes like today).

When there is a low hitratio on the statement cache (due to a small size) and or

a high number of different sql statements it maybe leads to a little overhead

on cpu usage due to parsing. And if we have to regularly prepare the execution plan

again and again (due to low hitratio in the cache) it would affect also execution

time of the statement. This would be a rather rare case i think.

So far (last 10 years), i never came across that this "issue" played an important or

significant role... . I think it is nice to have... as long as we don't have really small statenent

caches and low hitratios on them.

Of course I'm only talking about regular OPEN SQL

which is not using HINTS like &SUBSTITUTE VALUES& in a LOOP or something

Kind regards,

Hermann

Edited by: Hermann Gahm on May 5, 2009 6:24 PM

ThomasZloch · ‎2009 May 05

Hermann, many thanks for these clarifications.

Another example where a simple question turns into an interesting discussion, only the original poster was never seen again...

Good night

Thomas

HermannGahm · ‎2009 May 06

Hi Thomas,

just another comment.... :

If i remember right... you work mostly on ORACLE.

For ORACLE 10g you can run the following SELECT

(in the ABAP Report RSORADJV for example):


SELECT
    stat_name,
    round(value/1000000) time_waited_sec
FROM
    v$sys_time_model
WHERE stat_name LIKE 'DB%' 
OR    stat_name LIKE '%parse%' 
ORDER BY time_waited_sec desc

you will get an output like this:


STAT_NAME                                        TIME_WAITED_SEC

DB time                                          10928573
DB CPU                                           5810338
parse time elapsed                               44891
hard parse elapsed time                          32916
failed parse elapsed time                        497
hard parse (sharing criteria) elapsed time       380
hard parse (bind mismatch) elapsed time          47
failed parse (out of shared memory) elapsed time 0

all times are in seconds.

Now you can see how much time is needed for parsing.

You can see as well how much CPU time (in total) is used.

This time includes parsing... and recursive sql and scanning

the buffer cache and latches and and... . And you get the total db time

this time includes DB CPU + i/o + time for enque + everything else

for what db consumes time with... .

You can see how big the influence of parsing is with respect

to DB time and DB CPU time.

If you have time... ... you can then order all your field lists and

where conditions... run the query again... and see if parsetimes

decrease (what should be the case). But I can't promise you if

you would FEEL a difference in overall DB performance...

And btw: As a rule of thumb we assume that in a well tuned system

DB CPU time is 40 % of DB time.... if it is less... we assume that

there is still improvement potential regarding SQL tuning... indexes...

Kind regards,

Hermann

and just another adition:


SELECT * 
FROM
V$SGASTAT
WHERE
name = 'sql area'

shows you how big your sql area is inside the shared pool is.

The hitratios can be seen on the main screen in ST04:



Shared Pool
...
SQL Area getratio  %          100,0
         pinratio  %          100,0
...

Edited by: Hermann Gahm on May 6, 2009 8:46 AM

ThomasZloch · ‎2009 May 06

> ABAP Report RSORADJV

very interesting, many thanks.

> If you have time... ... you can then order all your field lists and

> where conditions...

you guessed it, not enough time, let alone the administrational effort of moving code changes to production...

> And btw: As a rule of thumb we assume that in a well tuned system

> DB CPU time is 40 % of DB time.... if it is less... we assume that

> there is still improvement potential regarding SQL tuning... indexes...

36% here, I'll have a look...

Cheers

Thomas

HermannGahm · ‎2009 May 06

Hi Thomas,

don't nail down the 40 %... your 36 % ... to me sounds already like a well tuned system...

If you would have said... 10 % CPU... i would recommend sql cache and wait event analysis...

but 36 % is quite ok i would say...

Kind regards,

Hermann

Former Member · ‎2009 May 05

the order of the field in the WHERE-clause does not influence the decision which index is used.

Everybody you thinks differently should think about, how he would design a design logic,

only a simple one following the order of WHERE-clause or trying different orders.

the order of the fields in the index is essential, it is like a printed telephone book, a new order is a different book.

However, the statements are stored, to used the same processing later again, the decision which index to use can take longer than a simple SELECT SINGLE. To find the stored statement again, the WHERE condition should have the same order, otherwise this cache becomes larger and slower. That is the effect of the order of the fields in the WHERE-condition.

Former Member · ‎2009 May 05

>


> Hi Expert,
> 
> Got a few questions here.
> 
> 1. What does access predicates and filter predicates in ST05/Performance trace means? Does it influence anything when creating an index?
> 2. When you create an index, does the order of the column/field matters?
> 
> 
> Thanks in advance.

Hi Eida,

1. good statements made by others

2. Index

creation

When it comes to design an index you have to KNOW the queries that your application will fire to the table.

There is no general rule except that the index should fit with the access pattern.

I.e.

maybe your app would have a lot of queries of these types:

select * from tab where col1 = x and col2 = y

select * from tab where col2 = y

it would make sense to order the index keys that way: (COL2,COL1)

discriminating myth: "You should set the most discriminating index column in front ."

COL1: 10000000

COL2: 1000000

myth suggested order: (COL1, COL2)

If you consider the columns C1 and C2 as vectors and have to compare them togehther in a WHERE it doesn't matter which comes first.

Furthermore, techniques like index key compression imply that you put the least discriminant in front

to get a good compression factor.

bye

yk

HermannGahm · ‎2009 May 05

Hi Yukonkid,

>Furthermore, techniques like index key compression imply that you put the least discriminant in

> front to get a good compression factor.

indeed. That's another good argument why order of index fields should not be based on nr. of distinct keys only... . Thank you. I never thought of this in this combination so far

Kind regards,

Hermann

Former Member · ‎2009 May 05

yes

..use the fileds in order how it is there in the Table or use the order of the index created..

check in DB05 transaction which Index is using..

regards,

Prabhudas

Former Member · ‎2009 May 05

Thanks for all the post. I'm still digesting all the inputs.

Former Member · ‎2009 May 08

@YukonKid

> discriminating myth: "You should set the most discriminating index column in front ."

> COL1: 10000000

> COL2: 1000000

> myth suggested order: (COL1, COL2)

> If you consider the columns C1 and C2 as vectors and have to compare them togehther in

> a WHERE it doesn't matter which comes

sorry, I am not convinced! The actual execution is never an issue, it is using the complete correct data. But optimizer decides which path to use based on assumptions, and there a good index design matters!

Plus your example is simply not good, we are not BW, the numbers are much smaller!

The question is the following, not huge and nearly huge, but huge and medium

COL1: 10.000.000

COL2: 10.000 or 1.000

Siegfried

Former Member · ‎2009 May 08

> I understand the argument about the unnecessary parsing of SQL statements, but I would like to

> know, what effect does this have on the overall performance of a typical SAP installation? I assume

> much less than 1%? (comparing systems where this coding rule is being followed and those where it's > not)

the 1% is probably correct, if you count the parsing of all duplicate statements in the cursor cache and

compare them with all DB executions .... this must be much much less than 1%. But is this really interesting?

There is one argument which is sufficient for me, the statictics of the cursor cache. If statements which are in principle identical a spread over different entries, the I get no good overview on the DB performance. Maybe there is statement with problems which has 10 WHERE-conditions, then it can be distributed over ten or more entries and it will not be the top line.

Siegfried

Former Member · ‎2009 May 08

> And btw: As a rule of thumb we assume that in a well tuned system

> DB CPU time is 40 % of DB time.... if it is less... we assume that

> there is still improvement potential regarding SQL tuning... indexes...

less is better, even with percentages ...

I don't like these overall percentages too much, they are good for the tuning of a whole system.

People here look at particular applications, and there is no general rule for DB-percentages.

Check the standard benchmarks they a range start from 20%.

HermannGahm · ‎2009 May 08

Hi Siegfied,

yes you are right. We use this approach (Ratio based tuning) only for DB SYSTEM Performance Analysis. Our experience (that is typically not a benchmark system) has shown that if CPU is much less than 40% a wait event analysis and subsequent wait event tuning (could include sql tuning) often leads to better db system perormance. (On the other hand... if cpu is much more than 40 % expensive sql may be the case for that... .) So it is just a rule of thumb (which is never a hard benchmark) and a first indicator. (Unfortunatelly, I never had the chance to work for benchmarks so far)

For applications we prefer the time based (traces) over the ratio based approach as well

Kind regards,

Hermann

Former Member · ‎2009 May 08

> discriminating myth: "You should set the most discriminating index column in front ."

If this is a myth, what is correct ?????

HermannGahm · ‎2009 May 09

i think what Yubkon Kid meant is, that the order of fields in the index shold not be based

on nr. of distinct values per field ONLY.

I think he meant that there are other tings that are as important as the number of the distinct keys... how the fields are used e.g. ... and even other things that could play a role sometimes.... e.g. index compression (if that should be use), data distribution in some cases... and so on....

Former Member · ‎2009 May 10

usually it is no good idea to have objections against something without offering a better

solution. And that is here the case.

Every rule has exceptions ... but how much sense does it make to start a discussion with the exceptions?

Former Member · ‎2009 May 11

HI Guys,

Thanks for the answer. All the feedback are indeed very helpful.

I will close the thread for now or you guys let me know if you want to discuss this much more. Then I will keep the thread open.

Former Member · ‎2009 May 11

I see a lot of theoretical answers here to the issue of column order and selectivity in an index. Let's look at a real example - my favorite table - BKPF.

The primary keys to this table are BUKRS, BELNR and GJAHR. Here BELNR is obviously the most selective field here, but it is in the second position. Does this make any difference? I don't know.

Where I work, there are just over 8,000,000 entries in BKPF. Just under 8,000,000 are for one company code. About 62,000 are spread between 12 or so other company codes (some of which are no longer used). So, is BUKRS highly selective or not? I guess it depends on which company code you are interested in. Normally, it's the one that is most used.

I Haven't checked, but I assume that documents are more or less evenly split by fiscal year. More than company code at least.

What does this mean? I think that when you are looking for documents that fulfil certain conditions, you are doing just that - trying to find out the document number, given other conditions like fiscal year or company code.

So, I would say that the fields that belong at the beginning of the index are the ones that you are most likely to know, not the ones that are more or less selective.

So why is GJAHR the last fiels in this index? Beats me.

Rob

PS @YK - I am reminded of your comments to one of my blogs about a skip index in Oracle (at least I think that was what it was called). Don't know if that is relevant here.

ThomasZloch · ‎2009 May 11

My guess is that BUKRS is first because from an organizational point of view most data selections in a productive system will be within one company code, rather than cross-company.

GJAHR might be third and not second because a search for just a document number within a company is more likely than a search for multiple document numbers within one fiscal year, so that the system can quickly present a popup with all fiscal years where this particular document number has been assigned.

Or maybe GJAHR was added subsequently (in 1985 or so...) when SAP found out that it would be nice to reset number ranges per fiscal year? Who knows...

So I'm also saying, selectivity is one thing, functional requirements another.

Thomas

HermannGahm · ‎2009 May 12

Hi Rob,

>The primary keys to this table are BUKRS, BELNR and GJAHR. Here BELNR is obviously the most selective field here, but it is in the second position.

No CLIENT (MANDT) in your table and index? really?

(Is the client selective? Usually not, sometimes yes)

> I am reminded of your comments to one of my blogs about a skip index in Oracle (at least I think that was what it was called). Don't know if that is relevant here.

The skip scan was introduced in ORACLE 9 (the talbe BKPF was introduced before ). And this access feature is (currently) only supported by ORACLE. Therefore i don't think that any index field order is choosen because of the skip scan feature. And even if you built your own tables and indexes... i would try to build the indexes in a way that if there was no skip scan.

Back to the order of the fields in the index... HOW you use the fiels is (for me) as important:

See as note 825653 point 2:

2. Indexes should be set up in a way that the selection fields are

located at the beginning of the index.

From a performance point of view, it does not matter whether

selection or non-selection fields (such as MANDT) are located at the

beginning of the index. It is much more important that fields that

are not specified with "=" in WHERE conditions and that cover a

large value range are located as near to the end of the index as

possible. In addition, the index should not contain any gaps that

are not specified in the WHERE condition. See also Note 766349.

Kind regards,

Hermann

By Category

Related Content

Activity Groups

Industry Groups

Influence and Feedback Groups

Interest Groups

Location Groups

Customer Only Groups

Forums

Related Resources

Products

Learning and Support

About

My SAP Profile

My SAP Profile

ST05