Solved: Performance issue while deleting entries from inte...

Former Member · ‎2016 Mar 07

Hello,

I have two internal tables. Both internal tables have more than 50,000 rows.

The scenario is: I want to delete data in one internal table based on other internal table entries.

I have tried both below scenarios:

1. Looping one internal table and deleting entries in other internal table based on where condition

Loop at itab1.

Delete itab2 where field = itab1-field.

Endloop.

2. Converting the values in itab1 into ranges and then deleting from ranges.

Delete itab2 where field in r_range1.

But both syntaxes are taking lot of time. I understand that since there are more than 50,000 records, it could take some time.

But the user is running the report online. So, waiting for 5 minutes is becoming an issue.

Is there any better syntax than the two described above?

I have tried searching forums, but most of the thread talk about Database deletes and not internal table deletes.

davis_raja · ‎2016 Mar 07

Insert a new field in the internal table which you want to delete.

Check for your condition using a loop and read statement.

If your condition matches, make an entry or set the flag in the field and modify it inside the loop.

Then use DELETE statement outside the LOOP.

DELETE itab WHERE flag EQ 'X'.

davis_raja · ‎2016 Mar 07

Insert a new field in the internal table which you want to delete.

Check for your condition using a loop and read statement.

If your condition matches, make an entry or set the flag in the field and modify it inside the loop.

Then use DELETE statement outside the LOOP.

DELETE itab WHERE flag EQ 'X'.

SimoneMilesi · ‎2016 Mar 07

And you'll get almost the same result.


1. Looping one internal table and deleting entries in other internal table based on where condition

Loop at itab1.
Delete itab2 where field = itab1-field.
Endloop.

1- ITAB1-FIELD is unique or can exist multiple times?

2- ITAB2 is sorted by the deletion field?

SORT ITAB1 by FIELD

SORT ITAB2 by field

LOOP at ITAB1

at end of FIELD

delete itab2 where fields = itab1-field

endloop.


2. Converting the values in itab1 into ranges and then deleting from ranges.

Delete itab2 where field in r_range1.

same as before: check R_RANGE1 is contains not duplicates and sort ITAB2.

Another option more "nice&clean" is to create a secondary index on ITAB2 and use it in deleteion

Former Member · ‎2016 Mar 07

Thanks of answering.

Yes the fields are sorted and are unique.

I had tried using secondary index, but I am getting below syntax error regarding declaration.

Can you please help me with syntax error? I am not able to declare a key that is not sorted.

SimoneMilesi · ‎2016 Mar 07

Try "WITH NON-UNIQUE KEY key COMPONENTS vbeln"

matt · ‎2016 Mar 07

I've written a blog about the use secondary keys. Maybe if you search for it, it'll be helpful.

Former Member · ‎2016 Mar 07

I ended up declaring it as a hashed table with unique primary key.

But there is barely any improvement in performance.

matt · ‎2016 Mar 07

What does sorange contain? That's kind of important.

I'd have written it something like this:

data lt_rcsos type hashed table of Solist with unique key vbeln.

insert lines of rcsos into table lt_rcsos.

delete lt_rcsos where vbeln in sorange.

Former Member · ‎2016 Mar 08

It is a range with VBELN as low and high.

matt · ‎2016 Mar 08

Yeah - I know what a range or a select-option is. What values does it contain?

former_member210008 · ‎2016 Mar 08

After you changed table contents, key must be updated. On large tables it takes some time. So that is why you dont have improvement IMO.

You can try two ways:

1) add new field flag to use it for filter on output, i.e. flag = 'X' - show it, flag = ' ' - dont show. Pros: you dont need to delete records - save some time.

2) try to use filter or binary search (depends on your abap version).

sort itab2 by field.

loop at itab1 assigning <itab1>.

clear sy-subrc.

while sy-subrc = 0. "if your field is not unique

read table itab2 with key field = <itab1>-field binary search assigning <itab2>.

if sy-subrc = 0.

delete itab2 index sy-tabix.

"OR

"copy row <itab2> to new table itab3. Test both variants.

endif.

endwhile.

endloop.

Former Member · ‎2016 Mar 10

Read within a loop will take more time I guess.

SuhaSaha · ‎2016 Mar 10


Chinmay Kulkarni wrote:

I ended up declaring it as a hashed table with unique primary key.
But there is barely any improvement in performance.

This is because your WHERE condition contains IN operator. Optimization takes places for '=' operator only. Read Optimization of the WHERE Condition - ABAP Keyword Documentation.

ThomasKruegl · ‎2016 Mar 08

If you are on 7.40, SP08 or higher you can try Filter:

I don't know about its performance though.

regards,

Thomas

Former Member · ‎2016 Mar 08

It's a good hint to use the FILTER operator, if possible due to your ABAP release. I used already the FILTER operator to create intersections of tables. It's short and easy. In addition you should use key definitions for your internal table; maybe secondary keys. Using keys should speed up your runtime performance.

Regards

Armin

Former Member · ‎2016 Mar 10

Sorry, we are on lower release.

Former Member · ‎2016 Mar 08

Declare itab2 as a sorted table with the appropriate key.

loop at itab1

read table itab2 with your key.

if the return code is 0

delete itab2 index sy-tabix

If there are multiple records in itab2 with the same key, you will have to add extra logic to handle that.

Rob

Former Member · ‎2016 Mar 10

As mentioned to Evgeniy, I think it might worsen the performance as it is read statement within loop.

Former Member · ‎2016 Mar 09

Another idea: Is it possible to prevent the filling of the second table, where you want to delete lines by now?

Former Member · ‎2016 Mar 10

Great suggestion, but I had already done that. Reduced time by 50% by reducing the entries in second table. (Keeping only relevant entries).

I was looking for further improvement.

RicardoRomero_1 · ‎2016 Mar 09

Hi,

My grain of sand; I don't know if it'll improve the performance, try and tell us;


SORT itab2 BY field. "Or declare it as SORTED 

  LOOP AT itab1.
   
    READ TABLE itab2
     WITH KEY field = itab1-field
     BINARY SEARCH.


    IF sy-subrc EQ 0.


      LOOP AT itab2
        FROM sy-tabix
        WHERE field = itab1-field.
       
        DELETE itab2 INDEX sy-tabix.
       
      ENDLOOP.


    ENDIF.

ENDLOOP.

Former Member · ‎2016 Mar 10

As mentioned to Rob and Evgeniy, I think it will worsen the performance as it is a nested loop.

Former Member · ‎2016 Mar 10

Hi,

The test in this ABAP Keyword Documentation maybe useful for you, take a look.

Regards

Bill

Former Member · ‎2016 Mar 10

This message was moderated.

Former Member · ‎2016 Mar 10

Thank you everyone for your responses.

I have gone with hashed table with unique key which still takes a bit of time, but user has agreed to wait.

There are some great suggestions with syntax filter, but we are on lower release, hence, could not try them out.

Former Member · ‎2016 Mar 10


Chinmay Kulkarni wrote:


I have gone with hashed table with unique key which still takes a bit of time, but user has agreed to wait

Then you've missed the point.

Rob

Former Member · ‎2016 Mar 10

Bad answers marked as "correct" really bother me. Try:

REPORT zscntest.

DATA: bkpf_int TYPE TABLE OF bkpf,

bseg_int TYPE SORTED TABLE OF bseg WITH UNIQUE KEY

bukrs belnr gjahr buzei.

FIELD-SYMBOLS: <bkpf> TYPE bkpf,

<bseg> TYPE bseg.

DATA: start TYPE i,

end TYPE i,

dif TYPE i,

no_bkpf TYPE i,

no_bseg TYPE i.

SELECT * FROM bkpf INTO TABLE bkpf_int UP TO 50000 ROWS.

SELECT * FROM bseg INTO TABLE bseg_int

FOR ALL ENTRIES IN bkpf_int

WHERE bukrs = bkpf_int-bukrs

AND belnr = bkpf_int-belnr

AND gjahr = bkpf_int-gjahr.

DESCRIBE TABLE bkpf_int LINES no_bkpf.

DESCRIBE TABLE bseg_int LINES no_bseg.

WRITE: /001 'Initial number of BKPF_INT records:', no_bkpf,

/001 'Initial number of Bseg_INT records:', no_bseg.

GET RUN TIME FIELD start.

LOOP AT bkpf_int ASSIGNING <bkpf>.

PERFORM delete_itab.

ENDLOOP.

DESCRIBE TABLE bkpf_int LINES no_bkpf.

DESCRIBE TABLE bseg_int LINES no_bseg.

WRITE: /001 'Final number of BKPF_INT records :', no_bkpf,

/001 'Final number of BSEG_INT records :', no_bseg.

GET RUN TIME FIELD end.

dif = end - start.

WRITE: /001 'Time for deletion :', dif, 'microseconds'.

And you get:

Initial number of BKPF_INT records: 50,000

Initial number of Bseg_INT records: 139,250

Final number of BKPF_INT records : 50,000

Final number of BSEG_INT records : 0

Time for deletion : 243,595 microseconds

(Less than one quarter of a second)

The form DELETE_ITAB has exactly the logic I described in my first answer.

Rob

Message was edited by: Rob Burbank

Former Member · ‎2016 Mar 10

My 2 cents to Ricardo's answer:

At the inner loop, I would not do a WHERE. Just loop from SY-TABIX and delete while the key matches, else EXIT. It is my understanding that the WHERE still checks the subsequent entries in case one may match (it does not necessarily use a sorted table). Since you sorted explicitly, you know you can exit upon the first mismatch and skip that.

Clemenss · ‎2016 Mar 10

Hi,

althouh we already got a fast and efficient solution using hashed and/or sorted tables: Try to use every LOOP AT ITAB ASSIGNING <field-symbol>.

This will speed up your programs in the first place.

Never (means never!) use LOOP ... INTO.

Regards,

Clemens

Former Member · ‎2016 Mar 11

Sorry, but I have to disagree about "Never user LOOP...INTO". It depends on the line type of your internal table i.e. the length of one line. If you have e.g. a list of references (one field), it's better to use INTO.

But for the topic of this discussion, I think it's a good hint to avoid wasting time by copying data all the time.

Clemenss · ‎2016 Mar 11

It is not worse, why is it better? I mean if you got used to use field-symbols with all your loops (with ABAP 740 even easier when declared on the fly), you will never ever again face performance issues based on unncecessary copies and never ever again forget or skip any required MODIFY statement.

Have fun!

Clemens

Former Member · ‎2016 Mar 11

Sometimes it's faster to copy than assigning, but only for small datatypes. So, it's a case by case decision using INTO, ASSIGNING or maybe also REFERENCE INTO. If you make changes to the content of the table inside the LOOP, of course using INTO and then MODIFY is worse.

By the way, myself use often ASSIGNING, because my used datatypes most of the time are larger than some Bytes.

So, what I want to say: It's not worse to use INTO in general.

By Category

Related Content

Activity Groups

Industry Groups

Influence and Feedback Groups

Interest Groups

Location Groups

Customer Only Groups

Forums

Related Resources

Products

Learning and Support

About

My Account

My Account

Performance issue while deleting entries from internal table