‎2016 Mar 07 11:16 AM
Hello,
I have two internal tables. Both internal tables have more than 50,000 rows.
The scenario is: I want to delete data in one internal table based on other internal table entries.
I have tried both below scenarios:
1. Looping one internal table and deleting entries in other internal table based on where condition
Loop at itab1.
Delete itab2 where field = itab1-field.
Endloop.
2. Converting the values in itab1 into ranges and then deleting from ranges.
Delete itab2 where field in r_range1.
But both syntaxes are taking lot of time. I understand that since there are more than 50,000 records, it could take some time.
But the user is running the report online. So, waiting for 5 minutes is becoming an issue.
Is there any better syntax than the two described above?
I have tried searching forums, but most of the thread talk about Database deletes and not internal table deletes.
‎2016 Mar 07 11:19 AM
Insert a new field in the internal table which you want to delete.
Check for your condition using a loop and read statement.
If your condition matches, make an entry or set the flag in the field and modify it inside the loop.
Then use DELETE statement outside the LOOP.
DELETE itab WHERE flag EQ 'X'.
‎2016 Mar 07 11:19 AM
Insert a new field in the internal table which you want to delete.
Check for your condition using a loop and read statement.
If your condition matches, make an entry or set the flag in the field and modify it inside the loop.
Then use DELETE statement outside the LOOP.
DELETE itab WHERE flag EQ 'X'.
‎2016 Mar 07 11:26 AM
And you'll get almost the same result.
1. Looping one internal table and deleting entries in other internal table based on where condition
Loop at itab1.
Delete itab2 where field = itab1-field.
Endloop.
1- ITAB1-FIELD is unique or can exist multiple times?
2- ITAB2 is sorted by the deletion field?
SORT ITAB1 by FIELD
SORT ITAB2 by field
LOOP at ITAB1
at end of FIELD
delete itab2 where fields = itab1-field
endloop.
2. Converting the values in itab1 into ranges and then deleting from ranges.
Delete itab2 where field in r_range1.
same as before: check R_RANGE1 is contains not duplicates and sort ITAB2.
Another option more "nice&clean" is to create a secondary index on ITAB2 and use it in deleteion
‎2016 Mar 07 12:39 PM
Thanks of answering.
Yes the fields are sorted and are unique.
I had tried using secondary index, but I am getting below syntax error regarding declaration.
Can you please help me with syntax error? I am not able to declare a key that is not sorted.
‎2016 Mar 07 12:47 PM
‎2016 Mar 07 1:12 PM
I've written a blog about the use secondary keys. Maybe if you search for it, it'll be helpful.
‎2016 Mar 07 2:47 PM
I ended up declaring it as a hashed table with unique primary key.
But there is barely any improvement in performance.
‎2016 Mar 07 8:45 PM
What does sorange contain? That's kind of important.
I'd have written it something like this:
data lt_rcsos type hashed table of Solist with unique key vbeln.
insert lines of rcsos into table lt_rcsos.
delete lt_rcsos where vbeln in sorange.
‎2016 Mar 08 12:07 PM
‎2016 Mar 08 12:55 PM
Yeah - I know what a range or a select-option is. What values does it contain?
‎2016 Mar 08 3:50 PM
After you changed table contents, key must be updated. On large tables it takes some time. So that is why you dont have improvement IMO.
You can try two ways:
1) add new field flag to use it for filter on output, i.e. flag = 'X' - show it, flag = ' ' - dont show. Pros: you dont need to delete records - save some time.
2) try to use filter or binary search (depends on your abap version).
sort itab2 by field.
loop at itab1 assigning <itab1>.
clear sy-subrc.
while sy-subrc = 0. "if your field is not unique
read table itab2 with key field = <itab1>-field binary search assigning <itab2>.
if sy-subrc = 0.
delete itab2 index sy-tabix.
"OR
"copy row <itab2> to new table itab3. Test both variants.
endif.
endwhile.
endloop.
‎2016 Mar 10 1:22 PM
‎2016 Mar 10 7:41 PM
Chinmay Kulkarni wrote:
I ended up declaring it as a hashed table with unique primary key.
But there is barely any improvement in performance.
This is because your WHERE condition contains IN operator. Optimization takes places for '=' operator only. Read Optimization of the WHERE Condition - ABAP Keyword Documentation.
‎2016 Mar 08 2:58 PM
‎2016 Mar 08 3:27 PM
It's a good hint to use the FILTER operator, if possible due to your ABAP release. I used already the FILTER operator to create intersections of tables. It's short and easy. In addition you should use key definitions for your internal table; maybe secondary keys. Using keys should speed up your runtime performance.
Regards
Armin
‎2016 Mar 10 1:23 PM
‎2016 Mar 08 7:52 PM
Declare itab2 as a sorted table with the appropriate key.
loop at itab1
read table itab2 with your key.
if the return code is 0
delete itab2 index sy-tabix
If there are multiple records in itab2 with the same key, you will have to add extra logic to handle that.
Rob
‎2016 Mar 10 1:25 PM
As mentioned to Evgeniy, I think it might worsen the performance as it is read statement within loop.
‎2016 Mar 09 8:20 AM
Another idea: Is it possible to prevent the filling of the second table, where you want to delete lines by now?
‎2016 Mar 10 1:27 PM
Great suggestion, but I had already done that. Reduced time by 50% by reducing the entries in second table. (Keeping only relevant entries).
I was looking for further improvement.
‎2016 Mar 09 9:04 AM
Hi,
My grain of sand; I don't know if it'll improve the performance, try and tell us;
SORT itab2 BY field. "Or declare it as SORTED
LOOP AT itab1.
READ TABLE itab2
WITH KEY field = itab1-field
BINARY SEARCH.
IF sy-subrc EQ 0.
LOOP AT itab2
FROM sy-tabix
WHERE field = itab1-field.
DELETE itab2 INDEX sy-tabix.
ENDLOOP.
ENDIF.
ENDLOOP.
‎2016 Mar 10 1:29 PM
As mentioned to Rob and Evgeniy, I think it will worsen the performance as it is a nested loop.
‎2016 Mar 10 9:46 AM
‎2016 Mar 10 10:19 AM
‎2016 Mar 10 1:31 PM
Thank you everyone for your responses.
I have gone with hashed table with unique key which still takes a bit of time, but user has agreed to wait.
There are some great suggestions with syntax filter, but we are on lower release, hence, could not try them out.
‎2016 Mar 10 4:26 PM
Chinmay Kulkarni wrote:
I have gone with hashed table with unique key which still takes a bit of time, but user has agreed to wait
Then you've missed the point.
Rob
‎2016 Mar 10 5:53 PM
Bad answers marked as "correct" really bother me. Try:
REPORT zscntest.
DATA: bkpf_int TYPE TABLE OF bkpf,
bseg_int TYPE SORTED TABLE OF bseg WITH UNIQUE KEY
bukrs belnr gjahr buzei.
FIELD-SYMBOLS: <bkpf> TYPE bkpf,
<bseg> TYPE bseg.
DATA: start TYPE i,
end TYPE i,
dif TYPE i,
no_bkpf TYPE i,
no_bseg TYPE i.
SELECT * FROM bkpf INTO TABLE bkpf_int UP TO 50000 ROWS.
SELECT * FROM bseg INTO TABLE bseg_int
FOR ALL ENTRIES IN bkpf_int
WHERE bukrs = bkpf_int-bukrs
AND belnr = bkpf_int-belnr
AND gjahr = bkpf_int-gjahr.
DESCRIBE TABLE bkpf_int LINES no_bkpf.
DESCRIBE TABLE bseg_int LINES no_bseg.
WRITE: /001 'Initial number of BKPF_INT records:', no_bkpf,
/001 'Initial number of Bseg_INT records:', no_bseg.
GET RUN TIME FIELD start.
LOOP AT bkpf_int ASSIGNING <bkpf>.
PERFORM delete_itab.
ENDLOOP.
DESCRIBE TABLE bkpf_int LINES no_bkpf.
DESCRIBE TABLE bseg_int LINES no_bseg.
WRITE: /001 'Final number of BKPF_INT records :', no_bkpf,
/001 'Final number of BSEG_INT records :', no_bseg.
GET RUN TIME FIELD end.
dif = end - start.
WRITE: /001 'Time for deletion :', dif, 'microseconds'.
And you get:
Initial number of BKPF_INT records: 50,000
Initial number of Bseg_INT records: 139,250
Final number of BKPF_INT records : 50,000
Final number of BSEG_INT records : 0
Time for deletion : 243,595 microseconds
(Less than one quarter of a second)
The form DELETE_ITAB has exactly the logic I described in my first answer.
Rob
Message was edited by: Rob Burbank
‎2016 Mar 10 7:12 PM
My 2 cents to Ricardo's answer:
At the inner loop, I would not do a WHERE. Just loop from SY-TABIX and delete while the key matches, else EXIT. It is my understanding that the WHERE still checks the subsequent entries in case one may match (it does not necessarily use a sorted table). Since you sorted explicitly, you know you can exit upon the first mismatch and skip that.
‎2016 Mar 10 9:43 PM
Hi,
althouh we already got a fast and efficient solution using hashed and/or sorted tables: Try to use every LOOP AT ITAB ASSIGNING <field-symbol>.
This will speed up your programs in the first place.
Never (means never!) use LOOP ... INTO.
Regards,
Clemens
‎2016 Mar 11 7:50 AM
Sorry, but I have to disagree about "Never user LOOP...INTO". It depends on the line type of your internal table i.e. the length of one line. If you have e.g. a list of references (one field), it's better to use INTO.
But for the topic of this discussion, I think it's a good hint to avoid wasting time by copying data all the time.
‎2016 Mar 11 8:35 AM
It is not worse, why is it better? I mean if you got used to use field-symbols with all your loops (with ABAP 740 even easier when declared on the fly), you will never ever again face performance issues based on unncecessary copies and never ever again forget or skip any required MODIFY statement.
Have fun!
Clemens
‎2016 Mar 11 10:45 AM
Sometimes it's faster to copy than assigning, but only for small datatypes. So, it's a case by case decision using INTO, ASSIGNING or maybe also REFERENCE INTO. If you make changes to the content of the table inside the LOOP, of course using INTO and then MODIFY is worse.
By the way, myself use often ASSIGNING, because my used datatypes most of the time are larger than some Bytes.
So, what I want to say: It's not worse to use INTO in general.