2016 Sep 01 8:24 AM
Hello Friends,
My requirement is to compress a huge volume of data, grouped by a field called GUID.
I have an internal table with a huge volume of data (more than 20 million / 2 crore records). Sorting this internal table dumps with "not enough memory", and trying to assign all the records to a sorted table dumps with "not enough memory" as well.
So my issue is that I cannot sort the internal table by GUID, but I still need to collect all records with the same GUID and compress them. When I divide the data and process it in pieces there is no dump, but the program runs in background for more than 2 days. My code is below. Please suggest how I can optimise it.
DATA: lt_datval    TYPE STANDARD TABLE OF ts_datval,
      ls_datval    TYPE ts_datval,
      lt_compr     TYPE STANDARD TABLE OF ts_compr,
      ls_data_cmpr TYPE LINE OF ycocos_data_cmpr_t,
      lt_data_t    TYPE ycocos_data_t,
      lt_guid      TYPE /sdf/guid_22_tt,
      lv_guid      TYPE guid_22,
      lv_lines     TYPE int4,
      lv_flag      TYPE boolean,
      lv_index1    TYPE int4,
      lv_index2    TYPE int4.

CONSTANTS: lk_batch_size TYPE int4 VALUE 50000.

FIELD-SYMBOLS: <ls_data>  TYPE LINE OF ycocos_data_t,
               <ts_compr> TYPE ts_compr.

* Build the list of distinct GUIDs in batches of LK_BATCH_SIZE rows
lv_index1 = 1.
lv_index2 = lv_index1 + lk_batch_size - 1.
APPEND LINES OF it_data FROM lv_index1 TO lv_index2 TO lt_data_t.

WHILE lt_data_t IS NOT INITIAL.
  SORT lt_data_t BY guid.
  DELETE ADJACENT DUPLICATES FROM lt_data_t COMPARING guid.
  LOOP AT lt_data_t ASSIGNING <ls_data>.
    READ TABLE lt_guid TRANSPORTING NO FIELDS
         WITH KEY table_line = <ls_data>-guid.
    IF sy-subrc <> 0.
      lv_guid = <ls_data>-guid.
      APPEND lv_guid TO lt_guid.
      CLEAR lv_guid.
    ENDIF.
  ENDLOOP.
  REFRESH lt_data_t.
  lv_index1 = lv_index2 + 1.             " next batch starts after this one
  lv_index2 = lv_index1 + lk_batch_size - 1.
  APPEND LINES OF it_data FROM lv_index1 TO lv_index2 TO lt_data_t.
ENDWHILE.

SORT lt_guid BY table_line.

LOOP AT lt_guid INTO lv_guid.
*-----collect all data from it_data for this GUID and compress
  LOOP AT it_data ASSIGNING <ls_data> WHERE guid = lv_guid.
    IF lv_flag = abap_false.
      CLEAR: ls_data_cmpr, lt_compr, lt_datval.
      MOVE-CORRESPONDING <ls_data> TO ls_data_cmpr.
      lv_flag = abap_true.
    ENDIF.
    MOVE-CORRESPONDING <ls_data> TO ls_datval.
    INSERT ls_datval INTO TABLE lt_datval.
  ENDLOOP.

*---compress data
  CALL FUNCTION 'TABLE_COMPRESS'
    IMPORTING
      compressed_size = ls_data_cmpr-data_size
    TABLES
      in              = lt_datval
      out             = lt_compr.

  LOOP AT lt_compr ASSIGNING <ts_compr>.
    ls_data_cmpr-linenum = sy-tabix.
    ls_data_cmpr-rawval  = <ts_compr>-rawval.
    INSERT ls_data_cmpr INTO TABLE et_data_cmpr.
  ENDLOOP.
  CLEAR lv_flag.
ENDLOOP.
2016 Sep 01 9:07 AM
Hi,
Please can you elaborate on where you are going wrong?
Regards,
Vinay
2016 Sep 01 9:49 AM
Hello Vinay,
The issue is that, with this huge volume of data, the code above takes almost 2 days to execute in background, which is not acceptable. I cannot use the parallel cursor technique because I am not able to sort the internal table, as mentioned above. All of this code is inside a function module.
How can I optimize this code so that the data is processed within some hours and not in days?
2016 Sep 01 10:16 AM
Hi,
Why don't you use a sorted internal table, like below? Then we don't need to sort the internal table explicitly; it is kept sorted automatically.
DATA: lt_datval TYPE SORTED TABLE OF ts_datval WITH NON-UNIQUE KEY guid,
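In outline, that suggestion could look like this (a sketch only, assuming TS_DATVAL has a GUID component):

```abap
* Sketch - assumes TS_DATVAL has a GUID component. A sorted table
* maintains its key order on every INSERT, so no explicit SORT is
* needed, and LOOP AT ... WHERE guid = ... can use the key.
DATA: lt_sorted TYPE SORTED TABLE OF ts_datval
                WITH NON-UNIQUE KEY guid.

FIELD-SYMBOLS: <ls_row> TYPE ts_datval.

INSERT ls_datval INTO TABLE lt_sorted.  " lands at its key position

LOOP AT lt_sorted ASSIGNING <ls_row> WHERE guid = lv_guid.
  " all rows of one GUID are adjacent here
ENDLOOP.
```

Note that a sorted table of this size still has to fit into memory, which is the constraint reported above.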
2016 Sep 01 10:50 AM
I have already mentioned that a sorted table gives a dump during the assignment. Since the data comes from another server, I cannot use ORDER BY/GROUP BY in the SELECT statement; whatever I can do must be done with these records only.
2016 Sep 01 10:52 AM
Can I call this function in an update task or background task? There is no update involved, though. So will this work?
2016 Sep 01 11:11 AM
Hi
I don't know your program, but you could consider working with packets of data in order not to use all the memory. I mean you can load a certain number of records, process them, and then load the next packet.
Otherwise you can try to work with parallel processes.
Max
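For reference, the packet idea could be outlined like this (a sketch; the processing step is a placeholder, and it reuses IT_DATA, LT_DATA_T and LK_BATCH_SIZE from the code above):

```abap
* Sketch of packet processing: keep only one slice of IT_DATA in
* the working table at a time and release it after each pass.
DATA: lv_from TYPE i VALUE 1,
      lv_to   TYPE i.

DO.
  lv_to = lv_from + lk_batch_size - 1.
  APPEND LINES OF it_data FROM lv_from TO lv_to TO lt_data_t.
  IF lt_data_t IS INITIAL.
    EXIT.                 " no rows left in IT_DATA
  ENDIF.
  " ... process this packet and write the results away ...
  FREE lt_data_t.         " release the memory immediately
  lv_from = lv_to + 1.    " next packet starts after this one
ENDDO.
```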
2016 Sep 01 12:13 PM
Hi
Check the help for CALL FUNCTION ... STARTING NEW TASK.
You can create an RFC-enabled function module containing the part of your program with the bad performance. That way this part of the program can be called several times in parallel.
Of course, that means you need to rearrange your program in order to move that part into an RFC.
The number of parallel calls depends on how many work processes are available on your server.
Max
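A hedged sketch of that approach ('Z_COMPRESS_SLICE' and the form RECEIVE_RESULT are hypothetical names, not anything from the thread):

```abap
* Each STARTING NEW TASK call runs the RFC-enabled function module
* in its own work process; results are fetched in the callback form
* via RECEIVE RESULTS FROM FUNCTION.
DATA: lv_task  TYPE char8,
      lv_tasks TYPE i.

lv_tasks = lv_tasks + 1.
lv_task  = |TASK{ lv_tasks }|.          " unique task name per call

CALL FUNCTION 'Z_COMPRESS_SLICE'
  STARTING NEW TASK lv_task
  DESTINATION IN GROUP DEFAULT
  PERFORMING receive_result ON END OF TASK
  EXPORTING
    it_slice              = lt_data_t
  EXCEPTIONS
    system_failure        = 1
    communication_failure = 2
    resource_failure      = 3.          " no free work process right now
IF sy-subrc = 3.
  WAIT UP TO 1 SECONDS.                 " back off, then retry the slice
ENDIF.
```

In practice the caller also needs to track how many tasks are still open and WAIT for them to finish before collecting the combined result.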
2016 Sep 01 12:30 PM
"It's possible to create an RFC where you should place the part of your program with bad performance."
This made me laugh.
2016 Sep 01 3:06 PM
By the way, did you try declaring your itab sorted, as balu p proposed previously?
Or declaring/using a secondary key?
If that doesn't work, another solution is to extend your current GUID internal table with an additional field holding a reference to the corresponding row of your huge internal table (or a table of references to the corresponding rows). That way you don't need to loop over the huge itab; you only loop over the referenced line(s).
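The row-reference idea could be sketched like this (types and names are assumptions; it stores row numbers of IT_DATA per GUID, built in a single pass, so the per-GUID inner loop reads rows by index instead of scanning the whole table each time):

```abap
TYPES: BEGIN OF ts_guid_rows,
         guid TYPE guid_22,
         rows TYPE STANDARD TABLE OF i WITH DEFAULT KEY,
       END OF ts_guid_rows.

DATA: lt_guid_rows TYPE HASHED TABLE OF ts_guid_rows
                   WITH UNIQUE KEY guid,
      ls_guid_rows TYPE ts_guid_rows,
      lv_tabix     TYPE sy-tabix,
      lv_row       TYPE i.

FIELD-SYMBOLS: <ls_entry> TYPE ts_guid_rows,
               <ls_data>  TYPE LINE OF ycocos_data_t.

* One pass over IT_DATA records where every GUID occurs.
LOOP AT it_data ASSIGNING <ls_data>.
  lv_tabix = sy-tabix.                  " READ TABLE overwrites SY-TABIX
  READ TABLE lt_guid_rows ASSIGNING <ls_entry>
       WITH TABLE KEY guid = <ls_data>-guid.
  IF sy-subrc <> 0.
    CLEAR ls_guid_rows.
    ls_guid_rows-guid = <ls_data>-guid.
    INSERT ls_guid_rows INTO TABLE lt_guid_rows ASSIGNING <ls_entry>.
  ENDIF.
  APPEND lv_tabix TO <ls_entry>-rows.
ENDLOOP.

* Per GUID: fetch exactly the referenced rows by index.
LOOP AT lt_guid_rows ASSIGNING <ls_entry>.
  LOOP AT <ls_entry>-rows INTO lv_row.
    READ TABLE it_data ASSIGNING <ls_data> INDEX lv_row.
    " ... MOVE-CORRESPONDING into LT_DATVAL, then TABLE_COMPRESS ...
  ENDLOOP.
ENDLOOP.
```

This replaces the O(n) scan of IT_DATA per GUID with one O(n) pass overall plus direct index reads.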
2016 Sep 01 3:20 PM
Hi,
Can you share the functional requirement? Is this in a CRM system?
Why not change the logic to use a join on the tables with the GUID, and then use the PACKAGE SIZE addition?
Kind regards, Rob Dielemans
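A sketch of the PACKAGE SIZE idea (ZTAB_DATA and its fields are placeholders, since the real tables are not named in the thread):

```abap
* PACKAGE SIZE refills the internal table 50 000 rows at a time,
* so the full data set is never held in memory at once.
DATA: lt_package TYPE STANDARD TABLE OF ts_datval.

SELECT guid datval
  FROM ztab_data
  INTO CORRESPONDING FIELDS OF TABLE lt_package
  PACKAGE SIZE 50000
  ORDER BY guid.
  " process and compress this package here
ENDSELECT.
```

One caveat: even with ORDER BY, a GUID group can straddle a package boundary, so the processing inside the loop would need to carry the last (possibly incomplete) group over into the next package.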