2010 May 12 10:20 AM
Hi SDNites,
I have a requirement where I have a huge amount of data in my internal table, and we want that data to be processed in parallel. Can you please let me know the different ways in which I can meet my objective?
Please provide details with some examples.
Regards,
Abhi
2010 May 12 10:56 AM
Hi,
Hope this link will help you: http://wiki.sdn.sap.com/wiki/display/Snippets/CopyofABAPCodeforParallelCursor-Loop+Processing
Regards,
Pravin
2010 May 12 11:00 AM
Thanks for replying Pravin.
My question is related to parallel processing and not parallel cursor.
Regards,
Abhi
2010 May 12 11:20 AM
[CALL FUNCTION .. STARTING NEW TASK|http://help.sap.com/abapdocu_70/en/ABAPCALL_FUNCTION_STARTING.htm] is probably what you're looking for. This basically allows you to do parallel processing, where you can still return the results of your processing back to your main driver program ([calling/performing .. on end of task|http://help.sap.com/abapdocu_70/en/ABAPCALL_FUNCTION_STARTING.htm#!ABAP_ADDITION_2@2@]). If you don't need that, you could use any of the [RFC call types|http://help.sap.com/abapdocu_70/en/ABAPCALL_FUNCTION_DESTINATION-.htm] other than (obviously) synchronous RFC.
With the asynchronous RFC you have good features for limiting/configuring resource consumption ([destination in group|http://help.sap.com/abapdocu_70/en/ABAPCALL_FUNCTION_STARTING.htm#!ABAP_ADDITION_1@1@]).
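To make that concrete, here's a minimal sketch of the aRFC fan-out/join pattern. All names in it are made up for the example: Z_PROCESS_CHUNK stands for your own remote-enabled function module, PARALLEL_GRP for an RFC server group configured in RZ12, and ty_chunk/ty_result for your data types.

```abap
* Sketch only: Z_PROCESS_CHUNK, PARALLEL_GRP, ty_chunk and ty_result
* are hypothetical names - substitute your own RFC-enabled FM and types.
DATA: gt_chunks      TYPE STANDARD TABLE OF ty_chunk,
      gt_all_results TYPE STANDARD TABLE OF ty_result,
      gv_open_tasks  TYPE i.

FORM dispatch_chunks.
  DATA: ls_chunk    TYPE ty_chunk,
        lv_index(4) TYPE n,
        lv_task     TYPE char8.

  LOOP AT gt_chunks INTO ls_chunk.
    lv_index = sy-tabix.
    CONCATENATE 'TASK' lv_index INTO lv_task.
    CALL FUNCTION 'Z_PROCESS_CHUNK'
      STARTING NEW TASK lv_task
      DESTINATION IN GROUP 'PARALLEL_GRP'  " RFC server group caps resources
      PERFORMING handle_result ON END OF TASK
      EXPORTING
        is_chunk = ls_chunk
      EXCEPTIONS
        system_failure        = 1
        communication_failure = 2
        resource_failure      = 3.         " no free work process in group
    IF sy-subrc = 0.
      gv_open_tasks = gv_open_tasks + 1.
    ENDIF.
  ENDLOOP.

* Join: block until every callback has collected its result
  WAIT UNTIL gv_open_tasks = 0.
ENDFORM.

* Callback, executed in the calling session (during WAIT) per finished task
FORM handle_result USING p_task TYPE clike.
  DATA lt_result TYPE STANDARD TABLE OF ty_result.
  RECEIVE RESULTS FROM FUNCTION 'Z_PROCESS_CHUNK'
    TABLES et_result = lt_result.
  APPEND LINES OF lt_result TO gt_all_results.
  gv_open_tasks = gv_open_tasks - 1.
ENDFORM.
```

Note that a production version would also need to handle RESOURCE_FAILURE by retrying the chunk once a work process frees up.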
The only other option I'm aware of would be to have a driver program that creates multiple background jobs (a technique that you can also see in some of the standard SAP applications).
Anyhow, if your individual parallel processing units require some general post-processing routine that works on the combined results, you'd need some trigger (e.g. a [semaphore|http://en.wikipedia.org/wiki/Semaphore_%28programming%29]) for that.
Another interesting challenge for parallel processing is to handle error recovery, i.e. to ensure that you don't have to redo all processing if one part fails. Depending on your process/requirements that probably ranges from trivial to awkward.
Cheers, harald
2010 May 12 2:23 PM
Thanks for the in-depth description.
Also, can you please tell me whether the parallel processing requirement can be met using the function modules JOB_OPEN / JOB_CLOSE? If yes, how can that be done?
Regards,
Abhi
2010 May 12 8:00 PM
> Also, can you please tell me whether the parallel processing requirement can be met using the function modules JOB_OPEN / JOB_CLOSE? If yes, how can that be done?
Yes, it can be done this way; that's what I was referring to when I mentioned multiple background jobs. If you're not familiar with how to do it, check out the ABAP online help on the [SUBMIT .. VIA JOB|http://help.sap.com/abapdocu_70/en/ABAPSUBMIT_VIA_JOB.htm] statement, which also contains a short example. A good, more in-depth description of your options can be found in the online help [Programming with the Background Processing System|http://help.sap.com/saphelp_nw70ehp2/helpdata/en/fa/096c53543b11d1898e0000e8322d00/frameset.htm]. There you can also find example programs for both the RFC and the background job approach: [Implementing parallel processing|http://help.sap.com/saphelp_nw70ehp2/helpdata/en/fa/096e92543b11d1898e0000e8322d00/frameset.htm].
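As a rough sketch of spawning one such job: JOB_OPEN reserves a job name/number, SUBMIT .. VIA JOB attaches a report step, and JOB_CLOSE releases the job. The report name ZPROCESS_CHUNK and the variant CHUNK_01 below are made up for the example; in practice you'd loop over your chunks, creating one job (or one step) per chunk.

```abap
* Sketch only: ZPROCESS_CHUNK and variant CHUNK_01 are hypothetical.
DATA: lv_jobname  TYPE tbtcjob-jobname VALUE 'ZPARALLEL',
      lv_jobcount TYPE tbtcjob-jobcount.

CALL FUNCTION 'JOB_OPEN'
  EXPORTING
    jobname  = lv_jobname
  IMPORTING
    jobcount = lv_jobcount
  EXCEPTIONS
    OTHERS   = 1.
CHECK sy-subrc = 0.

* The submitted report picks up its data slice, e.g. via a variant
SUBMIT zprocess_chunk
  USING SELECTION-SET 'CHUNK_01'
  VIA JOB lv_jobname NUMBER lv_jobcount
  AND RETURN.

CALL FUNCTION 'JOB_CLOSE'
  EXPORTING
    jobname   = lv_jobname
    jobcount  = lv_jobcount
    strtimmed = 'X'   " release the job for immediate start
  EXCEPTIONS
    OTHERS    = 1.
```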
I personally think the RFC option has the major advantage that you already have a built-in capability for managing resource consumption (using RFC server groups). I'm not aware that you get this for free with background jobs (though if you have long-running tasks you cannot use aRFC, because the tasks are executed in dialog work processes and thus face the same maximum runtime limit as normal dialog users).
Anyhow, in theory your task is rather simple, though of course the actual technical implementation can be quite challenging: essentially you're trying to implement a [divide and conquer algorithm|http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm]; a prominent example is Google's [MapReduce framework|http://en.wikipedia.org/wiki/MapReduce] (it has nothing to do with ABAP, but the concept is of course valid and can nowadays be found in many applications).
So you just need to find a way to split up your problem into smaller sub-problems, which you then solve individually. Should you need a final step that combines all individual results (the reduce step in Google's framework), then with background jobs you would need an approach for figuring out whether all job steps are done (essentially a [semaphore|http://en.wikipedia.org/wiki/Semaphore_%28programming%29] or [fork-join queue|http://en.wikipedia.org/wiki/Fork-join_queue], however you want to look at it).
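One simple (if crude) way to implement that join with background jobs is to poll the job status table until no spawned job is pending anymore, then run the combining step. A sketch, assuming all jobs were scheduled under the made-up name ZPARALLEL and combine_results is your own reduce routine:

```abap
* Crude join for the background-job approach: poll TBTCO until no job
* named ZPARALLEL (hypothetical) is still pending, then combine results.
DATA lv_unfinished TYPE i.

DO.
  SELECT COUNT(*) FROM tbtco INTO lv_unfinished
    WHERE jobname = 'ZPARALLEL'
      AND status IN ('P', 'S', 'Y', 'R').  " scheduled/released/ready/active
  IF lv_unfinished = 0.
    EXIT.
  ENDIF.
  WAIT UP TO 10 SECONDS.
ENDDO.

* All jobs have finished (or aborted) - run the combining reduce step
PERFORM combine_results.
```

A more robust variant would also check for status 'A' (aborted) per job so a failed chunk can be resubmitted instead of silently ignored.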