on 2023 Mar 16 11:51 AM
We are using RFC_READ_TABLE to extract SAP data in batches, but the data extracted in smaller batches differs from a full-table extract.
As an example of the problem: when we read table UST04, we specify only the columns to be retrieved and the batch size. When we extract the data in batches of 100 000 rows, the data returned differs from extracting the whole table at once.
The batched extract contains duplicate records that the full extract does not, and some records present in the full extract are missing from the batched one. The total record counts of the two extracts match, but the batched extract has both duplicates and missing records.
Could this be caused by table data buffering when a batch size is specified, versus when the full table is extracted? Or could the next batch be starting at the wrong position, extracting duplicate records, with the extraction then stopping once the full table count is reached but before all records have been retrieved?
Has anyone experienced something similar to this before?
See the comment on this blog: https://blogs.sap.com/2017/05/21/perl-module-sapnwrfc-to-retrieve-data-from-a-huge-sap-table-with-rf...
Downloading a huge table often takes too much time because the ROWSKIPS parameter does not prevent the function module from reading the skipped rows; it just does not add them to the returned DATA table. So each consecutive call reads more and more data (i.e. if you chunk by 500 records, the first invocation reads 500 records, the next reads 1,000 but returns only 500, then 1,500 are read and again only 500 returned, etc.).
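A minimal Python sketch of the cost described above (the server-side behaviour is modelled from this explanation, not measured): paging with ROWSKIPS/ROWCOUNT makes the server re-scan all skipped rows on every call, so total work grows quadratically with the number of chunks.

```python
def rows_read_by_server(total_rows, chunk):
    """Total rows the server scans when paging with ROWSKIPS/ROWCOUNT,
    assuming skipped rows are still read on every call (as described above)."""
    scanned = 0
    skip = 0
    while skip < total_rows:
        take = min(chunk, total_rows - skip)
        scanned += skip + take  # each call re-reads all previously skipped rows
        skip += take
    return scanned

# Paging 2,000 rows in chunks of 500 scans 500 + 1,000 + 1,500 + 2,000 rows...
print(rows_read_by_server(2000, 500))   # 5000
# ...while a single full extract scans each row once.
print(rows_read_by_server(2000, 2000))  # 2000
```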
Also, note that the extraction does not impose any ordering, so the actual data returned is not well defined - at least in theory. In practice the database most likely will pick one execution plan and stick to it, so we should get all table rows.
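Unstable ordering alone reproduces exactly the reported symptom: same total count, but duplicates and gaps. A small hypothetical Python simulation, where each "call" sees the table in a different row order (as a database might if it changes execution plans between calls):

```python
def extract_in_batches(orderings, batch_size):
    """Simulate paged extraction where each call may see the table
    in a different row order (no ORDER BY on the SELECT)."""
    extracted = []
    skip = 0
    for order in orderings:  # one (possibly different) ordering per call
        extracted += order[skip:skip + batch_size]
        skip += batch_size
    return extracted

table = list(range(10))                     # ten distinct "records"
ascending, descending = table, table[::-1]  # two plausible scan orders

rows = extract_in_batches([ascending, descending], batch_size=5)
print(len(rows))          # 10 -> total count matches the full extract
print(sorted(set(rows)))  # [0, 1, 2, 3, 4] -> 0..4 duplicated, 5..9 missing
```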
Now this is the relevant code from the FM.
SELECT * FROM (QUERY_TABLE) INTO <WA> WHERE (OPTIONS).
  IF SY-DBCNT GT ROWSKIPS.
* copy all relevant fields into DATA (output) table
    ...
* end of loop at FIELDS_INT
    APPEND DATA.
    IF ROWCOUNT > 0 AND SY-DBCNT GE ROWCOUNT.
      EXIT.
    ENDIF.
  ENDIF.
ENDSELECT.
So what harald.boeing2 wrote is correct: it's slow, and since there is no ORDER BY on the SELECT, the database guarantees no ordering, so you may well get different results between runs.
You can't use ROWCOUNT/ROWSKIPS to partition the data. If you know the key values, you can partition on those instead. E.g. for UST04, first read all BNAMEs beginning with A, then B, and so on.
To make a long story short:
Don't use RFC_READ_TABLE.
It's not a solution, it's a security vulnerability... In fact, SAP recommends disabling this FM in a productive system, or at least limiting the tables it is allowed to read (see SAP note 2246160). It's a relic from very old times, when there was no internet and hackers did not yet target SAP systems, which ran well-protected and isolated somewhere in the basement... Unfortunately SAP has not been able to delete this FM, as "everybody" uses it in mission-critical productive scenarios, because it is just "so convenient"...
But the shortcomings are obvious:
If you have dozens or even hundreds of tables to read, then RFC_READ_TABLE is probably still the "least effort". But if you only need 2-3 tables, then go the extra mile and check for each table: