on 2022 Aug 30 1:56 PM
Hello All,
I am investigating the Generation 2 operators to extract a CDS view from on-premise S/4HANA into a data lake in the cloud. I am using the 'Partitioned file' mode of the 'Binary File Producer' operator to generate CSV files in 'Append' mode so that I can use the snapshot feature. The graph works, but it generates part files that are very small, about 6 MB each, with the naming convention 'part-&lt;uuid&gt;-index.csv'.
I would like to generate larger files of about 100 MB. My idea is to keep writing data into the same file until it reaches 100 MB, which requires control over the name of the generated file. If I could somehow influence the 'index' value in the 'com.sap.headers.batch' header, this would work. I tried a Python operator, but the 'index' attribute is read-only and cannot be modified from the script.
Does anyone have an idea how we can do this?
With the Generation 1 operators, for example, I was able to accomplish this because I had full control over the file name generated in the target.
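To make the goal concrete, here is the size-based rollover behavior I am after, sketched in plain Python. This is not the SAP DI operator API; all names here are illustrative, and the 100-byte threshold stands in for the 100 MB target so the example stays small.

```python
import os

class RollingCsvWriter:
    """Append rows to a part file, starting a new file once the current
    one reaches max_bytes -- the rollover I want the producer to do."""

    def __init__(self, out_dir, max_bytes, prefix="part"):
        self.out_dir = out_dir
        self.max_bytes = max_bytes
        self.prefix = prefix
        self.index = 0          # the counter I would like to control
        self._fh = None
        os.makedirs(out_dir, exist_ok=True)

    def _open_next(self):
        if self._fh:
            self._fh.close()
        path = os.path.join(self.out_dir, f"{self.prefix}-{self.index}.csv")
        self._fh = open(path, "ab")  # binary append: tell() is an exact byte count
        self.index += 1

    def write_row(self, line):
        # Roll over to a new part file once the size threshold is reached.
        if self._fh is None or self._fh.tell() >= self.max_bytes:
            self._open_next()
        self._fh.write((line + "\n").encode("utf-8"))

    def close(self):
        if self._fh:
            self._fh.close()

# Demo: roll over every 100 bytes instead of 100 MB.
w = RollingCsvWriter("demo_parts", max_bytes=100)
for i in range(30):
    w.write_row(f"{i},value_{i}")
w.close()
print(sorted(os.listdir("demo_parts")))
```

In the real graph the producer owns the file handle, so the key missing piece is exactly the writable 'index' (or file name) shown above.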
Regards,
Sandesh
Perhaps the Merge File graphs provided for RMS could be useful to combine the files into larger ones: https://help.sap.com/docs/SAP_DATA_INTELLIGENCE/97fce0b6d93e490fadec7e7021e9016e/360f5276357945449d6...
To use it, choose Run As and provide the details of your landing folder and target folder. Pay attention to the Resources settings in the graph, as the defaults are quite small; I was able to create 1 GB files after increasing them.
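If the RMS Merge File graphs are not an option, the same combining step can be sketched in plain Python: concatenate the small part files into output files that roll over at a target size. The directory names and the tiny 30-byte target below are illustrative (the real target would be ~100 MB), and the sketch copies each part verbatim, so repeated CSV header lines are not deduplicated.

```python
import glob
import os

def merge_parts(src_dir, dst_dir, target_bytes, prefix="merged"):
    """Concatenate part-*.csv files from src_dir into larger files in
    dst_dir, starting a new output once target_bytes is reached."""
    os.makedirs(dst_dir, exist_ok=True)
    out, out_index, written = None, 0, 0
    for part in sorted(glob.glob(os.path.join(src_dir, "part-*.csv"))):
        if out is None or written >= target_bytes:
            if out:
                out.close()
            out = open(os.path.join(dst_dir, f"{prefix}-{out_index}.csv"), "wb")
            out_index += 1
            written = 0
        with open(part, "rb") as f:
            data = f.read()
        out.write(data)  # copied verbatim, including any header line
        written += len(data)
    if out:
        out.close()

# Demo: six 11-byte part files merged with a 30-byte target size.
os.makedirs("parts_in", exist_ok=True)
for i in range(6):
    with open(os.path.join("parts_in", f"part-{i}.csv"), "wb") as f:
        f.write(f"id,val\n{i},x\n".encode("utf-8"))
merge_parts("parts_in", "parts_out", target_bytes=30)
print(sorted(os.listdir("parts_out")))
```

Running this periodically over the landing folder would give files of roughly the target size without touching the operator's 'index' header at all.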