a month ago
Dear Experts,
I hope this message finds you well.
I am reaching out regarding a replication flow in SAP Datasphere that generates multiple Parquet files (20 in my case) in the ADLS target because of the large number of records in the source.
Could you kindly advise how to consolidate these generated Parquet files into a single file containing all the data? Additionally, how can we increase the file size to reduce the number of generated files?
Thank you in advance for your assistance.
Best regards,
Arvind
Hi Arvind,
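I am not sure whether this can be configured in Datasphere (DSP) itself. Try merging the files after they are created, or just before you consume them. You can find merge tools on GitHub, such as joinem. Note that cat does not work well for Parquet: each file carries its own footer metadata, so concatenating the raw bytes does not produce a valid file.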
Cheers
Martin