
How to make one file from multiple Parquet files in a Datasphere replication flow

Arvind_Sharma
Explorer

Dear Experts,

I hope this message finds you well.
I am reaching out regarding a replication flow in SAP Datasphere that generates multiple Parquet files (20 in my case) in the ADLS target because of the large number of records in the source.

Could you kindly advise on how to consolidate these generated Parquet files into a single file containing all the data? Additionally, how can we increase the file size to reduce the number of generated files?

Thank you in advance for your assistance.
Best regards,
Arvind

 

Martin_Kuma
Active Participant

Hi Arvind, 

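I am not sure whether this can be configured in Datasphere (DSP) itself. Try merging the files after they are created, or before you consume them. You can find merge tools on GitHub, such as joinem. Plain cat does not work well for Parquet, because each file carries its own footer metadata and simply concatenating the bytes does not give a valid file.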

 

Cheers

Martin