
extract all users from sap cdc

Hi,

I need to extract all users from SAP CDC, but we can only extract 10,000 users. How can I export all users in a data flow? Please also let me know if there are any other ways to do this.

Accepted Solutions (0)

Answers (2)

Vrishabh
Explorer

Hi Maheswari,

You can export the site data using two methods:

  1. DataFlow method.

A: You can build a simple dataflow using the following steps:

  • datasource.read.gigya.account: This step reads the accounts using a simple SQL-like query. You can use a where clause to filter the data further.
  • file.format.dsv: This step formats the data into the chosen format (data can also be exported in .json format). Other format options are also available.
  • datasource.write.amazon.s3: This step configures the storage service provider. Other storage options are also available.
  • Custom scripts (optional): You can use custom scripts within the dataflow to streamline your export.

This dataflow can be scheduled, and each run writes the formatted file to the configured storage option. If you need to export data from another site (apiKey), you can copy the same source code and repeat the process. Building the dataflow is a one-time effort.

  2. API method.

A: You can also export the data using the accounts.search API. By default, at most 10,000 records can be exported. If you have more data than that, follow the steps below.
[Make sure to also pass the other required parameters, including query.]

  • In the first REST API call, set openCursor to true. Set it only on this first call.
  • The response to the first call contains a nextCursorId field. Store its value for the second call.
  • In the second REST API call, replace openCursor with cursorId, and set cursorId to the nextCursorId value from the previous call.
  • Repeat the previous step, each time passing the latest nextCursorId, until you have retrieved all the data.

For more details on the above method please click here.

This method works best when there are more than 10,000 records, and it ensures you don't get repeated data: the results of the first batch will not appear in any other batch, so every batch is unique. Once all the data has been returned, your nextCursorId expires.
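The cursor loop described above could be sketched in Python roughly as follows. This is a minimal illustration, not an official client: iter_accounts and make_http_caller are hypothetical helper names, the base URL and credential parameter names are left to the caller, and the sketch assumes that follow-up calls send only cursorId and that a response without nextCursorId means the data is exhausted. Check both assumptions against the accounts.search documentation for your site.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def iter_accounts(call_api: Callable[[Dict[str, str]], dict], query: str) -> Iterator[dict]:
    """Yield every account matching `query`, following cursors batch by batch.

    `call_api` is a hypothetical hook: it takes the request parameters and
    returns the parsed JSON response of one accounts.search call.
    """
    # First call: pass the query and open a cursor.
    params = {"query": query, "openCursor": "true"}
    while True:
        response = call_api(params)
        if response.get("errorCode", 0) != 0:
            raise RuntimeError(f"accounts.search failed: {response}")
        yield from response.get("results", [])
        next_cursor = response.get("nextCursorId")
        if not next_cursor:
            break  # assumption: no cursor in the response means no more data
        # Later calls: openCursor is dropped and cursorId carries the position
        # (assumption: the query does not need to be resent with a cursor).
        params = {"cursorId": next_cursor}


def make_http_caller(base_url: str, credentials: Dict[str, str]) -> Callable[[Dict[str, str]], dict]:
    """Build a `call_api` function that POSTs form-encoded parameters.

    `base_url` and the keys in `credentials` are supplied by the caller;
    this sketch does not assume specific values for them.
    """
    def call(params: Dict[str, str]) -> dict:
        body = urllib.parse.urlencode({**credentials, **params}).encode()
        with urllib.request.urlopen(base_url, data=body) as resp:
            return json.load(resp)
    return call
```

Keeping the HTTP call behind call_api means the paging logic can be exercised with a stub before pointing it at a real site.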

KunalBansal
SAP Champion

Very detailed answer, Vrishabh.

SebastianSchuck19
Active Participant

Hey,

You have to build a data flow that aggregates the accounts.search page results into a data store (such as Azure Blob Storage). Check out the available "datasource.write" components in combination with the "file.format" components at https://help.sap.com/docs/SAP_CUSTOMER_DATA_CLOUD/8b8d6fffe113457094a17701f63e3d6a/414ce1d070b21014b... . This way you can export all user records using a single data flow.

Alternatively, create a script that uses the given REST APIs to aggregate the data in memory and write all records to an output file.
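As a rough sketch of the output side of such a script, the Python snippet below writes account records to a JSON Lines file. Note that it is a variation on the suggestion above: it writes each record as it arrives rather than collecting everything in memory first, which may matter for large exports. write_jsonl is a hypothetical helper name, and the records are assumed to be plain dictionaries as returned in the search results.

```python
import json
from typing import Iterable


def write_jsonl(records: Iterable[dict], path: str) -> int:
    """Write one JSON object per line to `path`; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for record in records:
            # ensure_ascii=False keeps non-ASCII profile data readable in the file.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Any iterable of records works here, including a list collected in memory or a generator that pages through the API.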

Best,
Sebastian Schuck