
How can we pull all records daily using Dataflows?

gourav63875
Explorer

I want to generate a CSV report for all users on a daily basis. But according to the documentation, we cannot set an interval of less than 7 days when pulling all records. If I run this manually from the CDC console, it works fine.

Let me know if anyone has an idea how to make this possible.

Thanks in advance.

Accepted Solutions (0)

Answers (2)


Oleh_Ilchyshyn1
Active Participant

Hi Gourav,

I've investigated your case, and it looks like we can achieve it, but we need to rebuild the dataflow in a different way.
Below are the steps to rebuild it.
If you have any technical questions or want to clarify something, please don't hesitate to ask in the comments.

- We need to skip the OOTB delta logic "Pull all records". So, we should get rid of the step:
datasource.read.gigya.account.

Our solution here is to call the accounts.search API directly and fetch the users.

But before calling the accounts.search API (using datasource.write.gigya.generic), you need to trigger the start of the dataflow at least once. For that, you can use the following trick:

Once the dataflow execution has been started, the next step should be the call to the accounts.search API.

So, here you need to use the datasource.write.gigya.generic step.

Note: Do not forget that the default limit in the query is 300 records, so you need to raise it, for example: SELECT UID,profile.email FROM accounts LIMIT 10000 (I don't remember the current maximum limit, but if I am not mistaken, it was 10000).
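To illustrate the query override from the note, here is a minimal Python sketch that builds the query parameter for a direct accounts.search call with an explicit LIMIT. The helper name `build_search_params` is hypothetical, and the 10000 ceiling is only the value recalled above, not a confirmed maximum, so check it against the current API reference.

```python
# Assumed ceiling for LIMIT, taken from the note above; verify it in the API docs.
ASSUMED_MAX_LIMIT = 10000


def build_search_params(fields, limit=ASSUMED_MAX_LIMIT):
    """Return the parameters for an accounts.search call with an explicit LIMIT.

    Hypothetical helper for illustration; it only builds the query string
    and does not call the API.
    """
    if not fields:
        raise ValueError("at least one field is required")
    if not 1 <= limit <= ASSUMED_MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {ASSUMED_MAX_LIMIT}")
    query = f"SELECT {','.join(fields)} FROM accounts LIMIT {limit}"
    return {"query": query}


params = build_search_params(["UID", "profile.email"])
print(params["query"])
```

In the dataflow itself, the same query string would be passed as a parameter of the datasource.write.gigya.generic step rather than built in code.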

- The next step is to use field.array.extract.

Since the response comes back in the format [{},{},{},...,{}], we need to transform it so that CDC processes each element one by one, as a separate record.

- Finally, you should add steps to format the data as .csv and write the report file somewhere.
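The last two steps (splitting the response array into separate records, then formatting them as CSV) can be sketched outside of CDC in plain Python. This is only an illustration of the transformation, not how the dataflow steps are implemented; the `results` field name, the sample data, and all function names are assumptions made for the example.

```python
import csv
import io


def extract_records(response, array_field="results"):
    """Yield each element of the response array as its own record."""
    for item in response.get(array_field, []):
        yield item


def flatten(record, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. profile.email."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def to_csv(records, columns):
    """Format the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()


# Made-up sample shaped like a search response with an array of accounts.
sample_response = {
    "results": [
        {"UID": "u1", "profile": {"email": "a@example.com"}},
        {"UID": "u2", "profile": {"email": "b@example.com"}},
    ]
}

report = to_csv(extract_records(sample_response), ["UID", "profile.email"])
print(report)
```

Records that lack one of the requested columns get an empty cell in that column, so a sparse profile does not break the report.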

The dataflow that I described above looks like this:

I've added Logger steps just to understand what is happening at each step. I suggest you do the same.

I've scheduled this dataflow to run on a regular basis, and as you can see, I am getting all my users from the dev environment every 10 minutes.

I hope this helps and is the workaround you are looking for.

Oleh_Ilchyshyn1
Active Participant

Hi gourav63875,

Have you tried the solution? Does it work for you?

kajolmaan
Explorer

Hi Gourav,

To run a dataflow on a daily basis, we can set it up directly from the scheduler. Please see the screenshot below.

Kind regards,

Kajol

KunalBansal
SAP Champion

Interesting, thank you Kajol.

Oleh_Ilchyshyn1
Active Participant

Hi Kajol and Kunal,

Gourav is not asking "How to run a dataflow on a daily basis"; he is asking "How to run a dataflow on a daily basis with the Pull all records configuration". With this configuration, the schedule is restricted to a minimum of 7 days, but he wants to set it up daily. He provided a screenshot of the error showing the restriction.

If you navigate to the documentation, you can see the following restriction:

  • Full extract frequency: The smallest frequency to which a "full extract" dataflow can be set, i.e., the minimal time that must pass between each schedule for that flow. These are flows for which the "Pull all records" checkbox is flagged. Default: 7 days

gourav63875
Explorer

Yes, olehi94, you are right.
That's exactly what I want to know.

Let me know if anybody has any idea or logic to achieve this.

Thanks & Regards

Gourav Sharma