Read flat file within data flow using query transform

Former Member · 2015 Apr 28

Hello Data Services Community! I have a Data Services question that I hope someone here can shed light upon.

Here's the situation: my client receives many small flat files (CSV) from remote devices every 15 minutes into a specific project folder. There are many projects and, thus, many folders. There is a specific naming convention for the files and the folders. The parent folder name contains the project identifier and the file name contains the device counter. The combination of the project identifier and the device counter allows my client to identify which type of device is sending the file and, thus, to which HANA table the data contained in the file should be delivered. Each type of device sends data with its own column sequence, so each device type has a different target HANA table. Fortunately, there aren't too many device types (fewer than 10).

Here's what I've done so far: I have a workflow with a script that recursively reads the folder structure and creates a text file listing all of the files that match the appropriate file type - in this case, *.CSV. In my mind, this will allow me to capture a snapshot of the files to process at that instant without having to worry about other files arriving in the meantime. Next, I have a data flow that pulls in this newly created text file. I added a query transform to parse the project identifier and the device counter, and then another query transform to look up the device type (class). The case transform that follows takes the lookup value and "should" route the data to the appropriate HANA table.
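For readers who want to follow the logic outside Data Services, here is a minimal Python sketch of the three steps described above: snapshot the CSV files, parse the two identifiers, and pick a target table. It is not Data Services script. The naming convention (folder name equals the project identifier, file name ends in `_<device counter>`), the lookup contents, and the table names are all assumptions made up for illustration.

```python
import os
from pathlib import Path

# Hypothetical lookup: (project identifier, device counter) -> device class.
DEVICE_CLASS = {("PRJ001", "01"): "METER", ("PRJ001", "02"): "SENSOR"}

# Hypothetical mapping from device class to target HANA table name.
TARGET_TABLE = {"METER": "METER_READINGS", "SENSOR": "SENSOR_READINGS"}


def snapshot_csv_files(root):
    """Return a sorted list of every *.csv path under root at this instant."""
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".csv"):
                found.append(Path(folder) / name)
    return sorted(found)


def parse_names(path):
    """Split a path into (project identifier, device counter).

    Assumed convention: the parent folder name is the project identifier and
    the file name stem ends in "_<device counter>".
    """
    project_id = path.parent.name
    device_counter = path.stem.rsplit("_", 1)[-1]
    return project_id, device_counter


def route(path):
    """Return the target table for a file, or None if the device is unknown."""
    device_class = DEVICE_CLASS.get(parse_names(path))
    return TARGET_TABLE.get(device_class)
```

Because the list is built once, files that arrive after `snapshot_csv_files` returns are simply left for the next run.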

Conceptually, I think I should have another flat file source element after the case transform. The file name for that source element comes from the original text file described above. That is what I've tried to create here with the embedded data flow.

Above you can see the structure of the embedded data flow that sits inside the main data flow. This approach does not seem to work: the flat file source in the embedded flow never receives the file name from the variable. The variable ($datafile_name) is a global variable that I attempted to populate in the query transforms that immediately follow the case transform.

I tried another approach without success - again, the variable that should supply the file name in the flat file definition is not recognized, so the data isn't imported.

I would appreciate any input from the community. If I should rethink the entire flow, I am open to suggestions. If you can see a fault in my logic, please offer solutions. Thanks in advance!

former_member187605 · 2015 Apr 28

Why don't you add the filename to your data stream at the beginning of your flow? Set the Include file name column property to Yes.
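As a rough analogy only, this Python sketch shows the idea behind carrying the file name in the data stream: every row is tagged with the file it came from as it is read, so later steps can parse and route on that column. The column name `SOURCE_FILE` and the assumption that each file has a header row are illustrative, not Data Services defaults.

```python
import csv
from pathlib import Path


def read_with_file_name(paths, column="SOURCE_FILE"):
    """Yield each CSV row as a dict with the source file path added."""
    for path in paths:
        with open(path, newline="") as handle:
            # DictReader uses the first line of each file as column names.
            for row in csv.DictReader(handle):
                row[column] = str(path)
                yield row
```

With the file name on every row, no variable has to be set in the middle of the flow.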

Your variant with the embedded data flow will never work. An embedded data flow must come either at the beginning or at the end of a data flow, and it is always linked to it through a (virtual) nested schema source or target file.
