I have data in a Hive table and I want to export it into several flat files, split by rows, with an equal or nearly equal number of rows in each target file.
Is there any way to partition data into several flat files dynamically like:
- output_01
- output_02
- output_03
and so on, with an equal or nearly equal number of rows in each target file, using Informatica Developer (Big Data Management)?
I can describe how to do this in PowerCenter or in Informatica Cloud, if you’re using either of them.
The short answer is:
- Create a File Name port in the Target Definition. This makes dynamic file names possible.
- Use a Transaction Control transformation with the TC_COMMIT_BEFORE flag each time the desired number of rows has been processed. This closes the current file and redirects writes to a new file, whose name is taken from the File Name port value.
The result is multiple output files.
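To make the row-count logic concrete, here is a minimal Python sketch of the decision the two steps above make for each row. It is not Informatica code and does not call any Informatica API. The function name `assign_files`, the `rows_per_file` parameter, and the `output_` prefix are assumptions chosen for illustration; only the flag names mirror the Transaction Control constants.

```python
from typing import Any, Iterable, Iterator, Tuple

# Plain strings standing in for the Transaction Control flags
# (illustration only; nothing here talks to Informatica).
TC_COMMIT_BEFORE = "TC_COMMIT_BEFORE"
TC_CONTINUE_TRANSACTION = "TC_CONTINUE_TRANSACTION"


def assign_files(
    rows: Iterable[Any], rows_per_file: int, prefix: str = "output_"
) -> Iterator[Tuple[str, str, Any]]:
    """Yield (file_name, flag, row) for each input row.

    file_name plays the role of the File Name port value, and flag is
    what the Transaction Control condition would return for that row.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    for index, row in enumerate(rows):
        # Rows 0..rows_per_file-1 go to file 1, the next batch to file 2, ...
        file_number = index // rows_per_file + 1
        # Commit before the first row of every file except the very first,
        # which closes the previous file and starts a new one.
        starts_new_file = index > 0 and index % rows_per_file == 0
        flag = TC_COMMIT_BEFORE if starts_new_file else TC_CONTINUE_TRANSACTION
        yield f"{prefix}{file_number:02d}", flag, row
```

For example, `list(assign_files("abcdefg", 3))` assigns three rows each to `output_01` and `output_02` and the remaining row to `output_03`. With a fixed `rows_per_file` the last file may be smaller than the others. If the number of files is fixed instead, one option is to count the rows first and set `rows_per_file` to the total divided by the number of files, rounded up.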