I’d like to further extend Reading data from multiple csv files and writing it into one csv file using Spring Batch. In my case I have to read multiple files (all with the same filename) from different sub-folders, e.g. ../main-folder/year/Jan, another path ../main-folder/year/Feb, and so on for all years, and create one file out of them. While creating that single file it must not exceed 3 GB or 3 million records (whichever comes first); if the size or record count would go beyond that, one more file should be created, and so on.
Code
<bean id="multiResourceItemReader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step">
<property name="resources" value="file:.../main-path/year/*.*/*.csv" />
<property name="delegate" ref="flatFileItemReader" />
</bean>
<bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter">
<util:constant static-field="org.springframework.batch.item.file.transform.DelimitedLineTokenizer.DELIMITER_TAB" />
</property>
<property name="names" value="field1, field2, field3 ......... field50" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="com.xx.xx.xx.SomeFieldSetMapper" />
</property>
</bean>
</property>
</bean>
The main challenge is how to add the validation on the writer side so that a new output file is created once the current one reaches 3 GB or 3 million records, whichever comes first.
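One possible direction, sketched below and not tested against your configuration: Spring Batch ships a MultiResourceItemWriter that rolls over to a new output file after a fixed number of items via its itemCountLimitPerResource property, which would cover the 3-million-record part of the limit. The output path, the bean names, and the DelimitedLineAggregator / BeanWrapperFieldExtractor setup here are assumptions for illustration only.

<bean id="multiResourceItemWriter" class="org.springframework.batch.item.file.MultiResourceItemWriter" scope="step">
    <!-- hypothetical output location; the writer appends an index suffix (.1, .2, ...) for each new file -->
    <property name="resource" value="file:/some/output-path/combined.csv" />
    <!-- roll over to a new file after 3 million items -->
    <property name="itemCountLimitPerResource" value="3000000" />
    <property name="delegate" ref="flatFileItemWriter" />
</bean>

<bean id="flatFileItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <!-- tab-delimited output, mirroring the reader -->
            <property name="delimiter" value="&#9;" />
            <property name="fieldExtractor">
                <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <!-- same field names as the reader; shortened here -->
                    <property name="names" value="field1, field2, field3" />
                </bean>
            </property>
        </bean>
    </property>
</bean>

MultiResourceItemWriter only counts items, not bytes, so the 3 GB cap is not covered by this property. Two hedged options: size itemCountLimitPerResource conservatively so each file stays under 3 GB, or write a custom ResourceAwareItemWriterItemStream delegate (or wrapper) that tracks the bytes written and triggers the rollover itself.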