I use Spring Batch to copy data between the same collection in different MongoDB databases. As shown in the batch config below, the job has two steps: first delete the existing records, then insert records based on a query. I send two job parameters ("collectionName" and "query") from a service and read them in the reader and writers.
BatchConfig:
@Bean
public Job job(JobRepository jobRepository,
               @Qualifier("transactionManager") PlatformTransactionManager transactionManager,
               @Qualifier("sourceMongoTemplate") MongoTemplate sourceMongoTemplate,
               @Qualifier("targetMongoTemplate") MongoTemplate targetMongoTemplate,
               @Qualifier("deleteTasklet") DeleteTasklet deleteTasklet) {
    return new JobBuilder("job", jobRepository)
            .incrementer(new RunIdIncrementer())
            .start(deleteStep(jobRepository, transactionManager, deleteTasklet))
            .next(insertStep(jobRepository, transactionManager, sourceMongoTemplate, targetMongoTemplate))
            .listener(new MigrationJobCompletionListener("job"))
            .build();
}
@Bean
public Step deleteStep(JobRepository jobRepository,
                       PlatformTransactionManager transactionManager,
                       DeleteTasklet deleteTasklet) {
    return new StepBuilder("deleteStep", jobRepository)
            .tasklet(deleteTasklet, transactionManager)
            .listener(new MigrationStepListener("deleteStep"))
            .build();
}
@Bean
public Step insertStep(JobRepository jobRepository,
                       PlatformTransactionManager transactionManager,
                       MongoTemplate sourceMongoTemplate,
                       MongoTemplate targetMongoTemplate) {
    return new StepBuilder("insertStep", jobRepository)
            .<Document, Document>chunk(DEFAULT_CHUNK_SIZE, transactionManager)
            .startLimit(DEFAULT_LIMIT_SIZE)
            .reader(new MigrationItemReader(sourceMongoTemplate))
            .processor(new MigrationItemProcessor())
            .writer(new DataInsertionWriter(targetMongoTemplate))
            .listener(new MigrationStepListener("insertStep"))
            .build();
}
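For context on how the reader receives the job parameters mentioned above, one common pattern is a step-scoped reader bean that injects them via SpEL. This is only a sketch: the three-argument `MigrationItemReader` constructor and the bean method name are assumptions, not part of the original code.

```java
// Sketch (assumption): a step-scoped reader that receives "collectionName"
// and "query" from the JobParameters at step start.
@Bean
@StepScope
public MigrationItemReader migrationItemReader(
        @Qualifier("sourceMongoTemplate") MongoTemplate sourceMongoTemplate,
        @Value("#{jobParameters['collectionName']}") String collectionName,
        @Value("#{jobParameters['query']}") String query) {
    return new MigrationItemReader(sourceMongoTemplate, collectionName, query);
}
```

With step scope, the bean is created lazily for each step execution, so each run can target a different collection and query.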
Now I am trying to process multiple collections by sending "collectionName" – "query" pairs from the service and then iterating over each pair in the job method. How can I build such a solution, and is there a better approach for this scenario? Since I only have a few collections (4 or 5 at most), I think there is currently no need for parallel execution; the collections can be processed one by one.
Here is a sample pair creation in the service for this purpose:
Service:
public void runJob() {
    Map<String, String> pairs = new HashMap<>();
    pairs.put("employee", "{}");
    pairs.put("department", "{}");
    try {
        // convert to JSON string (writeValueAsString throws a checked
        // JsonProcessingException, so it must be handled or declared)
        ObjectMapper objectMapper = new ObjectMapper();
        String jsonPairs = objectMapper.writeValueAsString(pairs);
        // set JobParameters with the encoded JSON string
        JobParameters jobParameters = new JobParametersBuilder()
                .addString("collectionsAndQueries", jsonPairs)
                .toJobParameters();
        // launch the job with the built JobParameters
        batchJobLauncher.launchJob(jobParameters);
    } catch (JsonProcessingException e) {
        throw new IllegalStateException("Failed to serialize collection/query pairs", e);
    }
}
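One way to iterate the pairs, assuming the JSON format produced above, is to decode them back into a map and launch the existing two-step job once per entry, passing each entry as the "collectionName" and "query" parameters the reader and writers already expect. This is only a sketch; the method name is illustrative and `batchJobLauncher` is assumed to be the same launcher used in `runJob()`.

```java
// Sketch (assumption): decode the JSON pairs and run the delete+insert job
// once per collection, sequentially.
public void runJobForAllPairs(String jsonPairs) throws Exception {
    ObjectMapper objectMapper = new ObjectMapper();
    // TypeReference recovers the Map<String, String> from the JSON string
    Map<String, String> pairs = objectMapper.readValue(
            jsonPairs, new TypeReference<Map<String, String>>() {});
    for (Map.Entry<String, String> pair : pairs.entrySet()) {
        JobParameters jobParameters = new JobParametersBuilder()
                .addString("collectionName", pair.getKey())
                .addString("query", pair.getValue())
                .addLong("runId", System.currentTimeMillis()) // keep each instance unique
                .toJobParameters();
        batchJobLauncher.launchJob(jobParameters);
    }
}
```

Running one job execution per collection keeps the existing step definitions unchanged and gives each collection its own restartable job instance, at the cost of several launches instead of one.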