I am importing a large amount of data from CSV files, transforming it into lists of objects, and writing those to the database. This works fine, but it appears to have a memory leak.
The problem is that the memory does not get released after the import is done. It keeps accumulating every time the import runs, until the application runs out of memory. When I trigger the garbage collector manually (with jcmd GC.run), the memory is released.
I don't want to invoke the garbage collector from my code if it isn't necessary. I'd rather understand where the flaw in my code is and why the program can't free the memory itself.
My code:
    fun importData() {
        logger.info("Import started")
        transactionTemplate.execute {
            logger.info("Loading Csv1...")
            val c1 = importCsv1()
            logger.info("Loading Csv2...")
            val c2 = importCsv2()
            logger.info("Loading Csv3...")
            val c3 = importCsv3()
            logger.info("Deleting Tables...")
            c1Repository.deleteAllInBatch()
            c2Repository.deleteAllInBatch()
            c3Repository.deleteAllInBatch()
            logger.info("Persisting C1...")
            bulkObjectRepository.persist(c1)
            logger.info("Persisting C2...")
            bulkObjectRepository.persist(c2)
            logger.info("Persisting C3...")
            bulkObjectRepository.persist(c3)
            logger.info("Import finished")
        }
        // do some stuff here that must be outside the transaction
    }
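One thing I noticed while writing this up: all three lists stay in scope for the whole method, so they remain strongly reachable until importData() returns. A minimal sketch (names are illustrative, and it assumes the three CSVs are independent of one another) of scoping each load/persist pair so the list can become unreachable before the next CSV is loaded:

```kotlin
// Sketch: load and persist one CSV inside its own scope, so the list is only
// reachable while this function runs and is eligible for collection on return.
fun <T> importAndPersist(load: () -> List<T>, persist: (List<T>) -> Unit) {
    val rows = load()   // the list never escapes this function
    persist(rows)
}                       // after return, `rows` can be garbage collected

fun main() {
    val persisted = mutableListOf<Int>()
    importAndPersist(load = { listOf(1, 2, 3) }, persist = { persisted += it })
    println(persisted)
}
```

I'm not sure whether this alone explains the growth across runs, since the references should be dropped once the method returns, but it would at least reduce the peak footprint during a single import.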
And the BulkObjectRepository:
    @Service
    class BulkObjectRepository(
        @Value("5000") private val batchSize: Int,
        private val entityManager: EntityManager
    ) {
        fun persist(entities: Iterable<Any>) {
            for ((i, obj) in entities.withIndex()) {
                if (i > 0 && i % batchSize == 0) {
                    entityManager.flush()
                    entityManager.clear()
                }
                entityManager.persist(obj)
            }
            entityManager.flush()
            entityManager.clear()
        }
    }
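My understanding (and part of what I'd like confirmed) is that entityManager.clear() only detaches entities from the persistence context; it cannot make them collectible while the caller's full list (c1/c2/c3) still references every object. A minimal sketch without JPA, where persistInBatches and persistBatch are hypothetical names, of streaming rows in fixed-size batches so that only one batch is strongly reachable at a time:

```kotlin
// Sketch: consume rows as a lazy Sequence in fixed-size batches, so only the
// current batch is strongly referenced; each batch becomes unreachable once
// its iteration ends (in the real code, persistBatch would persist + flush + clear).
fun <T> persistInBatches(rows: Sequence<T>, batchSize: Int, persistBatch: (List<T>) -> Unit) {
    rows.chunked(batchSize).forEach { batch ->
        persistBatch(batch)
        // `batch` is no longer reachable after this iteration
    }
}
```

This would require the CSV readers to return a Sequence (or Stream) instead of a fully materialized List, which I haven't tried yet.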