I am new to Spring WebFlux, and I recently ran into an OutOfMemoryError (OOM) in a Spring WebFlux project.
The project contains a single scheduled job that runs once per day.
The job performs the following steps:
Scan all items from a DynamoDB table (the table holds about 1 million items, each roughly 40 KB)
Transform each item into a new object
Upload the transformed objects to an AWS S3 bucket as a single JSON file
After reviewing the heap dump, the root cause appears to be a large number of Map<String, AttributeValue> objects that are never released. Because the table contains so many items, the job keeps scanning (without releasing the already-fetched items, I suspect) until the heap fills up and the OOM is thrown.
Pseudo code (attached as screenshots): pseudo_code1, pseudo_code2
Heap dump (attached as a screenshot): dump_file
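Since the pseudo code is only available as screenshots, here is a minimal sketch of the scan-transform-upload pattern described above, assuming the AWS SDK for Java 2.x async clients (DynamoDbAsyncClient, S3AsyncClient). The table name, bucket, key, and the transform()/toJson() helpers are placeholders, not the actual code:

```java
import java.util.List;
import java.util.Map;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ScanRequest;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class ExportJob {

    private final DynamoDbAsyncClient dynamo = DynamoDbAsyncClient.create();
    private final S3AsyncClient s3 = S3AsyncClient.create();

    public Mono<Void> run() {
        ScanRequest scan = ScanRequest.builder().tableName("items-table").build();

        // scanPaginator(...).items() emits every Map<String, AttributeValue> in the table
        return Flux.from(dynamo.scanPaginator(scan).items())
                .map(this::transform)   // transform each item
                .collectList()          // <-- all ~1M transformed items held in the heap at once
                .flatMap(all -> Mono.fromFuture(
                        s3.putObject(PutObjectRequest.builder()
                                        .bucket("my-bucket")
                                        .key("export.json")
                                        .build(),
                                // the whole JSON payload is also built in memory
                                AsyncRequestBody.fromString(toJson(all)))))
                .then();
    }

    private Object transform(Map<String, AttributeValue> item) { /* ... */ return item; }

    private String toJson(List<Object> items) { /* ... */ return "[]"; }
}
```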
I have tried other approaches to process the data in chunks, for example using Mono.buffer() or using an AtomicInteger to throttle the thread that keeps scanning items, but none of them worked.
My expectation: can this job process the items in chunks (e.g. 1,000 at a time)? Something like:
fetch 1,000 items -> transform 1,000 items -> multipart-upload the chunk -> end of chunk -> fetch the next 1,000 items ...
and make sure the memory is released after each sub-step. A sketch of what I have in mind follows below.
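This is roughly the chunked flow I am hoping for, again assuming the AWS SDK v2 async clients (not tested): Flux.buffer(1000) forms the chunks, and concatMap with prefetch 1 uploads one chunk at a time as a part of an S3 multipart upload, so backpressure should stop the scan from racing ahead and earlier chunks should become eligible for GC. Each 1,000-item chunk is ~40 MB, which is above the 5 MB minimum part size. The bucket, key, table name, and the partJson() helper are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ScanRequest;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.CompleteMultipartUploadRequest;
import software.amazon.awssdk.services.s3.model.CompletedMultipartUpload;
import software.amazon.awssdk.services.s3.model.CompletedPart;
import software.amazon.awssdk.services.s3.model.CreateMultipartUploadRequest;
import software.amazon.awssdk.services.s3.model.UploadPartRequest;

public class ChunkedExportJob {

    private static final String BUCKET = "my-bucket";    // placeholder
    private static final String KEY = "export.json";     // placeholder

    private final DynamoDbAsyncClient dynamo = DynamoDbAsyncClient.create();
    private final S3AsyncClient s3 = S3AsyncClient.create();

    public Mono<Void> run() {
        // Parts complete one at a time (concatMap), so a plain list is safe here.
        List<CompletedPart> parts = new ArrayList<>();

        return Mono.defer(() -> Mono.fromFuture(s3.createMultipartUpload(
                        CreateMultipartUploadRequest.builder().bucket(BUCKET).key(KEY).build())))
                .flatMap(mpu -> Flux.from(dynamo.scanPaginator(
                                        ScanRequest.builder().tableName("items-table").build())
                                .items())                 // stream of Map<String, AttributeValue>
                        .map(this::transform)             // transform each item
                        .buffer(1_000)                    // chunk: 1,000 items (~40 MB)
                        .index()                          // (chunkIndex, chunk)
                        // prefetch 1: only one chunk in flight, so the scan cannot race
                        // ahead of the uploads and finished chunks can be garbage collected
                        .concatMap(chunk -> {
                            int partNumber = chunk.getT1().intValue() + 1;   // parts start at 1
                            return Mono.fromFuture(s3.uploadPart(
                                            UploadPartRequest.builder()
                                                    .bucket(BUCKET).key(KEY)
                                                    .uploadId(mpu.uploadId())
                                                    .partNumber(partNumber)
                                                    .build(),
                                            AsyncRequestBody.fromString(partJson(chunk.getT2()))))
                                    .doOnNext(resp -> parts.add(CompletedPart.builder()
                                            .partNumber(partNumber)
                                            .eTag(resp.eTag())
                                            .build()));
                        }, 1)
                        // after all parts are uploaded, stitch them together into one object
                        .then(Mono.defer(() -> Mono.fromFuture(s3.completeMultipartUpload(
                                CompleteMultipartUploadRequest.builder()
                                        .bucket(BUCKET).key(KEY)
                                        .uploadId(mpu.uploadId())
                                        .multipartUpload(CompletedMultipartUpload.builder()
                                                .parts(parts).build())
                                        .build())))))
                .then();
    }

    private Object transform(Map<String, AttributeValue> item) { /* ... */ return item; }

    private String partJson(List<Object> chunk) { /* ... */ return ""; }
}
```

Is this the right way to bound the memory here, or is there a more idiomatic Reactor/WebFlux approach?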