The Apache Camel file component loads the whole file into memory when the file is picked up by the route statement from("file://dir/largefile.txt"). The issue is that the file is 300 MB, while the Kubernetes pod has only limited memory, so loading the huge file results in an out-of-memory error.
After loading the file, a splitter is used to process the lines in the file:
.split(body().tokenize("\n")).streaming()
The question is how to stop the file component from loading the whole file into memory in the first place. Any other approach with Apache Camel that achieves this would also be fine.
Did you try using a stream cache? It spools large bodies to temporary files instead of keeping the whole file in memory while processing it.
Something like:
@Override
public void configure() throws Exception {
    // Enable stream caching so large bodies can be spooled to disk
    getContext().setStreamCaching(true);
    // Bodies larger than 1 MB are spooled to temporary files instead of held in memory
    getContext().getStreamCachingStrategy().setSpoolThreshold(1024L * 1024L);

    from("file://dir?fileName=largefile.txt&noop=true")
        .split(body().tokenize("\n")).streaming()
        .process(exchange -> {
            // Each split exchange carries a single line of the file
            String line = exchange.getIn().getBody(String.class);
            System.out.println(line);
        })
        .end();
}
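For context, the memory behaviour the streaming split relies on is the same as ordinary buffered line-by-line reading: only the current line is held in memory at a time, regardless of file size. A minimal plain-Java sketch of that idea (class and method names here are illustrative, not part of Camel):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class StreamLines {

    // Reads up to 'limit' lines from the file; the BufferedReader only
    // buffers a small chunk, so memory use is independent of file size.
    static List<String> firstLines(Path file, int limit) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while (out.size() < limit && (line = reader.readLine()) != null) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical demo file; a real route would point at dir/largefile.txt
        Path tmp = Files.createTempFile("largefile", ".txt");
        Files.write(tmp, List.of("line1", "line2", "line3"));
        System.out.println(firstLines(tmp, 2)); // prints [line1, line2]
        Files.delete(tmp);
    }
}
```

The streaming splitter in the route above does essentially this, handing each line to the processor one at a time instead of materialising the whole body first.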