Is there any way to further optimize the speed of reading files (resources) from MongoDB GridFS?
I ran the reads in parallel (the parallel stream in the snippet below), but it gave me no advantage at all.
GridFSFindIterable files = gridFsTemplate.find(
        Query.query(Criteria.where("_id")
                .in(reports.stream().map(report -> report.getMongoReportId()).toList())));

// parallel stream – this is the "in parallel" attempt, which gave no speedup
List<GridFsResource> resources = StreamSupport.stream(files.spliterator(), true)
        .map(file -> {
            try {
                return new GridFsResource(file, gridFsTemplate.getResource(file).getInputStream());
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        })
        .collect(Collectors.toList());

resources.forEach(resource -> {
    try {
        String reportStr = new String(resource.getInputStream().readAllBytes());
        // ...
    } catch (IOException ex) {
        log.error("Cannot load report: {}", ex.getMessage());
    }
});
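To make "in parallel" concrete, here is a minimal sketch of the same reads done with an explicit, bounded thread pool instead of the parallel stream. The pool size of 8 and the class/variable names are placeholders I made up for illustration, not measured production code.

import com.mongodb.client.gridfs.model.GridFSFile;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.gridfs.GridFsTemplate;

import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGridFsReader {

    private final GridFsTemplate gridFsTemplate;
    // arbitrary example size; would need tuning to CPU cores / Mongo connection pool
    private final ExecutorService executor = Executors.newFixedThreadPool(8);

    public ParallelGridFsReader(GridFsTemplate gridFsTemplate) {
        this.gridFsTemplate = gridFsTemplate;
    }

    public List<String> readReports(List<Object> mongoReportIds)
            throws InterruptedException, ExecutionException {
        // one query for the file documents (metadata only, content is streamed later)
        Query query = Query.query(Criteria.where("_id").in(mongoReportIds));

        List<Future<String>> futures = new ArrayList<>();
        for (GridFSFile file : gridFsTemplate.find(query)) {
            // each file download runs on its own pool thread
            futures.add(executor.submit(() -> {
                try (InputStream in = gridFsTemplate.getResource(file).getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            }));
        }

        List<String> reports = new ArrayList<>(futures.size());
        for (Future<String> future : futures) {
            reports.add(future.get()); // IOExceptions surface wrapped in ExecutionException
        }
        return reports;
    }
}

(The executor would of course need to be shut down properly; I left lifecycle handling out of the sketch.)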
I would like to speed this up without having to buy super expensive disks. I have an NVMe drive, so the raw read speed should be decent.
I have report files in the DB that I need to analyze. A few files don't cause any problems.
But I started testing how the application behaves when 70-100 files are analyzed. Each file is 500 KB to 4-5 MB in size.
Reading 70 files takes 3-5 seconds, which seems quite long to me: even at 5 MB per file that is at most ~350 MB, so the effective throughput is well below what NVMe should deliver. And we plan to read 100-10,000 files quite often.
How can I optimize MongoDB in this case? Or do I need to look towards some other solution? Keeping the DB in memory will not work, as there will be hundreds of gigabytes of data.