I have a Java EE application in which I collect information about movies. In my backend I provide data such as the name, description, genre, and a random UUID.
I also have lots of related files stored on a file server, including some screenshots, the DVD or Blu-ray cover, and video trailers.
My current approach is:
When saving the files to the file server, I retrieve the movie's random UUID (which is the primary key, by the way). I then rename the files screenshot_[UUID]_1, screenshot_[UUID]_2, … etc.
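To make the naming scheme concrete, here is a minimal sketch of how such a name could be built. The class and method names (ScreenshotNames, screenshotName) are made up for illustration, and the sample UUID is arbitrary.

```java
import java.util.UUID;

// Hypothetical helper: builds names following the screenshot_[UUID]_N scheme.
class ScreenshotNames {

    // Returns "screenshot_<uuid>_<index>"; the index starts at 1.
    static String screenshotName(UUID movieId, int index) {
        if (index < 1) {
            throw new IllegalArgumentException("index starts at 1");
        }
        return "screenshot_" + movieId + "_" + index;
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        // prints "screenshot_123e4567-e89b-12d3-a456-426614174000_1"
        System.out.println(screenshotName(id, 1));
    }
}
```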
Now, there are lots of other ways to handle this, like saving all file names in a database, or creating a directory structure on the file server for every UUID and, e.g., returning all images in the “[uuid]/screenshots” folder via REST.
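The directory-per-UUID alternative could look roughly like the sketch below, which lists the file names under [uuid]/screenshots so that a REST resource could return them. The class name ScreenshotLister, the method name, and the root path are assumptions, not part of my actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for the "[uuid]/screenshots" directory layout.
class ScreenshotLister {

    // Returns the sorted file names in <root>/<uuid>/screenshots,
    // or an empty list if that directory does not exist.
    static List<String> listScreenshots(Path root, UUID movieId) {
        Path dir = root.resolve(movieId.toString()).resolve("screenshots");
        if (!Files.isDirectory(dir)) {
            return Collections.emptyList();
        }
        // Files.list keeps a directory handle open, so close the stream.
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this layout no file name has to carry the UUID, and no database lookup is needed to find a movie's screenshots.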
I expect about 30k requests a day, so the service has to be pretty performant.
What's the best way to solve this?
The best way to solve this is to forget about micro-optimizations like “use of a directory structure” vs. “each file name containing the full UUID” (though you should check whether the file system you are going to use limits the number of files in one directory). Implement a simple solution which fits well into your existing infrastructure, create some sample requests, and measure whether the solution is fast enough. If not, measure exactly which part of your solution is too slow, then improve that part. Everything else is most probably premature optimization.
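For scale, 30k requests a day averages out to roughly one request every three seconds (86,400 s / 30,000 ≈ 2.9 s), although real traffic will come in bursts. A crude way to "measure first" is sketched below. SimpleTimer and averageNanos are made-up names, and for serious numbers a proper benchmarking or load-testing tool would be a better choice than hand-rolled timing.

```java
// Hypothetical timing helper: a rough sketch, not a rigorous benchmark.
class SimpleTimer {

    // Runs the task `runs` times as warm-up, then `runs` more times,
    // and returns the average duration of a timed run in nanoseconds.
    static long averageNanos(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        for (int i = 0; i < runs; i++) {
            task.run(); // warm-up, gives the JIT a chance to compile
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        // Replace the lambda with the operation under test,
        // e.g. listing or reading one movie's screenshots.
        long avg = averageNanos(() -> String.valueOf(System.nanoTime()), 1000);
        System.out.println("average ns per call: " + avg);
    }
}
```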
As usual, the answer is “it depends”. There are lots of different ways you can do it, and the right solution will depend on your unique situation.
I have found that Windows tends to barf at any more than 1000 files per directory, so in the past I have created a folder per 1000 files. But you could split it by starting character (e.g. alphabetically)… or create a hierarchical tree, or split it across drives, or use SQL Server’s FILESTREAM… and then you’ve done lots of premature optimisation and wasted everyone’s time!
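If you do end up needing to split by starting character, a sketch could look like this, assuming the files are keyed by a random UUID as in the question. The names ShardedPaths and shardedDir are hypothetical. Taking the first two hex characters gives 256 buckets, and random UUIDs spread fairly evenly across them.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper: spreads per-movie directories over 256 buckets.
class ShardedPaths {

    // Returns <root>/<first two hex chars of uuid>/<uuid>.
    static Path shardedDir(Path root, UUID movieId) {
        String id = movieId.toString();
        return root.resolve(id.substring(0, 2)).resolve(id);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        // The bucket directory for this UUID is "12".
        System.out.println(shardedDir(Paths.get("media"), id));
    }
}
```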
Try a few scenarios and throw lots of random files at it and see how it performs.
Or just throw everything in one folder and buy lots of SSD drives 🙂