I have a database with roughly 70GB of files. I need to pick these files, process their content and save them elsewhere.
That’s the easy part. However, those 70GB are spread across 300k files, which means that if I process them synchronously, one at a time, I will easily have to wait a couple of days.
I am working to optimize this code, so I thought I would leverage parallel execution, since I could process several files simultaneously.
I worry, though, that Parallel.ForEach() will quickly exhaust my RAM. How does it work when my list has hundreds of thousands of elements? It clearly cannot spawn a thread for every element in the list, or it would run out of memory.
If Parallel.ForEach() is not the answer I am looking for, what other options would I have to run custom code for each element in a list concurrently?
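
To make the question concrete, here is a minimal sketch of the kind of loop I have in mind. The names BoundedParallelSketch, ProcessFile and Run are placeholders I made up for illustration, the integer IDs stand in for the real files, and capping the loop with ParallelOptions.MaxDegreeOfParallelism is my assumption about how concurrency could be bounded, not something I have verified at this scale.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelSketch
{
    // Hypothetical stand-in for the real per-file work (load, process, save).
    public static int ProcessFile(int fileId) => fileId * 2;

    // Runs ProcessFile over every id with a cap on concurrent iterations and
    // returns the highest number of iterations observed running at once.
    public static int Run(IEnumerable<int> fileIds, int maxParallel, ConcurrentBag<int> results)
    {
        // Assumption: this option is what keeps the loop from using too many threads.
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int inFlight = 0;
        int peak = 0;
        object gate = new object();

        Parallel.ForEach(fileIds, options, id =>
        {
            // Track how many iterations are running right now.
            int now = Interlocked.Increment(ref inFlight);
            lock (gate)
            {
                if (now > peak) peak = now;
            }

            results.Add(ProcessFile(id));
            Interlocked.Decrement(ref inFlight);
        });

        return peak;
    }

    public static void Main()
    {
        var results = new ConcurrentBag<int>();
        int peak = Run(Enumerable.Range(0, 300_000), 4, results);
        Console.WriteLine($"processed {results.Count} items, peak concurrency {peak}");
    }
}
```

The source list here is a lazy Enumerable.Range rather than a fully loaded collection, which is roughly how I would want to feed the real file IDs so that only the files currently being processed are held in memory.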