I’m trying to come up with a sane architecture to delay a task for 24 hours. So far I’ve thought of:
- A Unix cron job, but this would not be able to take the load we give it, since we will have many thousands of delayed tasks per day
- RabbitMQ with a dead letter exchange. However, this is a hack and, being a hack, is riddled with problems: it is impossible to monitor without mutilating the messages, etc.
- Other solutions I’ve come up with are poll-based, e.g., insert the task into MySQL and have a poller responsible for it
- Amazon SQS only supports delays of up to 15 minutes. I could write a punt loop that re-delays the message the required number of times, but this is already reinventing the wheel way too much. Plus, using SQS this way would get expensive, as we pay per message.
- Redis possibly…? have not heavily investigated this. I know Memcached is not appropriate as it won’t really guarantee persistence that long.
What existing technology is good for this? I don’t want to reinvent the wheel here, given that I’m pretty sure I’m not working on the first software product that requires performing a task tomorrow. Thus far I’ve been driving things from the database.
I believe your first instinct is the best one, in that it is where you should look for the answer. Early versions of cron used a polling approach (every minute, check whether anything should be run). A later version made use of a discrete event simulator. The article that led to this version of cron is titled “An efficient data structure for the simulation event set”. Reading it (it is several pages long) may help you understand the algorithm and how to reimplement it yourself.
At any given time, you know all of the events that will happen within the next 24 hours. Each time the process runs to deal with something, it looks at the head of the queue to see how long it needs to sleep – it then sleeps that long, and launches another thread or process to deal with that job, and repeats (read the head, sleep, fork, read head, …).
Internally, this is how cron works.
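The read-head, sleep, fork loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a plain binary heap from Python's `heapq` rather than the data structure from the article, it keeps everything in memory, and the `DelayScheduler` class and its method names are made up for the example.

```python
import heapq
import itertools
import threading
import time


class DelayScheduler:
    """Hypothetical in-memory scheduler: a min-heap ordered by due time."""

    def __init__(self):
        self._heap = []                    # entries are (due_time, seq, func)
        self._seq = itertools.count()      # tie-breaker so funcs are never compared
        self._cond = threading.Condition()
        self._stopped = False

    def schedule(self, delay_seconds, func):
        due = time.monotonic() + delay_seconds
        with self._cond:
            heapq.heappush(self._heap, (due, next(self._seq), func))
            self._cond.notify()            # the head may have changed, wake the loop

    def stop(self):
        with self._cond:
            self._stopped = True
            self._cond.notify()

    def run(self):
        while True:
            with self._cond:
                if self._stopped:
                    return
                if not self._heap:
                    self._cond.wait()      # nothing queued: sleep until notified
                    continue
                due, _, func = self._heap[0]           # read the head
                remaining = due - time.monotonic()
                if remaining > 0:
                    self._cond.wait(timeout=remaining)  # sleep until it is due
                    continue               # re-read the head after waking
                heapq.heappop(self._heap)
            # Launch the job in its own thread, outside the lock, then repeat.
            threading.Thread(target=func).start()
```

For a 24-hour delay you would call something like `scheduler.schedule(24 * 60 * 60, job)` while `scheduler.run()` loops in a background thread. Note that this sketch loses every pending task if the process restarts, which is why the persistence question below matters.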
The question then becomes how you want to update and persist this data structure. Writing the data to a database or to a file on a file system seem to be the two best approaches, since both are simple to access from a multitude of different languages and tools.
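As a rough sketch of the database option, the queue can live in a table with a due-time column. The example below uses Python's built-in `sqlite3` as a stand-in for MySQL; the `delayed_tasks` table, its columns, and the helper function names are all assumptions made for illustration. It also assumes a single worker and marks tasks as done before they are executed, so a crash mid-run would drop a task rather than repeat it.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the hypothetical task table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS delayed_tasks (
               id INTEGER PRIMARY KEY,
               run_at REAL NOT NULL,
               payload TEXT NOT NULL,
               done INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_due ON delayed_tasks (done, run_at)"
    )
    conn.commit()
    return conn


def add_task(conn, payload, delay_seconds, now=None):
    """Persist a task that becomes due delay_seconds from now."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO delayed_tasks (run_at, payload) VALUES (?, ?)",
            (now + delay_seconds, payload),
        )
    return cur.lastrowid


def seconds_until_next(conn, now=None):
    """How long the worker may sleep; None if nothing is pending."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT MIN(run_at) FROM delayed_tasks WHERE done = 0"
    ).fetchone()
    if row[0] is None:
        return None
    return max(0.0, row[0] - now)


def take_due(conn, now=None):
    """Return payloads that are due, earliest first, and mark them done."""
    now = time.time() if now is None else now
    with conn:
        rows = conn.execute(
            "SELECT id, payload FROM delayed_tasks "
            "WHERE done = 0 AND run_at <= ? ORDER BY run_at, id",
            (now,),
        ).fetchall()
        conn.executemany(
            "UPDATE delayed_tasks SET done = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [row[1] for row in rows]
```

A worker would sleep for `seconds_until_next(conn)`, call `take_due(conn)`, hand each payload to a thread or process, and repeat. With several workers you would need some form of row claiming or locking, which this sketch does not attempt.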
I would not try to persist the data in a message queue as it really isn’t meant for that.
I don’t have much experience with job scheduling and batch processing, but I do know Quartz is an enterprise-level tool that lets you schedule thousands of jobs if needed, and it lets you assign system resources (cores/threads), etc. You might want to have a look at it if you are willing to create (or already have) a Java solution.