I am considering using a NoSQL Document database as a messaging queue.
Here is why:
- I want a client to post a message (some serialized object) to the server and not have to wait for a synchronous response.
- I want to be able to pull messages off of the “queue” based on some criteria, which may be more sophisticated than just a priority level (I am working on a hosted web app, so I want to give all of my customers a fair amount of “computing time”, and not let one customer hog all of the processing).
- I want the queue to be durable – if the server goes down I want any remaining messages to be handled when it comes back up.
So, I am considering using MongoDB or RavenDB as a message queue. The client can post the message object to a web service which writes it to the database. Then – the service doing the work can pull the various message types based on any criteria that may arise. I can create indexes around the scenarios to make it faster.
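To make the "pull by criteria" idea concrete, here is a minimal sketch of fair pulling across customers. It is an illustration only: a plain in-memory list stands in for the database collection, and the names `FairQueue`, `post`, `pull`, `complete`, and the `customer_id`/`status` fields are all hypothetical, not part of MongoDB or RavenDB. The fairness rule (serve the customer who has used the least processing time so far) is an assumption chosen for the example.

```python
from itertools import count
from typing import Optional


class FairQueue:
    """In-memory stand-in for a 'messages' collection (hypothetical schema)."""

    def __init__(self) -> None:
        self._docs = []            # each entry is a dict, like a stored document
        self._ids = count()        # monotonically increasing ids, oldest first
        self._used_seconds = {}    # processing time consumed per customer

    def post(self, customer_id: str, payload: dict) -> int:
        """Client side: store the message and return immediately."""
        doc = {
            "_id": next(self._ids),
            "customer_id": customer_id,
            "payload": payload,
            "status": "pending",
        }
        self._docs.append(doc)
        return doc["_id"]

    def pull(self) -> Optional[dict]:
        """Worker side: pick the pending message whose customer has used
        the least processing time so far; ties go to the oldest message."""
        pending = [d for d in self._docs if d["status"] == "pending"]
        if not pending:
            return None
        doc = min(
            pending,
            key=lambda d: (self._used_seconds.get(d["customer_id"], 0.0), d["_id"]),
        )
        doc["status"] = "processing"
        return doc

    def complete(self, doc_id: int, seconds_used: float) -> None:
        """Mark a message done and charge the time to its customer."""
        for d in self._docs:
            if d["_id"] == doc_id:
                d["status"] = "done"
                cid = d["customer_id"]
                self._used_seconds[cid] = self._used_seconds.get(cid, 0.0) + seconds_used
                return
        raise KeyError(doc_id)
```

In a real document store the `pull` step would be a query backed by an index on something like `status` plus `customer_id`, and the selection and status change would have to happen atomically so two workers cannot take the same message.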
So – I am looking for someone to shoot a hole in this. Has anybody successfully done this? Has anybody tried this and failed in some way?
Check out the accepted answer on this question: https://stackoverflow.com/questions/4745911/nosql-databases
My take, having worked with both types of databases, is that the real advantage of NoSQL databases lies in their scalability. They are well suited to ever-growing blobs of data that need to exist on many nodes. After all, these are the applications they were born out of (Facebook, Google…).
They also have downsides, and those are specific to the implementation. Personally, I’ve run into replication errors when multiple nodes deleted and re-populated objects within a short amount of time. I’m not suggesting that this is pervasive, but the speed advantage often comes with fewer guarantees of consistency (i.e., you will have eventual consistency, but you don’t want to depend on it).
If all you’re doing is building a queue, then I don’t see anything specific to NoSQL that makes it a preferred choice. The speed, reliability, and efficiency will come down more to the configuration of whichever implementation you decide to go with.
There are no visible holes in it, as your requirements list is pretty short :-). Basically, the longer the requirements list, the greater the chance of finding holes when you write your own queue.
In my opinion, using a NoSQL database for this scenario would fit:
- if the requirements do not call for a full-featured queue
- if the app will not have to move from the pull model to a push model (queue vs. pub/sub)
- if the structure of the messages is pretty variable and changes over time
- if the app needs to pull messages based on different criteria
- if reusing the NoSQL database would reduce the number of systems the app depends on
As a side note, I’d (biasedly) encourage you to also take a look at RethinkDB.
I agree with MrFox. Keep in mind that if several threads update the same data in the queue, your database must support true ACID transactions; otherwise you risk the same queue item being processed more than once, as well as duplicates of the same item ending up in the queue.
Unless the total data you post to the queue reaches big-data scale (more than a petabyte), I would advise selecting another type of database, or at least a database that supports true consistency.
A processing queue is better suited to an OLTP-type database, since you are mostly doing inserts and updates rather than anything else.
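The double-processing risk described above can be sketched as follows. This is only an in-memory illustration: a `threading.Lock` stands in for whatever atomic "find a pending item and mark it as taken" operation the database offers (in MongoDB, as far as I know, the single-document `findAndModify` command, which drivers such as pymongo expose as `find_one_and_update`, fills this role). The names `ClaimableQueue`, `claim_next`, and `run_workers` are hypothetical.

```python
import threading
from typing import Optional


class ClaimableQueue:
    """In-memory stand-in for a queue collection with an atomic claim step."""

    def __init__(self, message_ids) -> None:
        self._lock = threading.Lock()
        self._status = {mid: "pending" for mid in message_ids}
        self._owner = {}

    def claim_next(self, worker: str) -> Optional[int]:
        """Find a pending message and mark it as taken in ONE atomic step.

        Doing the read and the write separately (without the lock) would
        let two workers see the same 'pending' item and both process it.
        """
        with self._lock:
            for mid, status in self._status.items():
                if status == "pending":
                    self._status[mid] = "processing"
                    self._owner[mid] = worker
                    return mid
        return None


def run_workers(queue: ClaimableQueue, worker_count: int) -> list:
    """Start several worker threads and return the ids they processed."""
    processed = []
    processed_lock = threading.Lock()

    def work(name: str) -> None:
        while True:
            mid = queue.claim_next(name)
            if mid is None:
                return
            with processed_lock:
                processed.append(mid)

    threads = [
        threading.Thread(target=work, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

With the claim done atomically, each id shows up exactly once in the result no matter how many workers run. Whether a given database gives you an equivalent guarantee, and under which replication settings, is the thing to verify before relying on it.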