The database must provide fast reads and be able to store petabytes of data.
There will likely be two databases. Both will store every event occurring in the system (millions per day), for both customers and employees. The first should hold summarized data and serve reads on a few predefined attributes. The second should hold the full, detailed data, support flexible ad hoc queries on many different attributes, and provide analytics.
For the first database, I’m initially thinking about Cassandra or Google’s Bigtable.
For the second, Elasticsearch or Google’s BigQuery might be suitable.
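To make the split concrete, here is a rough sketch of how I picture one event feeding both stores. Everything in it is an assumption for illustration: the field names, the choice of summary key (actor type, actor id, day, event type), and the in-memory `Counter` and list, which just stand in for the real databases.

```python
from collections import Counter
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    # All field names here are hypothetical, for illustration only.
    event_id: str
    actor_type: str          # "customer" or "employee"
    actor_id: str
    event_type: str
    occurred_at: datetime
    payload: dict = field(default_factory=dict)


def summary_key(event):
    """Key for the summarized store: only the predefined lookup attributes."""
    day = event.occurred_at.strftime("%Y-%m-%d")
    return (event.actor_type, event.actor_id, day, event.event_type)


def detail_document(event):
    """Full record for the detailed store, kept whole for ad hoc queries."""
    doc = asdict(event)
    doc["occurred_at"] = event.occurred_at.isoformat()
    return doc


def ingest(events):
    """Fan each event out to both stores (plain in-memory stand-ins here)."""
    summary = Counter()   # stand-in for the summarized database
    details = []          # stand-in for the detailed database
    for event in events:
        summary[summary_key(event)] += 1
        details.append(detail_document(event))
    return summary, details


if __name__ == "__main__":
    sample = [
        Event("e1", "customer", "c42", "login",
              datetime(2024, 1, 5, 9, 0, tzinfo=timezone.utc)),
        Event("e2", "customer", "c42", "login",
              datetime(2024, 1, 5, 17, 30, tzinfo=timezone.utc)),
    ]
    summary, details = ingest(sample)
    print(summary)
    print(len(details))
```

The point is that the first store only ever gets read by a key known in advance, while the second keeps the whole record so it can be filtered and aggregated on any field.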
Cost is not an issue.
Could someone more experienced in such areas give more insight?
Thanks in advance 🙂