We are starting rewritting our apps (Internet portal, millions of unique users and few CRM/ERP systems, few hundred users) and we have a huge decision to make now. We are going to write them mostly (90-95%) in Symfony2
with Doctrine
, and some background services (e.g. mailing) in Java
. Database – MySql
/MariaDb
. also lot of additional technologies (redis/memcached
, load balancing
, varnish
, replication
and so on). Most important (in this case) are – symfony2
, mysql/maria
and doctrine
.
The thing is – it will be best for few systems to work on same tables. For example: internet portal with job offers + CRM system for managing clients that pay for posting those offers (there are many similar cases). Also having functionality of one login for our users between every system is important.
I see two approaches here:
- We have one big database, having few hundred of tables. I used to work with 200+ tables, but not with that high amount of traffic. So if traffic goes up, sharding/partitioning will be involved.
- We have many databases, each for one app. If there is need for one app to communicate with another
DB
, we will write special services to deliver that functionality.
Now, what I’m worried about:
1. ease of development – it’s easier to just have one database
2. ease of configuration/assuring redudancy
3. performance
One thing I know now that database will be hosted on three machines with master-slave
replication, which is supported nicely by Doctrine
.
What are your thoughts? What are your pro’s and con’s? Thanks!
1
- 1 Database, mirrored and clustered and partitioned as required and serving stored procedures as an API to:
- A couple of App servers, that talk to the the DB and provide a domain-neutral API to:
- Many web servers that clients interact with.
This is the way to achieve scalability. You offset some processing to the tiers, so web servers handle all the presentation of the html pages, given data that has been retrieved from the DB and combined or otherwise massaged into shape by the application tiers. Generally this scales well horizontally (ie need more scale, add more web or app servers).
The load on the DB can be reduced with this architecture – the DB solely serves raw data to the app tiers. This means the app tier can provide caching of some data to further reduce the load on the DB, as there’s only a few of them you don’t need a full-on caching mechanism to keep the caches in sync.
You can also introduce separate DBs if the data is easily partitioned and the clients wouldn’t even know you’ve done it as they talk to the app tier. This is probably the biggest win, you can change things without re-deploying everything.
4