I am trying to run a large number of sites which share about 90% of their code. They are simply designed to query an API and return the results. They will have a common userbase / database but will be configured slightly differently and will have different CSS (perhaps even different templating).
My initial idea was to run them as separate applications with a common library, but I have read about the sites framework, which would allow them to run from a single instance of Django and may help to reduce memory usage.
https://docs.djangoproject.com/en/dev/ref/contrib/sites/
Is the site framework the right approach to a problem like this, and does it have real benefits over running separate applications?
Initially I thought it was, but now I think otherwise. I have heard the following:
Your SITE_ID is set in settings.py, so in order to have multiple
sites, you need multiple settings.py configurations, which means
multiple distinct processes/instances. You can of course share the
code base between them, but each site will need a dedicated worker /
WSGIDaemon to serve the site.
This effectively removes any benefit of running multiple sites under one roof if each site needs its own uWSGI instance.
Alternative systems I have come across:
- https://github.com/iivvoo/django_layers
- https://github.com/shestera/django-multisite
I don’t know which route to take with this.
Why are you optimizing?
You seem to be concerned about memory usage. Why? Are the projected savings large enough to let you rent a cheaper VM? Is the sites’ current usage threatening to exceed your machine’s current RAM? (And even if it is, are you confident that you’re going to spend less on development time for this feature than you would on just getting a better machine?)
Let’s assume you already asked yourself that…
And the unfortunate conclusion was that there is a compelling reason to optimize memory usage.
You are talking about putting in work and code infrastructure to link these sites into a single, optimized unit. If these separate sites are inextricably linked to each other in some way other than sharing the same code base, it makes sense to build physical infrastructure on top of that logical grouping. If these are functionally separate sites that just happen to use mostly the same codebase, I would advise against even thinking about a framework like this unless you’ve got no other choice.
What if one of the sites later wants to pick up a Windows dependency (assuming the others are running on *nix right now)? What if one of the sites becomes so popular that it needs to be on a separate machine for performance reasons? What if the customer wants to relocate one of the sites for better response times in Europe?
Without any other details about what you are working on, creating an architecture that ties these sites together sounds like a pretty questionable endeavor.
I suggest putting your code into git, then using submodules to load your ‘library’ of shared code.
If you read up on it, this sounds like exactly what you need:
http://git-scm.com/book/en/Git-Tools-Submodules
It often happens that while working on one project, you need to use another project from within it. Perhaps it’s a library that a third party developed or that you’re developing separately and using in multiple parent projects.
We use this extensively, and it has worked well for our organization, which does hundreds of deployments weekly of the same code with very minor changes to each version that gets deployed.
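A minimal sketch of what "the same code with very minor changes" could look like in Python, assuming the shared library exposes a settings object that each site overrides. All names here (`SiteSettings`, `settings_for`, the field names and the URL) are hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SiteSettings:
    """Settings defined once in the shared library (hypothetical fields)."""
    api_base_url: str = "https://api.example.com"
    stylesheet: str = "base.css"
    template_dir: str = "templates/default"
    results_per_page: int = 20


# Defaults that every deployment inherits from the shared code.
SHARED_DEFAULTS = SiteSettings()


def settings_for(**overrides) -> SiteSettings:
    """Return the shared defaults with one site's small differences applied.

    An unknown field name raises TypeError, so typos in a site's
    overrides fail loudly instead of being silently ignored.
    """
    return replace(SHARED_DEFAULTS, **overrides)


# Each site's own repository would hold only its overrides.
site_a = settings_for(stylesheet="site_a.css")
site_b = settings_for(stylesheet="site_b.css", template_dir="templates/site_b")
```

With a layout like this, the submodule carries `SiteSettings` and the rest of the shared code, while each parent project stays a thin wrapper around its own call to `settings_for`.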
My 2 cents.
A single Django instance can help you here. You would point all of your sites at that one instance, and it would call the API on their behalf. That can save you money through economies of scale: lower VM costs, lower API subscription costs, and so on.
In this context, I am assuming that you are not using limited free tiers, or you do not have complicated SLAs with your service providers (like the fair use of resources). Before redirecting all your sites into one Django instance (which will work on one server, and will use one API), please check your agreements that you do not have penalties or upper thresholds.
One other aspect you need to consider is the user profiles and separate workloads of your sites. If the overall workload becomes too much, you will eventually need a load balancer, which will increase your number of instances. On the other hand, since the same implementation sits behind every site, development of the software will be easier.
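As a rough illustration of one process serving several sites, here is a plain WSGI sketch that picks a site's configuration from the request's Host header. It uses only the standard library rather than Django, and the host names and the `SITES` table are assumptions made up for the example.

```python
# Hypothetical per-site configuration, keyed by host name.
SITES = {
    "site-a.example.com": {"name": "Site A", "stylesheet": "site_a.css"},
    "site-b.example.com": {"name": "Site B", "stylesheet": "site_b.css"},
}


def resolve_site(host):
    """Look up the site config for a Host header value.

    The port is stripped with a simple split, so IPv6 literals
    are not handled in this sketch.
    """
    hostname = host.split(":")[0].lower()
    return SITES.get(hostname)


def application(environ, start_response):
    """One WSGI app for all sites; only the configuration differs per request."""
    site = resolve_site(environ.get("HTTP_HOST", ""))
    if site is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown site"]
    body = f"{site['name']} using {site['stylesheet']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```

Every site then shares the same workers, which is where the savings come from, but it also means a traffic spike on one site affects all of the others.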
For a precise answer, you need to investigate metrics, check your available resources and focus on the business case (the drive behind your optimization needs). Unfortunately, this is more art than science.
Umut.
My suggestion would be to use nginx, uWSGI, and Flask. nginx is robust and works well with uWSGI in Emperor mode. Beyond that, Flask is very flexible for Python developers, so you should be able to quickly make the modifications necessary to take advantage of your shared code base.
This blog entry should get you started.