I’m a bit green to web applications although I am in the final phases of developing one for a client. I’m using Django with Gunicorn/Nginx on an AWS m1.medium. The database (MongoDB) is on a separate instance. The client is paranoid about scaling and so I threw together a crude monitoring server which has the ability to spin up new AWS instances, install the app code and load balance (I know, I know, I could have used ELB. I said I was green, and it was fun writing it anyway).
The question is, I don’t really know what metrics I should be aware of. How will I know when my app server is under ‘high load’? CPU? RAM? Request latency? All of the above?
Any guidance in this area would be appreciated.
Load is too high when either the software becomes unstable or users start to notice latency, whichever comes first. To find that point, you need to know your usage pattern and build load tests based on it.
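One way to turn "users start to notice latency" into a number is to check a high percentile of the response times your load test recorded against a budget. This is only a rough sketch: the function names, the 95th percentile, and the 500 ms budget are my own assumptions, so substitute whatever your users actually tolerate.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Multiply before dividing so whole-number percentiles stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_too_high(latencies_ms, budget_ms=500, pct=95):
    """True when the chosen percentile of request latency exceeds the budget.

    budget_ms and pct are hypothetical defaults, not recommended values.
    """
    return percentile(latencies_ms, pct) > budget_ms
```

Run it over the latencies from each load-test step (50 users, 100 users, and so on); the first step where it returns True is roughly where your "high load" begins.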
Pick the bottleneck factor (CPU, RAM, or I/O) and fire up new instances when it reaches 80% of the peak value you measured, so there is headroom to handle spikes. You can also start instances ahead of the busiest times of the day and shut them down ahead of the quietest.
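As a rough sketch of the 80% rule, your monitoring server could average the last few samples of the bottleneck metric and compare that to the peak you found in load testing. The function name and the 40% scale-in threshold are assumptions of mine (the gap between the two thresholds is just there to stop instances flapping on and off), not anything specific to AWS or Django.

```python
from statistics import mean


def scaling_decision(samples, peak, scale_out_ratio=0.8, scale_in_ratio=0.4):
    """Return 'scale_out', 'scale_in', or 'hold' for recent bottleneck-metric samples.

    samples: recent readings of the bottleneck metric (e.g. CPU percent).
    peak: the value at which load tests showed instability or bad latency.
    scale_in_ratio is a hypothetical value; tune it for your own traffic.
    """
    if peak <= 0:
        raise ValueError("peak must be positive")
    if not samples:
        return "hold"  # no data yet, so change nothing
    load = mean(samples) / peak
    if load >= scale_out_ratio:
        return "scale_out"
    if load <= scale_in_ratio:
        return "scale_in"
    return "hold"
```

For example, with a measured peak of 100, `scaling_decision([85, 90, 80], peak=100)` asks for another instance, while readings around 50 to 60 leave things as they are.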