I’ve just started at a new job this past month, and it looks like they have NO source control for their code. They are relying on the backups their hosting provider takes for them.
After talking it over a bit, I convinced my boss that we should definitely be using source control, and after I gave a short seminar on it, the entire team was on board; they loved Mercurial.
So right now this is the way we work:
o----------BitBucket
o---------/
o--------/
Myself and the three other developers hg pull from BitBucket, make our changes, then hg push to BitBucket.
Now for deployment, someone needs to FTP the latest files to the production server.
I was thinking of installing Mercurial on our server and using hg clone (and subsequently hg pull) to keep the version on production up to date.
o---push->-----BitBucket----<-pull-----o (production server)
o---push->----/
o---push->---/
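Concretely, I imagine the server side would look something like this (repository URL and paths are made up):

```
# one-time setup on the production server
hg clone https://bitbucket.org/ourteam/ourapp /var/www/ourapp

# then, for each deployment
cd /var/www/ourapp
hg pull
hg update
```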
Is this a good idea? Any potential pitfalls I may not be seeing? Has anybody here done something similar? How do you deploy a large PHP framework application (we’re using Moodle)?
This is certainly a good idea, and it is a common deployment method. You might want to use a stable branch for deployment while keeping the default branch (trunk) for ongoing development, so that you can test the stable branch before you deploy it to production.
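With Mercurial named branches, that could look something like this (a minimal sketch; the branch name "stable" is just a convention):

```
# open a long-lived branch for releases
hg branch stable
hg commit -m "Open stable branch"
hg push --new-branch        # --new-branch is only needed the first time

# once work on default has been tested, fold it into stable
hg update stable
hg merge default
hg commit -m "Merge tested changes for release"
hg push

# the production server then tracks only the stable branch
hg pull
hg update stable
```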
The only problem comes when you have sensitive information in your code base (such as API keys) that you do not wish to upload to third-party servers (in your case, Bitbucket). In that case, a simple script that runs once you have pulled from the repository and restores the sensitive data to the correct place will solve the issue.
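For example (a rough sketch; all file names and paths are assumptions), the script could copy a credentials file that is kept outside the repository back into place after every update:

```
#!/bin/sh
# post-pull step on the server; paths are examples
cd /var/www/ourapp
hg pull
hg update

# config.php (with real API keys / DB credentials) is NOT in the repo;
# the live copy is kept outside the web tree and restored after each update
cp /etc/ourapp/config.php /var/www/ourapp/config.php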
Mind that this deployment strategy is not atomic: some files may already be updated while others are still in the old state as the application is being hit, which can cause unexpected side effects.
One way to do atomic deployments is to use symlinks: create a directory containing the new files, and when everything is ready, switch a symlink to point at that directory. If you keep the old version around, you can also easily roll back.
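A rough sketch of the symlink approach (directory names are examples):

```
# check out each release into its own directory
hg clone https://bitbucket.org/ourteam/ourapp /var/www/releases/r42

# point the web server's docroot at the new release;
# -n makes ln replace the "current" symlink itself rather than
# creating a link inside the old target directory
ln -sfn /var/www/releases/r42 /var/www/current

# rolling back is just re-pointing the link at the previous release
ln -sfn /var/www/releases/r41 /var/www/current
```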
Another (in my opinion better) possibility: use a build server / continuous integration server.
Short explanation: this is a server (it can be in-house, it doesn’t need to be on the internet) that you set up to monitor your repositories; whenever there are new changesets, the server builds your code (AFAIK this is not necessary in PHP), runs unit tests, and deploys your code to the web server.
For more information, check these links:
- Wikipedia article about CI
- Stack Overflow question: What is the point of a “Build Server”?
There are a lot of different CI products out there, but the only one I’ve used so far is TeamCity. Very easy to set up…in fact, it’s the first one I tried, and I liked it so much that I stuck with it.
Alternative cheap solution:
If setting up a build server is too much effort, or if you’d like more control over exactly when your site is deployed, just set up a script (Batch/PowerShell on Windows, or something similar on Linux/Mac) that pulls the newest version from your repository and FTPs it to the production server.
Basically, it’s the same as a build server, only simpler.
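On Linux, it could be as simple as the following sketch (repository path, server name, and the use of rsync over SSH instead of plain FTP are all assumptions to adapt to your setup):

```
#!/bin/sh
# deploy.sh -- pull the latest code and mirror it to production
set -e

hg pull -u -R /home/build/ourapp    # refresh the local working copy

# mirror the working copy, skipping repository metadata
rsync -az --delete --exclude '.hg' \
    /home/build/ourapp/ user@production:/var/www/ourapp/
```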
No matter how exactly you solve it in the end…be sure to automate it somehow!
You want to be able to deploy with a single click or a single command, so that EVERYBODY can do it without having to know anything special and without making mistakes, even in case of a disaster or under stress.
We do this, or things similar to it. The non-atomicity @johannes mentions is one issue, though in realistic terms it happens so fast it should be OK, and there are ways around it, as he points out.
Probably more important than the non-atomicity is the question of how you manage database schema updates. Rolling back bad code deployed this way is easy; the big issue is when you deploy an update that changes the database and then wish to roll back, or when bad updates corrupt data.
The other issue we had with DVCS tools (as opposed to SVN) is that you now have a copy of the entire codebase history on the machine, somewhere an attacker could potentially grab it. Also, the DVCS repository can get pretty heavy size-wise, which could matter if you are paying for storage and/or backup. We are still using SVN for the final deployment for these reasons.
It is a great idea; however, keep in mind the following:
- Try not to commit on the server (although in some rare cases it makes sense, e.g. when installing a plugin or adding content assets)
- Use a staging server or a secondary repository deployment for testing
- Always be careful that hg update -C doesn’t affect production (i.e. delete important files)
- Have a production branch and a development branch, and only deploy the production branch
- Treat content assets (e.g. images for content) as backed up by the repository, and ignore user data (e.g. attachments/uploads, cache, etc.; see the .hgignore sketch after this list)
- Always have a clean hg status output on the server (this will help you make sure you are ignoring things such as the cache)
- Don’t deploy the repository in the web folder; use symlinks from outside the public space (e.g. ln -s /myrepo/src/web /public_html/myapp)
- Be careful not to version configuration files (especially those containing database passwords or other credentials)
- Don’t use this instead of a production backup; it is a backup of production code, not production data
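As an illustration of the ignore-related points above, a .hgignore on the server might look roughly like this (the exact paths depend on your Moodle setup; these are examples):

```
syntax: glob

# real credentials live outside the repository; only a template is versioned
config.php

# user data and derived files -- never version these
moodledata/*
cache/*
*.log
```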
Finally, I think the most valuable thing about adding a DVCS to your deployment process is the security it adds: sometimes hackers inject malicious code into your files, and you really have no easy means of detecting it without something like version control (especially a distributed one, since the distributed nature of the VCS makes it easier to check the integrity of your files).
I’ve had some sites get hacked a few times, and having Mercurial let me literally undo the hacks by just issuing an hg update -C on the server (of course, you might want to run hg status first and save the affected files for later analysis).
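In practice, the recovery could look roughly like this (paths are examples):

```
cd /var/www/ourapp

# see what the attacker touched
hg status

# keep copies of the modified files for later analysis
mkdir -p /root/incident
hg status -mn | xargs -I{} cp --parents {} /root/incident/

# restore every tracked file to the last committed state
hg update -C

# note: files the attacker *added* show up as '?' in hg status and
# must be removed separately (e.g. with Mercurial's purge extension)
```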