Deployment
Saturday, January 24th, 2009
(Skipping week in review for a long entry today…)
I’m continually amazed at how hard a problem deployment actually is. If you’re going to be deploying any reasonably sized application, you have an endless list of things to worry about:
- Taking the cluster up and down so there is no downtime
- Managing the configuration of individual nodes
- Operating system setup
- Installation of required libraries/3rd party tools
- Managing dev, QA, staging and production deployments
- Schema migration/database updates
- How to do rollbacks
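To make the first and last items a bit more concrete, here is a minimal Python sketch of a rolling upgrade with rollback. Everything in it is hypothetical and only for illustration: the `Node` class, the `rolling_deploy` function, and the `health_check` callback are my own stand-ins, and assigning `node.version` takes the place of whatever your real install step is. It is not how any of the tools mentioned in this post work.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Node:
    """A hypothetical cluster node; `version` is what it currently runs."""
    name: str
    version: str
    in_rotation: bool = True


def rolling_deploy(nodes: List[Node], new_version: str,
                   health_check: Callable[[Node], bool]) -> bool:
    """Upgrade nodes one at a time so only one is ever out of rotation.

    If a node fails its health check, every node touched so far is put
    back on its previous version. Returns True on success, False if the
    deployment was rolled back.
    """
    previous: Dict[str, str] = {}   # node name -> version before upgrade
    upgraded: List[Node] = []
    for node in nodes:
        node.in_rotation = False    # drain: stop sending traffic here
        previous[node.name] = node.version
        node.version = new_version  # stand-in for the real install step
        upgraded.append(node)
        if not health_check(node):
            # Roll back every node we touched, one at a time.
            for done in upgraded:
                done.in_rotation = False
                done.version = previous[done.name]
                done.in_rotation = True
            return False
        node.in_rotation = True     # healthy: put it back in rotation
    return True


if __name__ == "__main__":
    cluster = [Node("a", "1.0"), Node("b", "1.0"), Node("c", "1.0")]
    ok = rolling_deploy(cluster, "1.1", lambda n: True)
    print(ok, [n.version for n in cluster])
```

Even this toy version shows why the problem is hard: it ignores configuration, schema changes, and what happens to in-flight requests while a node drains.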
We’ve done a bit of work with Galaxy to support deployment, which addresses a small subset of these problems. Our NetBoot feature allows you to store your Mule application in a repository and have it downloaded on the fly by any number of nodes. To pick up an application update, you just trigger a restart on the node via JMX.
There are a few other interesting tools out there.
Capistrano: Allows you to create Ruby scripts which automate all the aspects of your deployment. It looks endlessly flexible.
SmartFrog: A system for describing and managing software components. Its goals seem much more ambitious than Capistrano’s. Check out Steve Loughran’s presentation on deploying Hadoop on a cluster (a nontrivial task) with SmartFrog for a good overview. It even comes with a management console! Although it looks complex at first glance, I bet that once you get the hang of it, it can simplify things quite a bit.
Puppet: A declarative language for “automating system administration tasks.” It seems much more oriented toward automating sysadmin tasks than toward actually deploying your application.
(I would love to hear about anyone’s experiences with any of these tools or any of the commercial vendors as well.)
Given the complexity of deployment, PaaS offerings certainly look appealing. No worrying about operating systems, databases, 3rd party libs, configuring individual nodes… Ideally, you would just upload your application and push it out, which seems to be what many large companies are doing: Amazon, Google, Yahoo, and LinkedIn, to name a few.
I hope that we start to see more core infrastructure managed by the infamous cloud people. Just write your app, upload, and tell it where to deploy. Then we can focus on building applications, which is what we really want to do anyway.
On a related note, are there any managed Hadoop instances out there? This could be a very useful part of a hosted application infrastructure. It’ll be interesting to see if Amazon supports something like this as a core service someday.