Archive for January, 2009

Deployment

Saturday, January 24th, 2009

(Skipping week in review for a long entry today…)

I’m continually amazed at how hard a problem deployment actually is. If you’re going to be deploying any reasonably sized application, you have an endless list of things to worry about:

  • Taking the cluster up and down so there is no downtime
  • Managing the configuration of individual nodes
  • Operating system setup
  • Installation of required libraries/3rd party tools
  • Managing dev, QA, staging and production deployments
  • Schema migration/database updates
  • How to do rollbacks
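
To make the first and last items concrete: the usual way to cycle a cluster without downtime is a rolling restart, where you pull one node out of rotation, update it, health check it, and put it back before touching the next one. Here is a minimal sketch in Python. The `Node` class, `rolling_deploy`, and `fake_deploy` are all hypothetical names invented for illustration; none of this is taken from Galaxy or any of the tools mentioned below.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical cluster node; a real one would wrap SSH, JMX, or an agent."""
    name: str
    version: str
    in_rotation: bool = True
    healthy: bool = True


def rolling_deploy(nodes: List[Node], new_version: str,
                   deploy: Callable[[Node, str], None]) -> List[str]:
    """Update nodes one at a time so the others keep serving traffic.

    Returns the names of the nodes that were updated. If a node fails its
    health check, that node is rolled back and the rollout stops.
    """
    updated = []
    for node in nodes:
        previous = node.version
        node.in_rotation = False           # drain: stop sending traffic here
        deploy(node, new_version)
        if not node.healthy:
            deploy(node, previous)         # roll back this node only
            node.in_rotation = node.healthy
            break                          # leave the remaining nodes alone
        node.in_rotation = True
        updated.append(node.name)
    return updated


def fake_deploy(node: Node, version: str) -> None:
    """Stand-in for the real work (copy artifacts, restart the process)."""
    node.version = version
    # Assumption for the demo: versions ending in '-broken' fail health checks.
    node.healthy = not version.endswith("-broken")


nodes = [Node("node-a", "1.0"), Node("node-b", "1.0")]
print(rolling_deploy(nodes, "1.1", fake_deploy))
```

Even this toy version shows why rollbacks are their own bullet point: stopping at the first bad node is easy, but deciding what to do with nodes that were already updated is a policy question the sketch deliberately leaves open.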

We’ve done a bit of work with Galaxy to support deployment which addresses a small subset of these problems. Our NetBoot feature allows you to store your Mule application in a repository and have it downloaded on the fly from any number of nodes. You just trigger a restart on the node to get the application update via JMX.

There are a few other interesting tools out there.

Capistrano: Allows you to create Ruby scripts which automate all the aspects of your deployment. It looks endlessly flexible.

SmartFrog: A system for describing and managing software components. Its goals seem much more ambitious than Capistrano’s. Check out Steve Loughran’s presentation on deploying Hadoop on a cluster (a nontrivial task) with SmartFrog for a good overview. It even comes with a management console! Although it looks complex at first glance, I bet that once you get the hang of it, it can simplify things quite a bit.

Puppet: A declarative language for “automating system administration tasks.” This seems much more oriented toward automating sys admin tasks than toward actually deploying your application.

(I would love to hear about anyone’s experiences with any of these tools or any of the commercial vendors as well.)

Given the complexity of deployment, PaaS offerings certainly look appealing. No worrying about operating systems, databases, 3rd party libs, configuring individual nodes… Ideally you would just upload your application and push it out, which seems to be what many large companies are doing: Amazon, Google, Yahoo, and LinkedIn, to name a few.

I hope that we start to see more core infrastructure managed by the infamous cloud people. Just write your app, upload, and tell it where to deploy. Then we can focus on building applications, which is what we really want to do anyway.

On a related note, are there any managed Hadoop instances out there? This could be a very useful part of a hosted application infrastructure. It’ll be interesting to see if Amazon supports something like this as a core service someday.

Things I miss when you don’t use Maven or publish to the repo

Sunday, January 18th, 2009

  1. Being able to depend on a project without having to download, unzip, locate jars, locate dependencies, and add them all to my classpath
  2. Automatically getting the sources attached to my IDE once I depend on it (meaning I don’t have to check it out from SVN…)
  3. Automatically getting the javadocs attached to my IDE once I depend on it (meaning I don’t have to check it out from SVN…)
  4. Knowing exactly how to set up a project in the IDE once I download its sources
  5. Standard commands to build and test and deploy
  6. Modularization. This isn’t Maven specific per se, but on the whole, people who use Ant don’t modularize their build and end up with one ugly amalgamation of source code.

I’ll also add that it seems very duplicitous to pull down dependencies from the Maven repository via Ivy and not publish dependencies/POMs to the Maven repository.

This entry was inspired by some tinkering with Hadoop (and subprojects), for those who were wondering…

Week In Review: SOA, Scalability, Twitterpated

Friday, January 16th, 2009

Seeing that I’m not inclined to write a whole series of entries, I’m going to give a new format a go: summarizing some random thoughts from the week on different topics in one blog entry. Basically I’m lazy and this makes me feel better about writing short snippets because I can combine them and make it one long post. Don’t expect much value add.

Twitter

Seems I’ve given in and become an active Twitter user. My ID is dandiep. The conclusion so far is that people seem to enjoy drinking and tweeting about it. I’m drinking an Ale to the Chief right now, so I suppose I should go let the world know…

Scalability

I’ve been looking at some of the cool stuff coming out of companies that have big scalability requirements. The first project that I was pointed to recently is Cassandra. I’m guessing I’m way behind the times because I had never heard of it. It’s the P2P storage platform that they use at Facebook. Will someone please add some friggin docs to this thing?

The second comes via Geir. It’s an Amazon Dynamo implementation called Voldemort. It was developed by LinkedIn (primarily/all by Jay Kreps?) and appears to be pretty simple to use. On the flip side it is very simple with just keys and values. Cassandra definitely goes beyond that and allows you to define schemas and indexes of sorts.

Are people using any tools to help them deal with the added code complexity of these non relational databases? Is there an opportunity for a framework to help deal with constraints, eventual consistency, etc on top?
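
To show what I mean by added code complexity: with a store that only offers get and put by key, something a relational database gives you for free, like looking up users by city, has to be maintained by hand in application code. Here is a rough sketch in Python, using an in-memory dict as a stand-in for the store. The `KeyValueStore` class and the key naming scheme are hypothetical and are not Voldemort’s or Cassandra’s API.

```python
from typing import Any, Dict, List, Optional


class KeyValueStore:
    """In-memory stand-in for a store that only supports get/put by key."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value


def save_user(store: KeyValueStore, user_id: str, name: str, city: str) -> None:
    """Write the user record and keep a hand-rolled city index in step."""
    old = store.get(f"user:{user_id}")
    if old is not None and old["city"] != city:
        # The user moved: remove them from the old city's index entry.
        old_key = f"users_by_city:{old['city']}"
        remaining = [u for u in (store.get(old_key) or []) if u != user_id]
        store.put(old_key, remaining)

    store.put(f"user:{user_id}", {"name": name, "city": city})

    index_key = f"users_by_city:{city}"
    members = store.get(index_key) or []
    if user_id not in members:
        store.put(index_key, members + [user_id])


def users_in_city(store: KeyValueStore, city: str) -> List[str]:
    """Read the index entry; an unknown city yields an empty list."""
    return list(store.get(f"users_by_city:{city}") or [])


store = KeyValueStore()
save_user(store, "1", "Ann", "Oslo")
save_user(store, "1", "Ann", "Rome")
print(users_in_city(store, "Rome"))
```

Each `put` here is a separate write. Against a real distributed store, a crash or a concurrent writer between those writes could leave the index stale, which is exactly the kind of bookkeeping I would want a framework to take off my hands.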

Mule

Did you know Mule has a blog? Of course other people blog about us too. Jerome from Noelios just blogged about our Restlet connector. Restlet is pretty cool and I think the URI template routing inside Mule can come in pretty handy…
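
For anyone who hasn’t run into URI template routing: the idea is that you declare a pattern such as `/orders/{id}` and matching requests get dispatched with the variables already extracted. Below is a small illustration of the concept in Python. It is not Restlet’s or Mule’s actual API; the `Router` class, `compile_template`, and the handlers are made up for this sketch.

```python
import re
from typing import Callable, List, Optional, Pattern, Tuple

Handler = Callable[..., str]


def compile_template(template: str) -> Pattern[str]:
    """Turn '/orders/{id}' into a regex with one named group per variable."""
    # re.split with a capture group alternates literal text and variable names.
    parts = re.split(r"\{([A-Za-z_]\w*)\}", template)
    regex = "".join(
        f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


class Router:
    """Dispatches a path to the first registered template that matches it."""

    def __init__(self) -> None:
        self._routes: List[Tuple[Pattern[str], Handler]] = []

    def add(self, template: str, handler: Handler) -> None:
        self._routes.append((compile_template(template), handler))

    def dispatch(self, path: str) -> Optional[str]:
        """Call the matching handler with the extracted variables, or return None."""
        for pattern, handler in self._routes:
            match = pattern.match(path)
            if match:
                return handler(**match.groupdict())
        return None


router = Router()
router.add("/orders/{id}", lambda id: f"order {id}")
router.add("/users/{user}/orders/{id}", lambda user, id: f"{user}'s order {id}")
print(router.dispatch("/users/dan/orders/7"))
```

In this sketch each variable matches a single path segment, so `/orders/42/extra` does not match `/orders/{id}`. Real implementations typically offer more control over what a variable can match.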

SOA

Is it dead? How can something be dead when I wasn’t sure what it was to begin with?

I would prefer to kill the term. I would group part of what people refer to as SOA under systems design: thinking about a problem holistically. This is just part of good software engineering. Take, for instance, the guy who has 6 services which do the same thing for different parts of the company. Investing to consolidate these services will pay off. Investment and return. ROI. Why do we need to call this SOA?

Also, you cannot separate this from the technology. How am I going to achieve a shared service which needs to be monstrously scalable? The outcome of this decision probably has big effects on your process.