Saturday, March 14, 2009

A dozen things Deployment Scripts must do

Just like the barefoot kids of the local cobbler, I find the installation of many software packages ironic and painful.  Lately I've been talking, a lot, to development managers about getting their deployments scripted, and to business people about why Fast, Easy, Reliable deployments are important to them.

Strangely, there seems to be a real lack of understanding on the part of developers about what a build script should do.  I blame IDEs for allowing developers to mostly ignore how things get built and deployed - but I could be wrong.

Anyway, I've written a list of generic requirements for an automated deployment, and it seems to have helped several folks focus their efforts, so I thought I would share.  I have gotten some feedback that this list is 'pie in the sky', but I reject that while allowing that some of these things are not all that easy.

An automated deployment should:

  1. Start with a single package.
  2. Reliably put the target environment into a known state.
  3. Ensure that required disk space is available.
  4. Save anything needed for rollback.
  5. Place all files where they belong.
  6. Register anything that needs to be registered.
  7. Configure anything that needs to be configured.
  8. Start everything that needs to be started.
  9. Validate the state of the application.
  10. Validate connectivity to all required resources such as databases.
  11. Report success or failure and the name of the artifact you have deployed.
  12. Log all relevant deployment activities.

Starting with a single package is important because it's a lot easier and less error-prone to move one jar file than a jar file, a couple of properties files, a config.xml, etc.  Plus you can give the package a name and store it someplace for later.
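A minimal sketch of the single-package idea in Python, assuming everything to be deployed has already been collected under one staging directory. The function name `build_package` and the `name-version.tar.gz` naming scheme are illustrative assumptions, not a prescribed format.

```python
import tarfile
from pathlib import Path


def build_package(staging_dir, name, version, out_dir):
    """Bundle everything under staging_dir into one named, versioned archive.

    out_dir should live outside staging_dir so the archive does not
    end up containing itself.
    """
    label = f"{name}-{version}"
    out_path = Path(out_dir) / f"{label}.tar.gz"
    with tarfile.open(out_path, "w:gz") as tar:
        # Every member is stored under a top-level "<name>-<version>/" folder.
        tar.add(staging_dir, arcname=label)
    return out_path
```

The version in the file name is what lets you store the package and later say exactly which artifact went out.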

By "Reliably put the target environment in a known state" I mean just that.  If you need to shut down an application server you should also search for and kill any hung processes.  If there are status flags, set them to the correct value; basically clean the kitchen before you start cooking.
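One way to sketch the "kill any hung processes" part in Python, assuming a POSIX system and assuming the application writes its process id to a PID file (both are assumptions; your application server may track its processes differently). The name `clear_stale_process` is hypothetical.

```python
import contextlib
import os
import signal
import time
from pathlib import Path


def _alive(pid):
    """Return True if a process with this pid currently exists (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


def clear_stale_process(pid_file, grace_seconds=5.0):
    """Stop whatever the PID file points at, then remove the PID file.

    Returns False if there was no PID file, True otherwise.
    """
    path = Path(pid_file)
    if not path.exists():
        return False
    pid = int(path.read_text().strip())
    if _alive(pid):
        os.kill(pid, signal.SIGTERM)  # ask politely first
        deadline = time.monotonic() + grace_seconds
        while _alive(pid) and time.monotonic() < deadline:
            time.sleep(0.1)
        if _alive(pid):
            with contextlib.suppress(ProcessLookupError):
                os.kill(pid, signal.SIGKILL)  # it is hung; force it
    path.unlink()
    return True
```

A caveat worth keeping in mind: a stale PID file can point at an unrelated process that reused the id, so a real script would probably also check the process name before killing it.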

Nobody wants to get part way through a deployment just to run out of disk.  Check it first.  This is a no-brainer, but often overlooked.
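The check itself can be a few lines. This sketch uses Python's standard `shutil.disk_usage`; the function name `ensure_disk_space` and the idea of passing the requirement in bytes are assumptions for illustration.

```python
import shutil


def ensure_disk_space(path, required_bytes):
    """Raise before deploying if the filesystem holding path is too full."""
    free = shutil.disk_usage(path).free
    if free < required_bytes:
        raise RuntimeError(
            f"Need {required_bytes} bytes free on {path}, only {free} available"
        )
    return free
```

Remember to budget for the rollback copy as well as the new files when you pick the required size.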

Save your old stuff so you can drop back and punt if needed.  I would urge you to save it into a single package just like the one you're deploying.  I don't think you should rely on going back to the old package as there may be required changes in the current environment.  Those changes should have been the result of a proper deployment, but stuff happens...
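A sketch of saving the current installation as a single rollback package, assuming the whole application lives under one installation directory. The function name and the `label-rollback-timestamp` naming are hypothetical; `shutil.make_archive` is from the Python standard library.

```python
import shutil
import time
from pathlib import Path


def save_for_rollback(install_dir, backup_dir, label):
    """Archive the current install_dir into one timestamped tar.gz.

    backup_dir should live outside install_dir.
    """
    backup_root = Path(backup_dir).resolve()
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base_name = str(backup_root / f"{label}-rollback-{stamp}")
    # make_archive appends the ".tar.gz" suffix and returns the final path.
    archive = shutil.make_archive(base_name, "gztar", root_dir=install_dir)
    return Path(archive)
```

Because this captures what is actually on disk, it also picks up any changes made to the environment since the last package was deployed.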

Place your files.  I usually call this sprinkling - the deal is that your script needs to put every file where it belongs, using an appropriate path reference.  If it is always going to be the same in relation to the root directory, I like fully qualified path names.  Otherwise I like path names relative to the installation directory - hopefully all within the installation directory.
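Sprinkling could be driven by a simple manifest, as in this Python sketch. The manifest format (source path inside the unpacked package mapped to a destination) and the name `place_files` are assumptions. Relative destinations land under the installation directory; fully qualified destinations are used as given.

```python
import shutil
from pathlib import Path


def place_files(package_root, install_dir, manifest):
    """Copy each file named in manifest to where it belongs.

    manifest maps a path relative to package_root to a destination that is
    either relative to install_dir or fully qualified.
    """
    placed = []
    for source_rel, dest in manifest.items():
        source = Path(package_root) / source_rel
        # Joining with an absolute dest yields dest itself, so both
        # styles of path reference work with the same line.
        target = Path(install_dir) / dest
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)
        placed.append(target)
    return placed
```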

Register things that are registered.  I don't know what you might be registering, but get it done here.  Also, start from scratch each time.  If you rely on something already being there, at some point it won't be, so trust nothing. 

Configure things that need to be configured - whatever they are - same as above, start from scratch.  Rather than sending deltas for a config.xml, send the whole (version controlled) file with the contents you want.
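Replacing the whole file, rather than editing it in place, might look like this in Python. The function name `install_config` and the `.new` temporary suffix are assumptions; the copy-then-rename step is one common way to avoid leaving a half-written config behind.

```python
import os
import shutil
from pathlib import Path


def install_config(source, target):
    """Overwrite target with the complete, version-controlled source file."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_name(target.name + ".new")
    shutil.copyfile(source, tmp)
    # os.replace swaps the file in one step, whether or not target existed.
    os.replace(tmp, target)
    return target
```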

Start everything up.  Do this in the script rather than expecting the guy doing the deployment to do it manually.  There's no telling how many hours of otherwise useful human activity have been squandered on troubleshooting just to find that something did not get started when it should have.

Having started everything, go look to see that it is really running.  Seems simple because it is.
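A rough Python sketch of "start it, then check that it stayed up", assuming the service runs as a child process of the deployment script (services managed by an init system would need a different check). The name `start_and_verify` and the settle period are assumptions.

```python
import subprocess
import time


def start_and_verify(command, settle_seconds=2.0):
    """Start command and confirm it is still running after a short wait."""
    proc = subprocess.Popen(command)
    deadline = time.monotonic() + settle_seconds
    while time.monotonic() < deadline:
        # poll() returns None while the process is alive.
        if proc.poll() is not None:
            raise RuntimeError(
                f"{command!r} exited early with code {proc.returncode}"
            )
        time.sleep(0.1)
    return proc
```

A process that is alive is only the first half of the check; the application-level and connectivity validations still need to follow.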

Validate connectivity.  If you need it for production make sure you can talk to it and that it answers.  This does not have to be complex, a simple select statement, ping or whatever makes sense for the end point you need.
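Both kinds of check can stay small. In this Python sketch, `can_connect` opens a TCP connection to an end point, and `database_answers` runs a trivial query over any Python DB-API connection. Both function names are hypothetical, and `SELECT 1` is an assumption that holds for many databases but not all (some need a different trivial query).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def database_answers(connection, query="SELECT 1"):
    """Return True if the database runs a trivial query and returns 1."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        row = cursor.fetchone()
        return row is not None and row[0] == 1
    finally:
        cursor.close()
```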

When something is completed, say so; when something fails, make some noise.  Let the poor guy doing this in the middle of the night know what's going on.  Also, reiterate what version has just been deployed; I have seen people go through a long deployment and troubleshooting session just to find that they deployed the stuff from the previous release.
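One possible shape for this in Python: run the steps in order, announce each one, stop loudly at the first failure, and always name the artifact in the final line. The function name `run_deployment` and the list-of-(name, callable) structure are assumptions for illustration.

```python
import sys


def run_deployment(artifact, steps):
    """Run (name, callable) steps in order; return 0 on success, 1 on failure."""
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Make noise on stderr and stop; later steps are not attempted.
            print(f"FAILED: {artifact} at step '{name}': {exc}", file=sys.stderr)
            return 1
        print(f"ok: {name}")
    print(f"SUCCESS: deployed {artifact}")
    return 0
```

Returning a non-zero code on failure lets whatever invoked the script (a scheduler, a wrapper, a human reading the shell prompt) notice the problem too.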

Build a log as you go.  Tell me when each thing starts and stops and what you did.  Do not dump a bunch of useless noise in the log though.  It should read like a detective's notebook.  
  • 22:17 Copied Blahblah1.2.jar to...
  • 22:20 Change directory to /user/loc...
  • 22:20 Executed jar -xf...
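A log in that notebook style can come from Python's standard `logging` module with a short time-only format. The logger name "deploy" and the function name `make_deploy_logger` are assumptions; call it once per run, since each call attaches another file handler.

```python
import logging


def make_deploy_logger(log_path):
    """Return a logger that writes 'HH:MM message' lines to log_path."""
    logger = logging.getLogger("deploy")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s", datefmt="%H:%M"))
    logger.addHandler(handler)
    return logger
```

A call such as `log.info("Copied %s to %s", jar_name, target_dir)` then produces one readable line per action, which keeps the noise down.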

Above all, put yourself in the shoes of somebody who has to deploy your app along with 20 others, in the middle of the night, while taking calls from people who want status, and trying to grab a bite to eat.

Thanks for reading - Mike


Vaughan said...

I pretty much agree with what you are saying. From an implementation point of view I think that your first bullet needs to be to ensure all prerequisites are met. The biggest pain points I have had with deployments are connectivity needs and credentials to your backends. Second, coding errors.

Packaging everything up into a nice little easily moved bundle is nice on deployment day but can also have its downfalls.

Typically it is the responsibility of the deployer to verify the data and methods being deployed. I have seen deployment scripts log passwords or credentials in plain text to deployment log files, which is against standards (pick one: SOX, PCI, HIPAA, etc.) and certainly against best practice. Bundling/scripting deployments allows laziness on the part of the implementers.

Not the subject of this blog I know. Just the two cents of a key punching monkey system administrator.

Mike Coon said...

Vaughan, the point I am trying to make is not WHO should do what as far as the scripting and deploying but WHAT should be done. In many organizations there is a build team that builds from source code, and also does the deployment scripting. In other places the System Administrators would do the scripting.

What I take issue with is poorly defined and/or error-prone deployments that result in problems unrelated to the actual changes. Stuff like not staging a config file or having the wrong name on a datasource because it is edited each time we go to prod.

We're, theoretically, the smart folks that automate drudgery out of people's lives. We should also make sure the tedious, routine, and error-prone parts of our work are done by computers rather than wasting brain power over and over again that could be better spent on something else.

Scott said...

Pretty accurate list. Sad to say I seldom run into a company that is doing half of them. I would possibly change some ordering around a bit. Like checking whether the new DB connection even exists before pushing files out. Then you can make a decision if you even want to deploy artifacts or not.

I am all for bailing and cleaning up when it goes wrong as early as you identify it is bad so as to mitigate possible collateral damage. Keep things an atomic activity where it makes sense. That way if the DB update/insert fails you don't push code depending on it and start a roll back action.

Generally, chatty deployments make it much easier to troubleshoot where things went wrong.

As the other commenter noted, the password issue is always there, but I think that is more of an application architecture issue than a deployment one. I know it sometimes falls to the deployment, but done "properly" there should be no need anywhere in the code for a cleartext password.

Alex said...

Hi Mike,

AccuRev asks: What's your Automation Index?

More to come...