Using deployment slots for root cause analysis

I recently had a conversation with my friend Steven St Jean about interesting ways to use deployment slots. If you are unfamiliar with deployment slots, you can read more about them here. In short, deployment slots are live web apps with their own hostnames that can be swapped into other slots with zero downtime. They are a very powerful feature and can be used in some interesting ways. The topic that came up while talking to Steven was root cause analysis (RCA), a problem-solving technique used to identify the root causes of issues so they can be removed from the system. If you simply get the system back online without performing RCA, you have all but guaranteed the failure will happen again, because you did not fix anything.

At Ignite 2015, Brian Harry and I were interviewed by David Tesar on DevOps. During that interview, we discussed the need to change the culture of Ops from wanting to get the system back online as soon as possible after a failure to first making sure we can do root cause analysis.

The number one most important thing is making sure I have captured the information that I need to root cause this so that it never happens again.

Deployment slots allow you to have the best of both worlds. After a swap, the slot that previously held the staged web app now holds the previous production web app. If the changes swapped into the production slot do not behave as you expected, you can perform the same swap again immediately to get your "last-known good site" back. By swapping the bad slot with a known good slot, you get your users back online quickly and can still investigate the root cause in the slot that now holds the failing version.
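As a rough sketch of what that "swap back" looks like in practice, the snippet below builds the Azure CLI command that swaps two slots. The resource group name, app name, and the helper function `build_swap_command` are placeholders I made up for illustration; the slot names assume a slot called "staging" alongside the default production slot. Running the same command a second time reverses the swap.

```python
import shlex


def build_swap_command(resource_group, app_name,
                       source_slot="staging", target_slot="production"):
    """Return the argument list for an Azure CLI slot swap.

    The names passed in are hypothetical; substitute your own
    resource group, web app, and slot names.
    """
    return [
        "az", "webapp", "deployment", "slot", "swap",
        "--resource-group", resource_group,
        "--name", app_name,
        "--slot", source_slot,
        "--target-slot", target_slot,
    ]


if __name__ == "__main__":
    # "my-rg" and "my-app" are placeholder names, not real resources.
    command = build_swap_command("my-rg", "my-app")
    print(shlex.join(command))
```

To actually perform the swap you could pass the list to `subprocess.run(command, check=True)` on a machine where the Azure CLI is installed and logged in. After the swap back, the failing version sits in the staging slot, where you can inspect it without affecting users.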

Deployment slots also allow you to run a series of tests against the changes in your staging slot before they are swapped into production, reducing the likelihood of issues in production.
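A minimal sketch of such a pre-swap check might look like the following. It assumes the slot is reachable at the common default hostname pattern of app name, a hyphen, then the slot name under azurewebsites.net (your app may use a different or custom hostname), and that a plain HTTP 200 from the given path is enough to call the slot healthy. The function names `slot_url` and `is_healthy` are my own, not part of any Azure SDK.

```python
from urllib.request import urlopen


def slot_url(app_name, slot, path="/"):
    """Build the URL for a slot, assuming the default hostname pattern."""
    return f"https://{app_name}-{slot}.azurewebsites.net{path}"


def is_healthy(url, fetch=urlopen, timeout=10):
    """Return True only if the URL answers with HTTP 200.

    `fetch` is injectable so the check can be exercised without a network.
    Connection failures and HTTP error responses both count as unhealthy.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


if __name__ == "__main__":
    # "my-app" is a placeholder app name.
    url = slot_url("my-app", "staging")
    print(url, "healthy" if is_healthy(url) else "unhealthy")
```

A real test suite would go well beyond a single request, but even a gate this small stops you from swapping a slot that does not respond at all.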

Using deployment slots with your Azure App Service web apps enables RCA while getting your users back online as soon as possible.