I recently had a conversation with my friend Steven St Jean about interesting
ways to use deployment slots. If you are unfamiliar with deployment slots,
you can read more about them here. In short, deployment slots are live web
apps with their own hostnames that can be swapped into other slots with zero
downtime. They are a powerful feature that can be used in some interesting
ways. The topic that came up while talking to
Steven was root cause analysis (RCA). RCA is a problem-solving technique used
to identify the root causes of issues so they can be removed from the system.
If you simply get the system back online without performing RCA, all you have
done is guarantee the failure will happen again, because you did not fix
anything.
At Ignite 2015, Brian Harry and I were interviewed by David Tesar on DevOps.
During that interview, we discussed the need to change the Ops culture from
one of getting the system back online as soon as possible after a failure to
one of first ensuring we can do root cause analysis.
The number one most important thing is making sure I
have captured the information that I need to root cause this so that it never
happens again.
Deployment slots allow you to have the best of both worlds. After a swap, the
slot that held the previously staged web app now holds the previous
production web app. If the
changes swapped into the production slot are not as you expected, you can
perform the same swap immediately to get your "last-known good site"
back. By swapping the bad slot with a known
good slot, you get your users back online quickly but can further investigate
the root cause in the other slot.
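
As a rough sketch of that rollback, here is how the swap could be scripted with the Azure CLI from Python. It assumes the `az` CLI is installed and signed in, and that the non-production slot is named `staging`; the resource group and app names are hypothetical placeholders. The point it illustrates is that rolling back is the same swap command as releasing.

```python
import subprocess
from typing import List


def build_swap_command(resource_group: str, app_name: str,
                       source_slot: str = "staging",
                       target_slot: str = "production") -> List[str]:
    """Build the Azure CLI command that swaps two deployment slots."""
    return [
        "az", "webapp", "deployment", "slot", "swap",
        "--resource-group", resource_group,
        "--name", app_name,
        "--slot", source_slot,
        "--target-slot", target_slot,
    ]


def swap_slots(resource_group: str, app_name: str,
               dry_run: bool = True) -> List[str]:
    """Swap staging and production; with dry_run=True only return the command."""
    command = build_swap_command(resource_group, app_name)
    if not dry_run:
        # Requires the Azure CLI on PATH and an authenticated session.
        subprocess.run(command, check=True)
    return command


# Hypothetical names, for illustration only.
release = swap_slots("my-resource-group", "my-web-app")
rollback = swap_slots("my-resource-group", "my-web-app")
# Release and rollback are the same operation: running the swap a second
# time puts the last-known good site back in production and leaves the
# failing build in the staging slot for investigation.
```

Because the failing build ends up in the staging slot rather than being overwritten, its logs and state remain available for RCA after users are back on the good build.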
Deployment slots also allow you to run a series of tests
against the changes in your staging slot before they are swapped into
production, reducing the likelihood of issues in production.
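
A minimal sketch of such a pre-swap check might look like the following. It assumes the slot uses the default `<app>-<slot>.azurewebsites.net` hostname (apps with custom or uniquely generated hostnames would need a different URL), and the app name and the `/health` path are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def slot_url(app_name: str, slot: str, path: str = "/") -> str:
    """Return a slot URL, assuming the default azurewebsites.net hostname."""
    return f"https://{app_name}-{slot}.azurewebsites.net{path}"


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions; report their code.
        # Connection-level errors are left to propagate to the caller.
        return error.code


def failing_paths(app_name: str, slot: str, paths: Iterable[str],
                  fetch: Callable[[str], int] = http_status) -> List[str]:
    """Return the paths on the slot that did not answer with HTTP 200."""
    return [p for p in paths if fetch(slot_url(app_name, slot, p)) != 200]
```

A release script could call `failing_paths("my-web-app", "staging", ["/", "/health"])` and only proceed with the swap when the returned list is empty. The `fetch` parameter exists so the check can be exercised without network access.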
Using deployment slots with your Azure App Service web apps
enables RCA while getting your users back online as soon as possible.