Disaster Recovery: What to do when the lights go out
What do you do when the power cuts out midway through your first coffee on a random Tuesday morning? If you’re the IT manager, you open up your disaster recovery plan, of course! Because you do have one of those, right? Right? And because you’ve planned ahead, it’s no doubt up to date, printed out on actual paper, and practiced several times. Give yourself a pat on the back for that one.
The truth is that planning for disasters, even a loss of electricity, is hard. However, it isn’t something that you should ignore or try to figure out on the fly. Developing and testing a Disaster Recovery Plan is a critical IT function and should be treated as the foundation of a data backup strategy, rather than an afterthought to one. Frequently we’ll see requests that go like this: “We don’t want to do a full DR plan. Can you just give us a one-page summary of a real one so we can check this box on our audit?” Sure, we can do that…and you can cling to that table of contents while your systems are down for days.
Understanding how your systems operate during times of crisis is as important as understanding how they operate normally. It’s been said that you don’t truly know how your business runs until it doesn’t. Realizing in the middle of a crisis that the only person who knew how to configure your critical app left two years ago, leaving no documentation and no backup, would be a serious blow to the organization. Do the discovery, do the planning and do the testing. Disaster recovery is difficult. Shouldn’t you get to practice it first?
Backups, Backups, Backups
It should go without saying that you need backups. With virtualization firmly entrenched, there is no reason not to have full server images both onsite and off. Companies like Axcient and Datto are offering enterprise-level recovery capability at SMB prices, so there is no excuse for your backups to let you down. And don’t just back up your servers…back up your device configs, your application configs and your VM host configs. Back up everything, and make sure it’s somewhere that you can get to when everything is off.
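One way to keep yourself honest about "back up everything" is a simple inventory that records where each copy lives. The sketch below is only an illustration: the item names, the kinds and the two location labels are hypothetical, and a real inventory would be fed from your backup product rather than typed in by hand.

```python
from dataclasses import dataclass

# Assumption: every item should have one copy onsite and one offsite.
REQUIRED_LOCATIONS = {"onsite", "offsite"}


@dataclass(frozen=True)
class BackupItem:
    name: str             # hypothetical system or device name
    kind: str             # e.g. "server image", "device config"
    locations: frozenset  # where copies currently exist


def missing_copies(items):
    """Return {item name: sorted list of required locations with no copy}."""
    gaps = {}
    for item in items:
        missing = REQUIRED_LOCATIONS - item.locations
        if missing:
            gaps[item.name] = sorted(missing)
    return gaps


# Hypothetical inventory covering servers and the configs around them.
inventory = [
    BackupItem("file-server-01", "server image", frozenset({"onsite", "offsite"})),
    BackupItem("core-switch", "device config", frozenset({"onsite"})),
    BackupItem("vm-host-01", "VM host config", frozenset()),
]

backup_gaps = missing_copies(inventory)
for name, missing in backup_gaps.items():
    print(f"{name}: no copy in {', '.join(missing)}")
```

Anything this prints is something you will not be able to restore from at least one location, which is exactly the kind of surprise you want to find on a quiet day.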
Anatomy of Availability
With that, let’s review the anatomy of availability of an application: the components necessary for your user to click the application icon on their desktop and have it do something. Your overall IT infrastructure philosophy should focus on mitigating failures of these components. Your DR plan, in turn, should detail how to recover from a failure of any or all of them within the time required (Recovery Time Objective) and with no more than the maximum acceptable amount of data loss (Recovery Point Objective).
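To make the two objectives concrete, here is a minimal sketch that compares a single outage scenario against them. All of the numbers are made up for illustration, and the function name `meets_objectives` is hypothetical. Data loss is approximated as the gap between the last good backup and the start of the outage.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, outage_start, estimated_restore, rpo, rto):
    """Compare one outage scenario against its recovery objectives.

    RPO: the work lost (time since the last good backup) must not exceed rpo.
    RTO: the estimated time to restore service must not exceed rto.
    """
    data_loss = outage_start - last_backup
    return {
        "data_loss": data_loss,
        "rpo_met": data_loss <= rpo,
        "rto_met": estimated_restore <= rto,
    }


# Hypothetical scenario: nightly backup at 2:00 AM, power fails at 8:30 AM.
result = meets_objectives(
    last_backup=datetime(2024, 1, 9, 2, 0),
    outage_start=datetime(2024, 1, 9, 8, 30),
    estimated_restore=timedelta(hours=3),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=4),
)
print(result)
```

In this made-up case the three-hour restore fits inside the four-hour RTO, but six and a half hours of work would be lost against a four-hour RPO, so the backup schedule is the thing to fix.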
Each of the following items is a necessary component of availability and a potential point of failure. It’s important to understand the implications of a failure in any one of them and to document the recovery process for each.
- Power – Utility power, server room circuits, UPSes
- Servers – Physical and virtual servers, application tier dependencies
- Storage – Local or SAN storage for servers and data
- Network – Switches, firewalls, routers, appliances, wireless APs, Power over Ethernet
- Internet Service – Web browsing, email, cloud services, SaaS applications, hybrid cloud topologies, VPN
- Name Resolution – Internal and external DNS
- Authentication & Authorization – Active Directory, identity management systems, two-factor authentication
- Licensing – License managers, USB key fobs, MAC address requirements
- Workload Specific Configurations – Application deployment variations, bugs, human error
- End User Systems – Power, network, Internet, portability, remote access capability
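The list above can double as a checklist for the plan itself. The sketch below tracks, for each component, whether a recovery procedure is written down and when it was last tested. The file paths, the dates and the 365-day testing window are all assumptions chosen for the example, not recommendations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Component:
    name: str
    runbook: Optional[str]       # hypothetical path to the documented recovery procedure
    last_tested: Optional[date]  # date of the most recent recovery test, if any


def readiness_gaps(components, today, max_age_days=365):
    """List (component, problem) pairs for undocumented or untested recoveries."""
    gaps = []
    for c in components:
        if c.runbook is None:
            gaps.append((c.name, "no documented recovery procedure"))
        if c.last_tested is None:
            gaps.append((c.name, "recovery never tested"))
        elif (today - c.last_tested).days > max_age_days:
            gaps.append((c.name, "recovery test is out of date"))
    return gaps


# Hypothetical entries for three of the components listed above.
plan = [
    Component("Power", "runbooks/power.txt", date(2023, 11, 1)),
    Component("Name Resolution", None, None),
    Component("Licensing", "runbooks/licensing.txt", date(2021, 5, 1)),
]

plan_gaps = readiness_gaps(plan, today=date(2024, 1, 9))
for name, problem in plan_gaps:
    print(f"{name}: {problem}")
```

An empty result does not prove the plan works, but a non-empty one tells you where discovery, documentation or testing still has to happen.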
Disaster recovery planning is no doubt a difficult challenge, but as with most difficult things, the reward for completing it is well worth the effort. By analyzing each of these components in your infrastructure design and documenting them in your DR plan, you will get not only a rare look at how your systems operate but also a true playbook for keeping them running.
RSM has consultants that specialize in BCP and DR planning as well as IT Outsourcing. For more information on RSM’s offerings please check out our website. You can also contact RSM’s technology consulting professionals at 800.274.3978 or email us.