Large scale recovery - results
Last Updated: 2008-02-02 01:18:38 UTC
by Mark Hofman (Version: 1)
A little while ago I asked what people do about large scale recovery (http://isc.sans.org/diary.html?storyid=3861) when a large number of machines on the network need to be rebuilt. It has taken a bit longer than intended to collate the results, mainly due to a small scale recovery issue on my laptop.
Firstly, thank you all for contributing. The answers were interesting and if there was a prize to give it would go to Dave, whose response was “to QUIT and move to Antigua”.
A number of people are using recovery images on a “hidden” partition on the device itself. Some have these password protected to prevent the “oops” re-image factor. In the event of an issue the user can be talked through a recovery, follow a documented procedure or, as some of you are already doing, run a scripted recovery.
A number of you are using bare metal recovery options to push/pull images to/from workstations, using multicast to reduce the load on the network.
The simplest solution provided was to send DVDs around with the image, but that takes some time.
The biggest issues everyone had were keeping the image up to date, dealing with a variety of hardware, and supporting road warriors. We can probably add to that the loss of data that was stored locally on the machines.
For images, the more common solution was to update the image regularly and, if it is stored on the machine itself, push the new image or just the updates down to the workstation. Others provide updates through scripting or tools, which works on clean networks, but it might be a race against time if a worm or other malware is still running loose on the network.
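As a rough illustration of how a machine could tell whether its local recovery image is stale, the sketch below compares the image's SHA-256 hash with an expected value. This is not something any respondent described; the function names and the idea of publishing an expected hash are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_current(local_image: Path, expected_sha256: str) -> bool:
    """Return True only if the local image exists and matches the expected hash.

    expected_sha256 is assumed to come from wherever the current image
    is published (hypothetical; not specified in the diary).
    """
    if not local_image.is_file():
        return False
    return sha256_of(local_image) == expected_sha256.strip().lower()
```

A machine that fails this check would be a candidate for having the new image, or just the updates, pushed down to it.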
When dealing with different flavours of hardware, most of you opted for standardising on specific models for laptops and desktops. Road warriors, however, were a little more difficult to deal with, especially those that rarely make an appearance in the office. The combination of standardisation of hardware, local recovery images and a patch/image update mechanism seems to be the favourite. Some of you utilise locked down virtual images on road warrior machines, so provided the VM starts, the road warrior can still work.
The data issue was dealt with reasonably easily by using AD, Zen or other mechanisms for enforcing policies that ensure data is stored on the network (e.g. redirecting My Documents to the network). Again, road warriors were an issue as they often have no choice but to store info locally. An automated process to back up files when connected to the network seems to be the go.
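A minimal sketch of such an automated backup, assuming the network location is visible as an ordinary directory path (a mapped drive or mount point) and that "connected" simply means that path exists. The function names are made up for this example, and no particular product mentioned by respondents works this way as far as I know.

```python
import shutil
from pathlib import Path
from typing import List


def share_is_reachable(share: Path) -> bool:
    """Assumption: the share counts as reachable if its root is a directory."""
    return share.is_dir()


def backup_changed_files(local_dir: Path, share: Path) -> List[Path]:
    """Copy files that are missing from, or newer than, the copy on the share.

    Returns the relative paths that were copied. Returns an empty list
    when the share is not reachable, e.g. the laptop is on the road.
    """
    if not share_is_reachable(share):
        return []
    copied = []
    for src in sorted(local_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(local_dir)
        dst = share / rel
        # Skip files whose backup copy is at least as recent as the local one.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also copies the modification time
        copied.append(rel)
    return copied
```

Run from a logon script or a scheduled task, something like this does nothing while the road warrior is disconnected and catches up on the next visit to the office. It only copies new or changed files and never deletes anything from the share.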
So that is how many of you would deal with having to rebuild a large number of machines on the network. Thanks for your input.
Mark H - Shearwater