The Situation


Help at Home has 85 offices that connect to the main data center in Chicago via an MPLS network for applications and internet connectivity. The data center has a domain controller and telecom applications running on separate physical servers, but all business applications run on VMware, including an SQL-based ERP system and .NET applications. They also have an HP Lefthand Networks iSCSI SAN supporting the VMs, and a SonicWall CDP appliance providing backup.


When it became clear the SonicWall system couldn’t provide the off-site data protection and DR capability they needed, Heidrich started looking at alternatives. Initially, he explored setting up a secondary data center at a remote site. But this meant duplicating the infrastructure in the main data center, something he estimated would cost between $150K and $200K for hardware, software and implementation. And they would have to operate and maintain it themselves.



The Alternatives


Then Heidrich considered looking for a remote site that had a Lefthand Networks SAN to which they could replicate their existing data. Aside from the difficulty in finding a provider with this specific hardware, this solution also meant that, in the event of a disaster, they would only have data restored, not the applications. They wouldn’t be up and running without significant effort on his team’s part to restart application servers, possibly after replacing those servers first. The more he thought about it, the more Heidrich realized that what they really needed was a remote data center to put an ESX server in, so he could replicate his VM image files to it and bring the applications up if the primary data center was inoperable.


Then the light went on. “I thought, why don’t I just find someone to take our VM images and host them? We really didn’t need VM instances running at the second site, except in case of a DR event, and we didn’t need the infrastructure just sitting there in the meantime,” said Heidrich.


The colocation facilities he looked at could support an ESX server, but he didn’t really need a full-time server running as a hot standby. Plus, he would have had to do the entire project himself, since the colos didn’t provide any help with implementation or operations. The cloud storage providers he looked at didn’t really offer a complete solution either, as he needed more than just storage capacity.



The Cloud Solution


Then Heidrich found iland through an internet search. Their Continuity Cloud solution provided him with storage capacity to hold VM image files and standby servers to host his VMs. He replicates updates to these files daily, using Vizioncore’s vReplicator. The VMs themselves sit idle on the host servers, which dramatically lowers the cost he’s charged by the provider. If and when they have a DR event, they’ll boot from these VMs and run until their main data center is back in operation. When the VMs are active the cost does, of course, go up, but in the meantime they’re not paying for compute power to just sit and wait for such an event. Heidrich estimates this failover could be started within minutes (he can actually initiate the process from his PDA) and take up to an hour or two to complete, which more than meets their internal requirements.


Iland’s team worked with Help at Home remotely for the initial configuration. This included setting up a site-to-site VPN connection and configuring the actual replication jobs between Help at Home’s corporate site and iland’s Cloud. The Continuity Cloud solution was also flexible enough to match the level of service and involvement Heidrich wants to take on, and it can be upgraded at any time. For example, Heidrich is planning to upgrade from a standby VM to a dedicated server to host his mission-critical SQL application. This would enable multiple updates each day from his production SQL databases to the server at iland, providing a quicker return to operations in a DR event as well as a smaller data loss window.
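
To put the data loss window in rough numbers, the short Python sketch below estimates the worst-case window as the interval between replications. It is only an illustration: the function name is hypothetical, the figure of four updates per day is an assumed example rather than Heidrich's actual plan, and the time a replication job takes to complete is ignored.

```python
def worst_case_data_loss_hours(replications_per_day: int) -> float:
    """Return the longest gap, in hours, between evenly spaced replications.

    Simplifying assumption: replications are evenly spaced and complete
    instantly, so the most data that can be lost is one full interval.
    """
    if replications_per_day < 1:
        raise ValueError("at least one replication per day is required")
    return 24 / replications_per_day


# Daily replication, as described for the standby VMs.
daily_window = worst_case_data_loss_hours(1)

# Hypothetical example of "multiple updates each day" on a dedicated server.
four_per_day_window = worst_case_data_loss_hours(4)

print(f"Daily replication: up to {daily_window:.0f} hours of data at risk")
print(f"Four updates per day: up to {four_per_day_window:.0f} hours of data at risk")
```

Under these assumptions, moving from one replication a day to four would cut the worst-case window from 24 hours to 6.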



Was the Cloud Risky?


Heidrich was confident in the security of the iland infrastructure and didn’t face medical industry regulatory restrictions on data storage. But he did bring up a different sort of risk that entered into the evaluation of alternatives in this project: the risk of actually doing a project like this in-house, including the option of outsourcing the infrastructure to a provider that didn’t offer an end-to-end solution.


Taking on a DR project, especially with the IT resources typical of a smaller organization, can bring its own risks. Designing and implementing a system like this can present significant unknowns that derail the project and cause it to take much more time and cost much more than originally estimated. Heidrich concluded that, given the size of his shop, the margin for error was too thin to take on this kind of uncertainty. They simply couldn’t afford to have him pulled away from day-to-day IT operations to shepherd a DR project that ‘went south’.


When the CapEx costs were factored in, along with the ongoing operational costs of the alternatives, outsourcing to the cloud was the only cost-effective option. It also turned out to be the lowest-risk solution, and the one that afforded the most flexibility.

Case Study

Eric Slack, Senior Analyst