The Cloud Backup Problem


The goal of cloud data centers is to improve business agility and financial efficiency. One of the ways they accomplish this is through the heavy use of virtualization. Even if you don't consider yourself a cloud data center, data protection becomes a challenge as soon as you increase your use of server virtualization. The reason is that you are virtually stacking dozens of servers onto a single physical host. Suddenly each host is responsible for protecting 12 or more times the number of systems that it used to. This means that more data is sent per host across the network to the backup server and backup storage devices. While server virtualization does not cause data growth per se, most VMs are stored on shared enterprise disk, so more capacity is now under the responsibility of the data center. In other words, there is more data to protect and fewer points from which that data is being sent.
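To put rough numbers on that consolidation effect, here is a minimal sketch. The VM count and the amount of data per VM are hypothetical figures chosen for illustration, not measurements from any environment.

```python
def hosts_needed(total_vms: int, vms_per_host: int) -> int:
    """Number of physical hosts (backup sources) required, rounded up."""
    return -(-total_vms // vms_per_host)


def backup_load_per_host_gb(vms_per_host: int, data_per_vm_gb: float) -> float:
    """Data one physical host must send for a full backup of all its VMs."""
    return vms_per_host * data_per_vm_gb


# Hypothetical environment: 120 servers holding 50 GB each.
TOTAL_VMS = 120
DATA_PER_VM_GB = 50.0

for density in (1, 12):  # 1 = one server per host, 12 = consolidated
    hosts = hosts_needed(TOTAL_VMS, density)
    load = backup_load_per_host_gb(density, DATA_PER_VM_GB)
    print(f"{density:>2} VMs/host: {hosts} hosts, {load:.0f} GB per host")
```

The total amount of data is the same in both cases. What changes is that it now leaves through a tenth of the network connections, which is why the per-host backup window becomes the pressure point.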



The Choices To Solve The Cloud Backup Problem


In the early stages of virtualization, where VM density is low (fewer than five VMs per host) and the business importance of those VMs is not as critical, protecting data with a traditional model was probably OK: install a backup agent in each VM guest and have it back up as normal. As virtualization moves to the next phase, where VM density begins to reach double digits per host and more business-important applications are virtualized, that model may no longer be sustainable and a change is needed. There are two basic options: extend your existing solution or move to a new solution.


The first option, extend, typically involves adding VMware intelligence to your existing backup software through the use of modules that are VMware aware. These modules typically have the ability to communicate with VMware's vSphere vStorage API to back up data without an agent being installed in each individual guest operating system. The advantage of these solutions is that you can get greater backup scalability and enhanced VMware data protection without having to learn a new backup application. Deduplication targets bring value to these VMware-aware backup technologies because growth in primary capacity means that capacity also has to be stored and managed on the backup systems. When these solutions are combined with disk targets like EMC/Data Domain's deduplication appliances, capacity growth can be kept in check and information can be retained on disk longer. In fact, because of the high level of data redundancy within a virtualized infrastructure, deduplication provides some of its best efficiency ratios in virtualized environments.
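The redundancy argument can be illustrated with a toy example. The sketch below is not how Data Domain or any other appliance is implemented; it assumes simple fixed-size chunks and made-up VM images that share the same operating system blocks, and it only shows why many similar VMs produce a high deduplication ratio.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


def dedup_ratio(images: list[bytes], chunk_size: int = CHUNK_SIZE) -> float:
    """Logical bytes backed up divided by unique bytes actually stored."""
    seen: set[bytes] = set()
    logical = 0
    stored = 0
    for image in images:
        for offset in range(0, len(image), chunk_size):
            chunk = image[offset:offset + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:  # first time this content is seen
                seen.add(digest)
                stored += len(chunk)
    return logical / stored if stored else 1.0


# Ten hypothetical VM images: eight shared "OS" chunks plus one unique
# "application" chunk each.
os_base = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
images = [os_base + bytes([100 + n]) * CHUNK_SIZE for n in range(10)]

print(dedup_ratio(images))  # 90 logical chunks / 18 unique chunks = 5.0
```

The more VMs that are built from the same template, the larger the shared portion and the higher the ratio, which is the effect described above.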


Another option is to add one of the VMware-specific data protection products to the environment and run it alongside your current backup application. This allows you to use your legacy backup application for protecting the non-virtualized environment and the new solution for the virtualized environment. The challenge is consolidating the backup storage, since you don't want to have to manage two storage areas. Again, a potential solution is to use a deduplication target like Data Domain's that can accept backup jobs from multiple sources. This allows you to leverage capacity optimization across more platforms and still have a single storage area for backups and replication.


Replication is another area where deduplication can shine in a virtualized data protection strategy. Deduplication systems like Data Domain's provide a WAN-optimized replication capability that can handle many of the disaster recovery needs of these environments. While moving to a virtualized environment often means most VM images are now SAN based, using the SAN for replication may be an expensive option for the typically non-mission-critical servers. This is because SAN replication software often requires that the exact same system at the exact same capacity be available at the DR site. Deduplicated systems, by contrast, require only a fraction of the capacity footprint at the DR site. While a recovery step is required to restore the data, it is disk based and should provide an acceptable return-to-operations window for many virtual machines.
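The capacity difference is simple arithmetic. The sketch below uses a hypothetical 100 TB of protected data and an assumed 10:1 deduplication ratio. Actual ratios depend on the data and on how many backup versions are retained.

```python
def dr_capacity_tb(protected_tb: float, dedup_ratio: float = 1.0) -> float:
    """Capacity needed at the DR site for a given amount of protected data.

    A ratio of 1.0 models a like-for-like SAN replica; a higher ratio
    models a deduplicated replica.
    """
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return protected_tb / dedup_ratio


PROTECTED_TB = 100.0          # hypothetical primary capacity
ASSUMED_DEDUP_RATIO = 10.0    # assumption, not a vendor figure

print("SAN replica:         ", dr_capacity_tb(PROTECTED_TB), "TB")
print("Deduplicated replica:", dr_capacity_tb(PROTECTED_TB, ASSUMED_DEDUP_RATIO), "TB")
```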


The other option, which could be done up front or as the next step in virtualization, is to implement a source-side deduplication strategy with a product such as EMC Avamar. This allows redundant data to be eliminated at the source, so it is never sent to or stored on the target device. Once again, these solutions tend to have an agent that communicates with the vSphere vStorage API. The API arguably helps source-side deduplication even more than it helps traditional backup. The only data that needs to be checked for redundancy is data that recently changed, which is exactly what the API identifies for the backup application. This makes the examination process in source-side deduplication more efficient.
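A minimal sketch of that idea follows. It is not Avamar's implementation and it makes no real vStorage API calls: the `changed` list is a hypothetical stand-in for whatever list of changed blocks the API would report, and `server_index` is an assumed set of hashes the backup target already holds.

```python
import hashlib


def blocks_to_send(
    disk: dict[int, bytes],
    changed: list[int],
    server_index: set[bytes],
) -> list[tuple[int, bytes]]:
    """Return the (block number, data) pairs that must cross the network.

    Only blocks reported as changed are hashed, and only those whose
    content the target has not already stored are sent.
    """
    to_send = []
    for block_no in changed:
        data = disk[block_no]
        digest = hashlib.sha256(data).digest()
        if digest not in server_index:
            server_index.add(digest)
            to_send.append((block_no, data))
    return to_send


# Hypothetical four-block virtual disk.
disk = {0: b"A" * 4096, 1: b"B" * 4096, 2: b"C" * 4096, 3: b"A" * 4096}
# The target already holds the "A" content from an earlier backup.
server_index = {hashlib.sha256(b"A" * 4096).digest()}
# Stand-in for the changed-block list: blocks 1 and 3 changed.
changed = [1, 3]

sent = blocks_to_send(disk, changed, server_index)
print([block_no for block_no, _ in sent])  # [1]
```

Two of the four blocks are examined and only one is sent: block 3 changed, but its content is already known to the target. Without the changed-block list, all four blocks would have to be read and hashed on every backup.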


The Avamar system can also leverage the recently announced integration with Data Domain so that the two systems can be used in a single environment. Avamar can use Data Domain storage when source-side deduplication is not appropriate and use its own technology when it is. This again can be ideal when looking to implement a new strategy for the virtualized environment without changing the existing strategy for the legacy environment.



Storage Swiss Take


The excuse for not changing your backup strategy when moving to a virtualized environment used to be that the available products were not robust and that VMware's backup intelligence was lacking. That is no longer the case. The vStorage API provides excellent backup intelligence that most backup applications are now taking advantage of. The choice now is not if you will change your backup strategy but how. Will you extend your backup application's capabilities, extend VMware's capabilities, or use a new method of backup altogether? Each has its advantages. Selecting the right one depends on what your environment looks like and what you expect from your data protection process. As we visit EMC World, we will detail the EMC approach to each of these options.

George Crump, Senior Analyst

Briefing Report

EMC is a client of Storage Switzerland