The Impact Of Cloud Data Centers On Backup
Storage Switzerland will be at EMC World next week, and I can assure you that one of the topics you can expect to hear a lot about is cloud. Trust me, everything will be cloudy. One of the subjects that we expect to get into is the often overlooked area of how to protect the virtualized data center. The Google Gmail data loss a few weeks ago made it clear that no matter how good cloud storage infrastructures get at redundancy and replication, there is always going to be a need for a cold, point-in-time copy of data. The challenge is that cloud or heavily virtualized data centers are changing the backup paradigm, and backups as usual just won't cut it anymore.
Friday, May 6, 2011
The Cloud Backup Problem
The goal of cloud data centers is to improve business agility and financial efficiency. One of the ways they accomplish this is through the heavy use of virtualization. Even if you don't consider yourself a cloud data center but are increasing your use of server virtualization, data protection becomes a challenge. The reason is that you are virtually stacking dozens of servers onto a single physical host. Suddenly each host is responsible for protecting 12 or more times the number of systems that it used to. This means that more data is sent per host across the network to the backup server and backup storage devices. While server virtualization does not cause data growth per se, most VMs are stored on shared enterprise disk, so more capacity is now the responsibility of the data center. In other words, there is more data to protect and fewer points from which that data is sent.
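To make the consolidation effect concrete, here is a rough back-of-the-envelope sketch in Python. Only the 12-to-1 consolidation figure comes from the text above; the data size per server, the link speed and the efficiency factor are hypothetical values chosen for illustration, not measurements.

```python
def backup_hours(data_gb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Hours needed to push data_gb through one network link.

    efficiency is an assumed fraction of the raw link speed that is usable.
    """
    usable_gb_per_s = link_gbps / 8 * efficiency  # gigabits/s -> gigabytes/s
    return data_gb / usable_gb_per_s / 3600


DATA_PER_SERVER_GB = 200  # hypothetical amount of data per server
LINK_GBPS = 1.0           # hypothetical single 1 GbE connection per host
VMS_PER_HOST = 12         # consolidation ratio mentioned in the article

standalone = backup_hours(DATA_PER_SERVER_GB, LINK_GBPS)
consolidated = backup_hours(DATA_PER_SERVER_GB * VMS_PER_HOST, LINK_GBPS)

print(f"one physical server: {standalone:.1f} h")
print(f"one host with {VMS_PER_HOST} VMs: {consolidated:.1f} h")
```

The absolute numbers matter less than the ratio: with the same network connection, the time needed to move a full backup off the host grows in step with the number of VMs stacked on it.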
The Choices To Solve The Cloud Backup Problem
In the early stages of virtualization, where VM density is low (fewer than five VMs per host) and the business importance of those VMs is not as critical, protecting data with a traditional model was probably OK: install a backup agent in each VM guest and have it back up as normal. As virtualization moves to the next phase, where VM density begins to reach double digits per host and more business-critical applications are virtualized, that model may no longer be sustainable and a change is needed. There are two basic options: extend your existing solution or move to a new solution.
The first option, extend, typically involves adding VMware intelligence to your existing backup software through the use of modules that are VMware aware. These modules typically have the ability to communicate with VMware's vSphere vStorage API to back up data without an agent being installed in each individual guest operating system. The advantage of these solutions is that you can get greater backup scalability and enhanced VMware data protection without having to learn a new backup application. Deduplication targets bring value to these VMware-aware backup technologies because the growth in primary capacity means that capacity also has to be stored and managed on the backup systems. When these solutions are combined with disk targets like EMC/Data Domain's deduplication appliances, the capacity growth can be kept in check and information can be retained on disk longer. In fact, because of the high level of data redundancy within a virtualized infrastructure, deduplication provides some of its best efficiency ratios in virtualized environments.
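Why redundancy between virtual machines translates into good deduplication ratios can be illustrated with a small sketch. It uses fixed-size chunks and SHA-256 fingerprints purely as an assumption for illustration; it is not a description of how any particular appliance works, and the chunk size, image layout and function names are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for this sketch


def dedup_ratio(images: list[bytes], chunk_size: int = CHUNK_SIZE) -> float:
    """Return logical bytes divided by the bytes of unique chunks stored."""
    unique: dict[str, int] = {}  # fingerprint -> chunk length
    logical = 0
    for image in images:
        for offset in range(0, len(image), chunk_size):
            chunk = image[offset:offset + chunk_size]
            logical += len(chunk)
            unique.setdefault(hashlib.sha256(chunk).hexdigest(), len(chunk))
    stored = sum(unique.values())
    return logical / stored if stored else 1.0


def block(tag: str) -> bytes:
    """Build one chunk-sized block whose content depends only on tag."""
    return hashlib.sha256(tag.encode()).digest() * (CHUNK_SIZE // 32)


# Ten hypothetical VM images: 100 blocks of shared "operating system" data
# followed by 10 blocks that are unique to each VM.
shared_os = b"".join(block(f"os-{i}") for i in range(100))
images = [
    shared_os + b"".join(block(f"vm{v}-data-{i}") for i in range(10))
    for v in range(10)
]

ratio = dedup_ratio(images)
print(f"deduplication ratio: {ratio:.1f}:1")
```

In this contrived layout the ten images contain 1,100 blocks in total but only 200 distinct ones, so the sketch reports a ratio of 5.5:1. Real ratios depend entirely on how much the VMs in a given environment actually have in common.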
Another option is to add one of the VMware-specific data protection products to the environment and run it alongside your current backup application. This allows you to use your legacy backup application for protecting the non-virtualized environment and the new solution for the virtualized environment. The challenge is consolidating the backup storage, since you don't want to have to manage two storage areas. Again, a potential solution is to use a deduplication target like Data Domain's that can accept backup jobs from multiple sources. This allows you to leverage capacity optimization across more platforms and still have a single storage area for backups and replication.
Replication is another area where deduplication can shine in a virtualized data protection strategy. Deduplication systems like Data Domain's provide a WAN-optimized replication capability that will handle much of the disaster recovery needs of these environments. While moving to a virtualized environment often means most VM images are now SAN based, using the SAN for replication may be an expensive option for the typically non-mission-critical servers. This is because SAN replication software often requires that the exact same system at the exact same capacity be available at the DR site. Alternatively, deduplicated systems only require a fraction of the capacity footprint at the DR site, and while a restore step is required to recover the data, it is disk based and should provide an acceptable return-to-operations window for many virtual machines.
The other option, which could be done up front or as the next step in virtualization, is to implement a source side deduplication strategy with a product such as EMC Avamar. This allows redundant data to be eliminated at the source so that it is never sent to or stored on the target device. Once again, these solutions tend to have an agent that will communicate with the vSphere vStorage API. This integration helps source side deduplication perhaps even more than it helps traditional backup. Now the only data that needs to be checked for redundancy is data that recently changed, which is exactly what the API identifies for the backup application. This makes the examination process in source side deduplication more efficient.
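The interaction between a changed-data list and source side deduplication can be sketched roughly as follows. This is an illustration under stated assumptions, not Avamar's implementation or the vStorage API: the function name, the block size, and the plain Python list that stands in for whatever the API reports as changed are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size for this sketch


def backup_changed_blocks(disk: bytes, changed: list[int],
                          target_index: set[str]) -> dict[str, int]:
    """Fingerprint only the blocks reported as changed and "send" the new ones.

    changed stands in for the list of changed blocks supplied by the
    hypervisor; target_index stands in for the fingerprints the backup
    target already holds.
    """
    stats = {"hashed": 0, "sent": 0}
    for block_no in changed:
        data = disk[block_no * BLOCK_SIZE:(block_no + 1) * BLOCK_SIZE]
        fingerprint = hashlib.sha256(data).hexdigest()
        stats["hashed"] += 1
        if fingerprint not in target_index:
            # Adding to the index stands in for sending the block over the network.
            target_index.add(fingerprint)
            stats["sent"] += 1
    return stats


# Hypothetical 1,000-block virtual disk. Blocks 3 and 7 changed since the
# last backup, and both now happen to hold identical content.
disk = bytearray(BLOCK_SIZE * 1000)
disk[3 * BLOCK_SIZE:4 * BLOCK_SIZE] = b"\x01" * BLOCK_SIZE
disk[7 * BLOCK_SIZE:8 * BLOCK_SIZE] = b"\x01" * BLOCK_SIZE

stats = backup_changed_blocks(bytes(disk), changed=[3, 7], target_index=set())
print(stats)
```

Of the 1,000 blocks on the disk, only the two changed blocks are fingerprinted, and only one of them is sent because the second is a duplicate of the first. Without the changed-block list, the source would have to examine every block to reach the same result.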
The Avamar system can also leverage the recently announced integration with Data Domain so that the two systems can be used in a single environment. Avamar can use Data Domain storage when source side deduplication is not appropriate and use its own technology when it is. This again can be ideal when looking to implement a new strategy for the virtualized environment without changing the existing strategy for the legacy environment.
Storage Swiss Take
The excuse for not changing your backup strategy when moving to a virtualized environment used to be that the available products were not robust and that VMware's backup intelligence was lacking. That is no longer the case. The vStorage API provides excellent backup intelligence that most backup applications are now taking advantage of. The choice now is not if you will change your backup strategy but how. Will you extend your backup application's capabilities, extend VMware's capabilities, or use a new method of backup altogether? Each has its advantages. Selecting the right one depends on what your environment looks like and what you are expecting from your data protection process. As we visit EMC World we will detail the EMC approach to each one of these options.
George Crump, Senior Analyst
Briefing Report
EMC is a client of Storage Switzerland