The problem is that few, if any, disk backup systems have all these capabilities in one unit. In many environments the type of disk system implemented largely depends on the objective for its use at the time. These objectives can change, though, and the result is multiple disk backup targets that have to be managed within the same backup process.

For example, what starts as a need for medium-term retention, handled cost-effectively by a deduplication system because performance is not a big issue, could evolve into a need for high-performance backup of specific mission-critical servers. In this second situation, a separate, high-performance, cost-effective array is often added to the backup infrastructure for that task, but it may lack the deduplication capabilities of the former system. Then power availability could become an issue. Instead of using tape, the decision could be to use MAID (massive array of idle disks) technology for long-term retention of backups. Once again, another system would need to be purchased to store those backups.

Not only do the capabilities of each device vary, but so does the scalability. While some systems can scale by adding nodes, they often come with a higher upfront cost. As a result, the dominant disk backup system is a simple dual-controller disk array with a fixed capacity ceiling, which is essentially the number of drives that can be squeezed into the case. When one of these devices fills up, the options are to lower retention times, add an entirely different system or, most commonly, add another device of the same type. The result is often multiple disk backup systems of the same type, bought repeatedly to meet capacity demands, each of which has to be managed separately within the backup application. Managing all of these targets is further complicated by the fact that multiple backup applications exist in the environment. The whole effort to improve the backup process by implementing disk quickly becomes a ‘nightmare’.

Backup virtualization helps resolve these issues by abstracting the backup application from the multitude of targets that the infrastructure may end up using. When backup virtualization is used to manage the above process, the result is operational simplicity and better utilization of the hardware investment. The various disk targets, as well as any tape targets, are connected to the backup virtualization appliance. The backup applications all point to a single device that appears to be a tape library, albeit a virtual one. How data is stored on the available devices is then controlled from a single point of management, the backup virtualization appliance.

With backup virtualization, all the devices listed in the evolving infrastructure scenario above could be included. For example, the high-speed disk could be used as a caching area that all backups are directed to. As these backups complete, they could be shifted to the deduplication system for medium-term storage. Then, as a backup ages, it could be moved to the MAID system for better power efficiency and finally to an available tape system for long-term retention. Mixing disk and tape provides the added comfort of data being available on two different forms of media. It’s important to note that all of this data movement happens without the involvement of the backup application or the resources of the server it’s running on. The backup virtualization appliance handles and manages all the resulting transfers while keeping the data formats compatible with the individual applications.
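The tiering flow described above can be pictured as a simple placement policy. The following Python sketch is illustrative only: the tier names, the `Backup` record, the `target_tier` and `plan_moves` helpers, and the 30-day and 180-day thresholds are all assumptions made for the example, not settings of any particular backup virtualization appliance. It also treats each step as a move, whereas a real policy might keep copies on both disk and tape.

```python
from dataclasses import dataclass

# Hypothetical age thresholds, in days; real policies would be site-specific.
DEDUP_MAX_AGE_DAYS = 30    # younger completed backups stay on the dedup system
MAID_MAX_AGE_DAYS = 180    # older than this, backups go to tape


@dataclass
class Backup:
    name: str
    complete: bool
    age_days: int
    tier: str = "high_speed_cache"  # every backup lands on the cache first


def target_tier(backup: Backup) -> str:
    """Return the tier this backup should be on under the assumed policy."""
    if not backup.complete:
        return "high_speed_cache"   # still being written by the application
    if backup.age_days < DEDUP_MAX_AGE_DAYS:
        return "dedup_system"       # medium-term retention
    if backup.age_days < MAID_MAX_AGE_DAYS:
        return "maid_system"        # better power efficiency for older data
    return "tape_library"           # long-term retention


def plan_moves(backups):
    """List (name, from_tier, to_tier) for every backup that should migrate."""
    moves = []
    for b in backups:
        target = target_tier(b)
        if target != b.tier:
            moves.append((b.name, b.tier, target))
    return moves
```

In this sketch the backup application never appears: the policy runs entirely on the appliance side, which mirrors the point that data movement does not involve the application or its server.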

Where multiple units of the same type were purchased to meet capacity demands, backup virtualization can automatically handle this challenge as well. The multiple disk backup systems can be connected directly to the backup virtualization appliance, and as one fills up, data can flow to the next available device. When no devices with free capacity remain, another unit can be added and backups can automatically start using it. Meanwhile, the backup application continues to write data to the same backup virtualization appliance, which continues to appear to be a standard tape library.
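One way to picture this spill-over behavior is a pool that presents a single write target and places each backup on the first device with room. The sketch below is a minimal illustration under stated assumptions: the `Device` and `DevicePool` classes, the capacities in gigabytes, and the rule that a backup is never split across devices are all hypothetical, not a description of how any specific appliance allocates space.

```python
class Device:
    """A disk backup system with a fixed capacity ceiling (hypothetical model)."""

    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.used_gb = 0

    def free_gb(self):
        return self.capacity_gb - self.used_gb


class DevicePool:
    """Single write target in front of several devices."""

    def __init__(self, devices=None):
        self.devices = list(devices or [])

    def add_device(self, device):
        # A new unit becomes usable immediately; callers do not change.
        self.devices.append(device)

    def write(self, size_gb):
        """Place a backup on the first device with enough room; return its name."""
        for device in self.devices:
            if device.free_gb() >= size_gb:
                device.used_gb += size_gb
                return device.name
        raise RuntimeError("pool is full; add another device")
```

A short usage example: the caller always calls `pool.write(...)`, and only the pool knows which physical unit received the data.

```python
pool = DevicePool([Device("array-1", 100)])
pool.write(80)                            # lands on array-1
pool.add_device(Device("array-2", 100))   # capacity added behind the pool
pool.write(30)                            # array-1 has only 20 GB free, so this lands on array-2
```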

These two challenges, multiple disk backup systems of the same type and multiple backup systems with different capabilities, are often found within the same data center. Again, backup virtualization can solve the combined problem as easily as it can handle each individual problem. It also addresses the final challenge of determining where tape fits in. Although many tend to look to disk as the primary backup target, tape is still prevalent in most data centers, a situation that looks set to continue. The cost and capacity advantages that tape provides can’t be ignored. Plus, as mentioned earlier, there’s additional comfort in having data protected on two entirely different forms of media.

Backup virtualization brings more to the backup process than just herding together the various options in disk-based backup and addressing the scaling requirement. It lets ‘non-disk friendly’ applications leverage disk to a fuller extent, and it addresses the tape drive backup and recovery performance issues that lead to so many disk backup projects in the first place. No matter how many devices are in the backup process, backup virtualization can simplify operations, increase resource efficiency and reduce hard capital costs.

George Crump, Senior Analyst

Tributary Systems, Inc. is a client of Storage Switzerland