Global Deduplication Array

The GDA is essentially two of Data Domain’s DD880 deduplication systems combined with intelligent software that determines which of the two systems, called controllers in this configuration, should receive the data. This allows the systems to appear as one to the backup software, providing a simple scaling solution. In the past, scaling meant adding separate Data Domain systems, each independently attached to the backup server. This required the backup administrator to work out how best to use the systems and to balance the load between them manually. The GDA software now manages that balancing. Administrators simply aim backup jobs at what appears to be a single target and the Data Domain software figures out the rest, including rebalancing entire backup data sets if need be.

The combined systems provide Data Domain customers with the highest level of scale ever offered in the product line. The GDA scales to a transfer speed of 12.8TB per hour and up to 285TB of capacity (before optimization). Ultimately this delivers petabytes of logical (post-optimization) capacity, positioning the GDA for large enterprise backup environments. It can also support more concurrent write streams and, as a replication target, more remote fan-in connections.

Data Domain Boost Software

One of the key enablers of the GDA is the new DD Boost software that EMC just announced. In addition to managing traffic between the controllers on a GDA, it also provides value to any single-controller Data Domain system that customers are using with Symantec NetBackup and OST or, in the latter half of the year, EMC's NetWorker. DD Boost improves throughput for Data Domain systems by distributing parts of the deduplication process to the backup servers. It can also reduce the load on the backup server and make transfers faster by eliminating much of the redundant data that would otherwise be sent to the Data Domain system.

The DD Boost plug-in is installed on the backup server and generates a hash for each data segment as the backup server processes the inbound data. It then sends that hash to the Data Domain appliance, which determines whether it has seen that segment before. If it has, the Data Domain system tells the plug-in not to send the data. If the hash is new, it requests that the actual data be sent across the network and stores that data.
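The exchange described above can be sketched in a few lines of Python. This is only an illustration of the general hash-then-send idea, not DD Boost's actual interface or algorithm: the `Appliance` class, the `backup` and `restore` functions, the fixed segment size, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


class Appliance:
    """Hypothetical stand-in for the deduplication target.

    It keeps each unique segment once, keyed by its hash.
    """

    def __init__(self):
        self.store = {}  # hash (hex string) -> segment bytes

    def has_segment(self, fingerprint):
        # "Have I seen this segment before?"
        return fingerprint in self.store

    def put_segment(self, fingerprint, data):
        self.store[fingerprint] = data


def backup(stream, appliance, segment_size=4):
    """Plug-in side: hash each segment, send data only if the hash is new.

    Returns (bytes_sent, recipe), where recipe is the ordered list of
    hashes needed to rebuild the stream. Fixed-size segments and SHA-256
    are assumptions chosen to keep the sketch short.
    """
    bytes_sent = 0
    recipe = []
    for offset in range(0, len(stream), segment_size):
        segment = stream[offset:offset + segment_size]
        fingerprint = hashlib.sha256(segment).hexdigest()
        recipe.append(fingerprint)
        if not appliance.has_segment(fingerprint):
            # New hash: the segment itself crosses the network.
            appliance.put_segment(fingerprint, segment)
            bytes_sent += len(segment)
        # Known hash: nothing more is sent for this segment.
    return bytes_sent, recipe


def restore(recipe, appliance):
    """Rebuild the original stream from its ordered list of hashes."""
    return b"".join(appliance.store[fp] for fp in recipe)


if __name__ == "__main__":
    target = Appliance()
    data = b"AAAABBBBAAAACCCC"
    first_sent, recipe = backup(data, target)
    second_sent, _ = backup(data, target)
    print(first_sent, second_sent, restore(recipe, target) == data)
```

In this toy run, the repeated segment is transferred only once during the first backup, and a second backup of the same data transfers no segment data at all, since every hash is already known to the target. Only the hashes travel in that case, which is where the network savings come from.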

The net effect of DD Boost should be a dramatic reduction in the amount of data that the backup server needs to transfer to the Data Domain system. This comes without making the backup server do all the heavy lifting that a complete backup server-based deduplication system would require.

This all happens without changing the backup server, the backup software or, most importantly, the backup infrastructure. If a Data Domain customer using NetBackup, Backup Exec or, soon, NetWorker is pushing the limits of any of these elements today, simply installing the plug-in should produce a noticeable increase in performance.

Briefing Report

George Crump, Senior Analyst

- GDA and Boost

Data Domain is a client of Storage Switzerland