Protecting the Archive
As we discussed in our article on Disk Based Archiving, moving data from primary storage to a secondary tier of storage provides significant benefits to an organization. A secondary tier reduces the need for additional primary storage purchases, shrinks backup windows and the backup infrastructure investment, and establishes a foundation for a data retention and compliance policy. Ideally, for these benefits to be realized, this secondary tier (or archive) needs to contain the sole copy of this less active data. Protecting the archive tier therefore requires a highly redundant, highly available system.
If the archive tier is going to contain the only copy of a piece of data, that tier cannot just be a bunch of cheap disks behind a cheap NAS head. That level of redundancy requires a purpose-built storage system designed from the ground up for reliability and longevity. Fortunately, the other requirements of an archive system, scalability and capacity optimization, can be leveraged as part of the archive protection process.
Tuesday, February 16, 2010
Systems like those provided by Permabit Technology, for example, have tackled the scale requirement by building a cluster of storage nodes. When more capacity is needed, simply add another node. The system automatically recognizes the new node and starts using the new capacity. These nodes also provide a redundant architecture that’s leveraged to maintain data availability. An individual node or even two nodes (simultaneously) can fail without any data loss or loss of access to that data.
Secondly, the system can leverage the clustered storage architecture to provide an advanced form of data protection. This is critical because as drive capacities continue to expand, especially with 2 TB drives now coming into the mainstream, traditional RAID 5 and even RAID 6 configurations are reaching the practical limit of how quickly they can recover from a failure without data loss or corruption.
The challenge with traditional RAID is the time it takes to return a storage system to a fully working, redundantly protected state. When a drive fails, most RAID implementations start a rebuild as soon as a global spare is identified or the failed drive is replaced. With high-capacity 1 TB and 2 TB drives, this rebuild can take many hours and in some cases approaches days. During the rebuild the archive data, the organization’s only copy of that data, is exposed. With RAID 5, a second drive failure or an unrecoverable read error during the rebuild can mean the data is lost forever, and the likelihood of hitting a read error grows with the amount of data that has to be read to complete the rebuild. RAID 6 tolerates a second failure, but with extremely long rebuild times and the larger drive counts typical of these configurations, the chance of a third failure before the rebuild completes increases. While the chances of that happening may be relatively low, the risk of 100% loss of the sole copy of data has to be concerning. A more effective, fail-safe form of protection is needed.
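To put rough numbers on the rebuild exposure described above, here is a back-of-the-envelope sketch. The rebuild rate (50 MB/s) and the unrecoverable read error rate (1 error per 10^14 bits read, a figure often quoted on drive spec sheets) are assumptions chosen for illustration, not measurements from any particular system, and the model treats read errors as independent, which is a simplification.

```python
import math

def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite one whole drive at a sustained rebuild rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    return capacity_mb / rebuild_mb_per_s / 3600

def p_read_error(data_read_tb: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error while reading
    data_read_tb terabytes, assuming independent errors (a simplification)."""
    bits = data_read_tb * 1e12 * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Assumed scenario: one 2 TB drive fails in an 8-drive RAID 5 set, so the
# rebuild has to read the 7 surviving drives (14 TB) in full.
print(f"rebuild time: {rebuild_hours(2, 50):.1f} hours")
print(f"chance of a read error during rebuild: {p_read_error(14):.0%}")
```

Under these assumed figures the rebuild takes roughly 11 hours on an otherwise idle array, and longer if the array is also serving production I/O. The point of the sketch is the trend, not the exact values: both numbers scale with drive capacity.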
The archive system can address this challenge in one of two ways: through a mirror or by using an advanced form of RAID. In smaller implementations, mirroring is a simple alternative. While a mirror’s second copy carries a higher ‘capacity cost’, the redundancy allows for rapid recovery. For most smaller implementations, the archive starts out small enough that the capacity lost to the mirror is an acceptable cost.
In larger environments, the mirror’s one-for-one copy of all the data can become cost prohibitive, which is why RAID is often chosen instead. However, traditional RAID doesn’t address the data risk issue; simply rebuilding RAID sets when errors are detected is not an effective way to increase data integrity. The alternative is an advanced form of RAID. Permabit, for example, uses a technology called RAIN-EC (Redundant Array of Independent Nodes) that breaks data into multiple chunks and distributes them across separate drives located on separate storage nodes. If a node fails, the data can be reassembled from the remaining chunks and presented. In fact, two nodes (each node today contains four drives) could fail and the data would still be intact. The result is a protection algorithm that is much more robust than RAID parity, delivering greater redundancy and faster recoveries.
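The idea of spreading chunks across nodes so that lost chunks can be recomputed from the survivors can be illustrated with a deliberately simplified sketch. This is not Permabit's RAIN-EC algorithm, whose internals the article does not describe. It uses a single XOR parity chunk, so it survives only one lost node; surviving two, as described above, requires a stronger erasure code such as Reed-Solomon. The function names `encode` and `decode` are hypothetical.

```python
from functools import reduce
from typing import List, Optional

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, data_nodes: int) -> List[bytes]:
    """Split data into one chunk per data node and append one XOR parity chunk."""
    chunk_len = -(-len(data) // data_nodes)  # ceiling division
    padded = data.ljust(chunk_len * data_nodes, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_nodes)]
    return chunks + [reduce(xor_bytes, chunks)]

def decode(chunks: List[Optional[bytes]], original_len: int) -> bytes:
    """Reassemble the data when at most one chunk (one node) is missing."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity can only survive one lost node")
    repaired = list(chunks)
    if missing:
        survivors = [chunk for chunk in chunks if chunk is not None]
        # XOR of all surviving chunks reproduces the missing one
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(repaired[:-1])[:original_len]

# Usage: four data nodes plus one parity node, then node 2 "fails".
payload = b"the only copy of this archived record"
stripes = encode(payload, data_nodes=4)
stripes[2] = None
assert decode(stripes, len(payload)) == payload
```

With four data chunks and one parity chunk, the capacity overhead in this sketch is 25%, compared with 100% for a mirror, which is the cost trade-off the paragraph above describes.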
While the failure of nodes and drives is fairly obvious, what may be more concerning is a ‘silent data loss’ situation. In this scenario, a drive degrades to the point where it hasn’t actually failed, but data on the drive has been corrupted. How can this type of corruption be detected? With traditional systems, corruption is discovered only when the affected data is read. If there are multiple copies of this data spread out on disk and tape, then recovery may be possible from one of those devices. But keeping redundant copies of data defeats the purpose of the archive in the first place and adds cost.
An archive system can eliminate this concern by leveraging another technology that’s very well known, just not for data protection: deduplication. Archive systems already use deduplication to optimize storage capacity. The deduplication algorithm generates a signature for each segment of data that is written to the system. This signature is unique to that data segment. If the signature appears again, the second copy of the data is not written to disk; instead, a reference is made to the original segment and space is saved.
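A minimal sketch of signature-based deduplication follows. The article does not say which signature algorithm any given archive system uses; SHA-256 is an assumption here, and the `DedupeStore` class is hypothetical. In practice a hash signature is unique only with overwhelming probability, not by guarantee.

```python
import hashlib

class DedupeStore:
    """Toy content-addressed store: each unique segment is kept once."""

    def __init__(self):
        self.segments = {}    # signature -> segment bytes
        self.references = {}  # signature -> number of writes that map to it

    def write(self, segment: bytes) -> str:
        """Store a segment and return its signature.
        A repeated segment only adds a reference, not a second copy."""
        sig = hashlib.sha256(segment).hexdigest()  # assumed signature algorithm
        if sig not in self.segments:
            self.segments[sig] = segment
        self.references[sig] = self.references.get(sig, 0) + 1
        return sig

    def read(self, sig: str) -> bytes:
        return self.segments[sig]

# Usage: the same attachment archived twice occupies space once.
store = DedupeStore()
first = store.write(b"quarterly report attachment")
second = store.write(b"quarterly report attachment")
store.write(b"a different document")
assert first == second
assert len(store.segments) == 2
```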
The archive can use this signature to protect the data it’s storing as well as to optimize space, because the signature can be used to verify the data contained on disk. Periodically, the system reruns the algorithm on the data segments it has stored. The signature should be the same every time the algorithm is run. If it is not, there has been some sort of corruption. Because the system can regenerate the data, either through its RAIN protection or from a mirror, the corruption can be ‘repaired’ and the information salvaged.
Replication is a cornerstone of a fully optimized archive strategy. It provides a second, managed copy of this ‘original copy’ of data. It protects not only against a site failure but also against other forms of data loss should all the other protection steps fail. Because this is a single managed copy, it adheres to the same data retention strategy as the original archive copy.
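The periodic verify-and-repair pass can be sketched as follows. This is an illustration under stated assumptions rather than any vendor's implementation: SHA-256 stands in for the unspecified signature algorithm, both stores are plain dictionaries keyed by signature, and the redundant copy is a simple mirror. The `scrub` function is hypothetical.

```python
import hashlib
from typing import Dict, List

def signature(segment: bytes) -> str:
    """Assumed signature algorithm; the article does not name one."""
    return hashlib.sha256(segment).hexdigest()

def scrub(primary: Dict[str, bytes], mirror: Dict[str, bytes]) -> List[str]:
    """Re-hash every stored segment and repair mismatches from the mirror.
    Returns the signatures of the segments that were repaired."""
    repaired = []
    for sig, segment in list(primary.items()):
        if signature(segment) == sig:
            continue  # segment still matches the signature recorded at write time
        replacement = mirror.get(sig)
        if replacement is None or signature(replacement) != sig:
            raise RuntimeError(f"segment {sig[:12]} is corrupt and has no good copy")
        primary[sig] = replacement
        repaired.append(sig)
    return repaired

# Usage: silently corrupt one segment, then let the scrub find and fix it.
good = b"patient record 1138"
sig = signature(good)
primary = {sig: good}
mirror = {sig: good}
primary[sig] = b"patient record 1139"  # simulated silent corruption
assert scrub(primary, mirror) == [sig]
assert primary[sig] == good
```

The key design point is that the check does not wait for a user to read the data: the signature recorded at write time gives the system something to compare against on its own schedule.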
Deduplication is again leveraged to enable a WAN-efficient data replication strategy. In this form of replication, only changed blocks that the target system at the DR site does not already hold are transferred, even if the source data is coming from multiple sites. For example, three sites could be replicating to a single DR site. When one of the primary sites prepares to send its recently changed or added data, any data that already exists at the remote site is not sent. This method reduces both the amount of data that has to be stored at the DR site and the amount of WAN bandwidth required.
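The multi-site example above can be sketched in a few lines. This is a simplified model, not a real replication protocol: the DR target is a dictionary keyed by signature, SHA-256 is an assumed signature algorithm, and the hypothetical `replicate` function performs the lookup locally, whereas a real system would exchange signatures over the WAN before deciding what to send.

```python
import hashlib
from typing import Dict, List

def signature(segment: bytes) -> str:
    """Assumed signature algorithm; the article does not name one."""
    return hashlib.sha256(segment).hexdigest()

def replicate(changed_segments: List[bytes], dr_target: Dict[str, bytes]) -> int:
    """Send only the segments the DR target does not already hold.
    Returns the number of payload bytes that crossed the WAN."""
    bytes_sent = 0
    for segment in changed_segments:
        sig = signature(segment)
        if sig in dr_target:
            continue  # target already has this data, possibly from another site
        dr_target[sig] = segment
        bytes_sent += len(segment)
    return bytes_sent

# Usage: three sites share a common file; only the first site has to send it.
dr_site: Dict[str, bytes] = {}
common = b"company handbook"
for site_data in ([common, b"site A ledger"],
                  [common, b"site B ledger"],
                  [common, b"site C ledger"]):
    print(replicate(site_data, dr_site), "bytes sent")
assert len(dr_site) == 4  # one handbook plus three ledgers
```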
For disk archiving to deliver on the promise of reducing storage and protection costs, customers have to feel confident in its ability to securely and reliably house a single copy of data. Protecting the integrity of archive data long term is critical to making sure that this copy is not lost.
George Crump, Senior Analyst
This Article Sponsored by Permabit