Optimization - the New Normal in Storage
Relentless data growth and continually dropping storage costs have for years fostered routinely inefficient storage practices - even wastefulness. Thanks in part to Moore’s Law (as applied to storage) and ever-increasing IT budgets, the cost per GB has dropped regularly and capital availability has kept much of the business pressure off storage managers. But no more. The practice of buying more storage to meet anticipated data growth, because it’s easier than implementing sound storage policies, has come to an end. The ‘New Normal’ is to mandate storage optimization as a prerequisite for storage investments in 2009/10 and beyond.
Wednesday, February 3, 2010
The drivers for this are varied, but a primary one is the reduction or elimination of storage line items from budgets as a result of economic pressures in 2008/9. It seems management has lost its patience for sub-optimal storage practices, even as storage costs per GB continue to drop. Luckily, technologies currently exist to fix this problem, and their implementation is simple and yields significant financial return. These include more cost-effective storage tiering and deduplication that’s available for all appropriate data sets, not just backups. Whatever the reasons, it looks as if storage growth will now come with a requirement for its optimization. But how do you define optimization and how do you implement it?
What Is Storage Optimization?
The dictionary defines “optimize” as: “to make as effective, perfect or useful as possible; to make the best of”. In this context, it means cost-efficient, or getting the most use out of an asset like storage. Since the purpose of storage is to house information, it would follow that reducing its cost would also mean reducing the amount of storage needed. Consequently, storage optimization would mean minimizing the capacity required to house an organization’s information set and reducing the Total Cost of Ownership (TCO) per average GB that is required.
Reducing the capacity required amounts to eliminating duplication and wasted space. For most environments, significant duplication exists in the backup system, as largely unchanged data sets are backed up daily, weekly, monthly, etc. Often, the final copy of the data is kept as an ‘archive’ for years - or even decades. Duplicate files are also created to support shared work and committee projects. Wasted space, apart from duplicate copies of data, can come from storing valueless data. Obsolete files, shares created by employees who have left the company, and personal data are all examples of data with no corporate value.
Reducing the average TCO of the storage capacity that’s needed is the second part of the optimization equation. Its components are the tiered storage system itself, the cost of each tier of storage, and the percentage of total data that’s on each tier.
Simply put, a tiered storage system that can migrate more data to lower tiers will consequently achieve a lower average storage cost. To do this, lower tiers need to be more reliable than in the past, and data movement or migration systems need to incorporate more existing data platforms.
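To make the arithmetic concrete, here is a minimal sketch of the average-cost calculation described above. The function name, the two-tier layout and the per-GB prices are all hypothetical values chosen for illustration, not figures from any vendor.

```python
def average_cost_per_gb(tiers):
    """Return the capacity-weighted average cost per GB.

    `tiers` is a list of (cost_per_gb, fraction_of_data) pairs whose
    fractions must sum to 1.0.
    """
    total_fraction = sum(fraction for _, fraction in tiers)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("tier fractions must sum to 1.0")
    return sum(cost * fraction for cost, fraction in tiers)


# Hypothetical prices: Tier 1 at $10/GB, the 'value tier' at $2/GB.
before = average_cost_per_gb([(10.0, 0.8), (2.0, 0.2)])  # 80% of data on Tier 1
after = average_cost_per_gb([(10.0, 0.3), (2.0, 0.7)])   # 30% of data on Tier 1

print(f"before migration: ${before:.2f}/GB")  # before migration: $8.40/GB
print(f"after migration:  ${after:.2f}/GB")   # after migration:  $4.40/GB
```

With these assumed prices, moving half of the data set from Tier 1 to the lower tier cuts the average cost per GB by almost half, without changing the price of either tier.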
Implementing Storage Optimization
Storage optimization requires systems designed to reduce duplication and wasted space and to support dynamic storage tiering, like Permabit’s Enterprise Archive. These systems deduplicate and compress data globally, in real time, making them effective, in some cases, for primary storage. Since they’re not just deduping backups, they can apply dedupe’s capacity savings across more of the storage infrastructure, not just to backed-up data. They also don’t need dedicated space to cache data during the dedupe process, like some systems do, further saving capacity.
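The general idea of global, inline deduplication with compression can be illustrated with a toy example. This is a sketch only: the DedupStore class, its fixed 4 KB chunk size and the file names are assumptions made for illustration, and it does not represent how Permabit’s or any other product is actually implemented.

```python
import hashlib
import zlib


class DedupStore:
    """Toy in-memory store that deduplicates fixed-size chunks and compresses them."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> compressed chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Store `data` under `name`, deduplicating as it is written."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Each unique chunk is stored once, across all files ("global"),
            # and at write time, so no staging area is needed.
            if digest not in self.chunks:
                self.chunks[digest] = zlib.compress(chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Rebuild the original bytes from the stored chunks."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in self.files[name])

    def stored_bytes(self):
        """Physical bytes consumed by the unique, compressed chunks."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore()
report = b"quarterly results " * 1000  # hypothetical 18,000-byte file
store.put("report_v1.doc", report)
chunks_after_first = len(store.chunks)
store.put("copy_of_report.doc", report)  # an exact duplicate of the first file

print(len(store.chunks) == chunks_after_first)  # the copy added no new chunks
print(store.stored_bytes(), "bytes stored for", 2 * len(report), "bytes written")
```

Because chunks are keyed by their content hash, the second copy consumes no additional chunk storage, only a list of references. Real systems differ in chunking strategy, hash handling and metadata protection.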
Data put into an enterprise archive system isn’t part of the typical backup cycle because it’s been moved to another storage tier, reducing overall backup system activity and the storage it consumes. This is a fundamental improvement in optimization, since it eliminates an entire storage operation on a significant portion of the organization’s data. Enterprise archives also eliminate the inefficient practice of keeping the oldest backups as a long-term archive. With older data residing in a true archive system, it’s more easily culled and deleted as its value diminishes or as corporate governance and compliance requirements allow. By contrast, restoring old backups to remove specific files can be a costly process.
Given the high rate of data growth, it’s easy to predict that much larger data stores will evolve. The current RAID technologies (developed in the 1980s) cannot deliver the necessary protection and recoverability that these multiple PB data stores will require. Enterprise archive systems, on the other hand, incorporate data integrity features that have advanced error correction and can recover from multiple component failures while providing more reliability than dual parity RAID. They also scale to PB capacities with clustered architectures while maintaining acceptable performance.
Tiered Storage Is No Longer Optional
Storage tiering is a requirement to reduce the average cost of storage. If a tiered storage infrastructure hasn’t been set up, this should be the first step. It doesn’t have to be all-encompassing; it just needs to be started. Setting the enterprise archive up as the lowest tier makes data migration decisions simple - if it’s not Tier 1 it’s on the ‘value tier’. More tiers and sophistication can be added as needed. An enterprise archive’s increased reliability and performance over traditional high-capacity, low-cost RAID 6 storage can also make it appropriate for a larger percentage of Tier 1 data, further reducing average cost.
Data movement between tiers can be accomplished with archive (HSM/ILM) software packages, data migration software or file virtualization. Archive packages are typically implemented as appliances, integrated into the storage controller hardware or included as part of another application - like backup. Email archives are an example. Data migrators run in the environment, crawling file systems to identify migration candidates, then move them during periods of low activity. All of these data movers can be set to point to the enterprise archive.
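The crawl-and-identify step that data migrators perform can be sketched in a few lines. This is an illustrative example, not the logic of any particular migration product: the function name, the 180-day idle threshold and the use of last-modified time as the selection rule are all assumptions.

```python
import os
import time


def find_migration_candidates(root, min_idle_days=180, now=None):
    """Yield (path, size_bytes) for files under `root` that have not been
    modified in at least `min_idle_days` days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # 86,400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                yield path, info.st_size


if __name__ == "__main__":
    # Report how much data under the current directory could move down a tier.
    candidates = list(find_migration_candidates("."))
    total_bytes = sum(size for _path, size in candidates)
    print(f"{len(candidates)} candidate files, {total_bytes} bytes")
```

A production migrator would typically apply richer policies (file type, owner, access patterns) and would schedule the actual moves for periods of low activity, as described above. This sketch only produces the candidate list.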
Storage optimization is the New Normal in IT, as data growth is supported by getting more usable capacity out of the existing infrastructure. There are a number of technologies available to make this possible, like the Enterprise Archive. These systems have deduplication and compression technologies to reduce stored data volume and data integrity features that make them suitable for primary data sets. When included in a tiered storage infrastructure the enterprise archive can improve storage optimization significantly, enabling IT to support more storage growth.
Eric Slack, Senior Analyst
This Article Sponsored by Permabit