Storage Vendors
Deduplication has seen its best success as a technology used to help optimize the use of disk as part of the backup process. While thought has been given to using deduplication elsewhere, especially in primary storage, not much progress has occurred. Only two primary storage vendors have the technology available at all, and only one OS vendor has released a primary storage deduplication capability. There have also been a few third party appliances that deliver deduplication on primary storage. The challenge is that none of these solutions, as of now, has provided what end users want: deduplication without a performance cost, without a change of process and, most importantly, without risk of data loss. Storage suppliers have to respond quickly to these demands, as an early competitor’s lead here could leave them at an unsurvivable disadvantage.
Monday, June 7, 2010
End Users Want Primary Storage Deduplication
IT users want deduplication in primary storage. The critical issue is that they don’t want to give anything up to get it: no performance loss, no loss of storage capabilities and no risk of data corruption. The benefits of deduplication in primary storage are similar to those seen when the technology is used elsewhere: more data stored in less physical space, which substantially drives down the effective cost per GB. This also leads to reduced power, cooling and footprint costs, all of which are current issues in the data center. In addition, if deduplication can be done in primary storage, it has a continuing efficiency benefit as the data rolls downstream through its lifecycle. It eliminates the need to deduplicate the data again when it is snapshotted, copied, backed up or archived. The total end user value of primary storage deduplication is something that Storage Switzerland will explore in a future article. For now, though, it is clear that IT personnel both want primary storage deduplication and see value in it.
Storage System Suppliers Have Other Priorities
Storage system suppliers need to develop primary storage deduplication. The challenge is that they have other important initiatives on their agenda, and developing a deduplication engine from the ground up is no easy task. One of the big technology initiatives is the move by storage vendors to unify storage: the ability for a single storage array to offer NAS services like CIFS and NFS as well as block services like Fibre Channel and iSCSI. Another storage initiative that most suppliers are embarking on is automated tiering. This capability allows sub-sections of volumes, directories and even files to be moved dynamically between differing types of storage, including SSD, Fibre Channel and SATA.
These initiatives directly impact the vendors’ ability to implement deduplication. Not only do they have to decide to allocate development resources and funding, but they also need to decide how to implement deduplication so that it complements their storage systems and the coming initiatives mentioned earlier. Do they take an easier path from a development perspective and offer a different deduplication option on each of the above protocols and storage tiers? That’s not very unifying. Or do they offer a single solution that covers all the protocols and storage tiers? A unified deduplication strategy, although ideal, will be more difficult, more costly and more time consuming to develop, especially when factoring in scale and performance requirements. Neither choice is a particularly quick development effort.
Making this choice seems to have had a paralyzing impact on storage vendors and, as a result, has limited the number of primary storage deduplication products that have emerged. Most vendors are stuck trying to decide which path to take. Those that have made a deduplication development effort thus far have chosen the not very unified, typically file system based approach, which satisfies the “I’ve got something” need but has limited scale and lacks performance.
Most storage suppliers realize that the unified approach to deduplication is the best way to support functionality like unified protocols, auto-tiering and other future storage capabilities. These types of capabilities will likely require a tight interaction between deduplication and these functions.
While primary storage deduplication has seen some success as an add-on appliance in specific industries and use cases, broader adoption may be best triggered by it being an integrated or embedded function within the operating system or storage controller. The challenge for the supplier is twofold: how can they get this level of integration without developing it themselves, and how can they develop deduplication without delaying other, more competitively critical projects?
The Risk Of Not Having Deduplication In Primary Storage
If storage suppliers continue to ignore pent up demand for primary storage deduplication, they are going to be at a severe disadvantage. The competitive economics become overwhelming. A storage vendor armed with global primary storage deduplication can effectively drive costs down by 20 to 60% depending on the dataset. It also drives out cost in the rest of the process: snapshots, clones and backup copies all become nearly free from a resource perspective. All of this allows the vendor to add value and maintain margins while driving down the cost to its users.
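To make the 20 to 60% range concrete, the arithmetic can be sketched as below. This is only an illustration: the function name and the $10 per GB starting price are hypothetical, and it assumes the quoted percentage applies directly to the cost per GB of stored data.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, cost_reduction: float) -> float:
    """Return the cost per GB after a fractional cost reduction.

    cost_reduction is a fraction in [0, 1), e.g. 0.20 for the 20% case.
    Assumption: the reduction applies directly to cost per GB.
    """
    if not 0.0 <= cost_reduction < 1.0:
        raise ValueError("cost_reduction must be in [0, 1)")
    return raw_cost_per_gb * (1.0 - cost_reduction)


RAW_COST = 10.0  # hypothetical price in dollars per GB, for illustration only

# The two ends of the range quoted in the article.
for reduction in (0.20, 0.60):
    cost = effective_cost_per_gb(RAW_COST, reduction)
    print(f"{reduction:.0%} reduction -> ${cost:.2f} per GB")
```

Under these assumptions, a vendor without deduplication would have to discount the same hardware by the same fraction to match the price, which is the margin pressure described in the next paragraph.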
If a storage vendor continues to drag its feet and does not make a bold move into deduplication, it then has to compete with the dedupe armed vendor using a product whose effective cost remains the same. It is also not showing value to its customers and will look behind the technology curve. To remain competitive it will have to offer more capacity at lower prices, resulting in lower margins. That is not a good financial strategy. Imagine selling backup systems and not having access to low cost SATA drives for storage, being forced instead to use high cost Fibre Channel drives. That is what it will be like for a storage supplier not to have primary deduplication. The competitive disadvantage will increase as more vendors deploy deduplication, continuing to increase margin pressure on those that do not. If the competitive disadvantage persists, it may significantly impact the viability of the vendor.
Development Is Not An Option
The reality is that for storage vendors to remain competitive in the primary storage market they must be able to add deduplication within the next two years or be faced with an insurmountable competitive disadvantage. However, if a serious deduplication effort is not already underway and either deliverable or close to deliverable, then it may be too late to start from scratch and develop something that meets tightly integrated, unified performance and scalability requirements.
The logical alternative is to OEM a solution from a deduplication specialist. An add-on appliance approach may provide a stop gap measure in certain environments, but it may also suffer from limited scalability, forced operational changes and a performance impact. The horizontal market is going to insist on an integrated, efficient and scalable solution. That is what an OEM solution like Permabit’s Albireo delivers: a tightly integrated API set that manufacturers can build into their existing storage systems.
Using an OEM strategy will allow storage system manufacturers to deliver next generation storage systems that reduce their customers’ cost exposure from a continual and increasing need for more capacity. The OEM can focus on the key customer demands of not impacting performance or data services features and not risking data integrity. With the right OEM selection, the storage supplier can add its own unique value to the deduplication engine to differentiate itself from other suppliers. The critical value in an OEM strategy is that the storage supplier can stay focused on the other storage management features that its customers are demanding, like protocol unification and automated storage tiering, which the storage manufacturer is very well equipped to provide.
Primary storage deduplication will soon be a “must have” feature. The capability will be expected in the same way snapshot capability is expected today. Vendors that cannot deliver it to their customers within the next one to two years will be seriously disadvantaged.
Finally, a move to primary storage deduplication will happen significantly faster and more universally than in the backup use case. Backup deduplication required a change in thought and in many cases process. Primary storage deduplication, if embedded into the storage software, will not. It will simply be a switch that is turned on that will deliver substantial benefit.
In an upcoming article series we will explore the key performance issues with deduplication and how they can be addressed to create a zero impact solution as well as how deduplication can be implemented in a way that it does not alter the data that the storage system is responsible for.
George Crump, Senior Analyst
Permabit Technology is a client of Storage Switzerland