Primary Storage Deduplication and SSD
Conventional wisdom places solid state disk (SSD) and deduplication at opposite ends of the storage spectrum. SSD is a ‘performance at all costs’ technology, focused on removing latency and designed for the most active primary data sets. Deduplication, on the other hand, has until recently been thought of as a technology for secondary data, focused on reducing costs through efficiency while potentially adding latency to storage access times. Despite their obvious differences, these two technologies are in reality made for each other.
Should Performance Really Be A Concern?
The first concern when marrying SSD and deduplication is the performance impact on the solid state storage device. This is a fair concern given deduplication’s legacy: it is typically found in backup or archive implementations, and in many cases it is deployed as a post-process technique so as not to interfere with active data reads and writes. But those performance concerns arose when using mechanical HDDs, not near-zero latency SSDs.
Monday, August 8, 2011
As Storage Switzerland discussed in a recent article, modern deduplication technology has advanced significantly over the past few years. Companies like Permabit, with their Albireo deduplication engine, have been able to provide primary storage deduplication with no noticeable performance impact when using mechanical HDD technology. It’s logical that they should be able to provide deduplication to SSD-based systems as well, with minimal or potentially no noticeable impact on performance.
Most storage managers who are considering SSD don’t need the full performance that this technology can provide. They are looking for something in between the performance of a 15K RPM hard drive and solid state: the elusive 20K RPM hard drive. Even if deduplication does cause a performance impact, as long as the SSD maintains a performance level significantly above what current HDDs can deliver, the combination of SSD and deduplication is well worth considering.
Deduplication of Premium Cost Storage Means Higher ROI
If primary storage deduplication on SSDs can be implemented such that the device still handily outperforms its mechanical counterparts, then the return on the investment in deduplication will be significantly higher. This is because solid state is purchased at a premium, so every GB that deduplication reclaims is worth more.
The typical MLC (consumer) SSD costs about $3 per GB, eMLC about $5 per GB, and SLC as much as $8 per GB, or even $10 per GB when installed in a vendor’s storage system. This means that a 100GB enterprise drive implemented in an enterprise configuration can cost $500 or more. If that drive could instead store 500GB of data with deduplication, a 5:1 reduction, the effective cost would fall from $5 to $1 per GB, a $4 reduction in the cost per GB. It would also mean SLC technology in an enterprise configuration could be had for the price of consumer MLC. The bottom line is that the customer gains the endurance and reliability of SLC without the cost premium. This takes SSD technology one step closer to cost parity with mechanical hard drives.
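The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the prices quoted in this article; the function name is made up for the example, and the 5:1 ratio is simply the one implied by storing 500GB on a 100GB drive. Real deduplication ratios depend heavily on the data set.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, dedupe_ratio: float) -> float:
    """Cost per GB of logical data stored, for a given logical:physical ratio.

    Hypothetical helper for illustration only.
    """
    if dedupe_ratio <= 0:
        raise ValueError("dedupe_ratio must be positive")
    return raw_cost_per_gb / dedupe_ratio


# Figures from the article: a 100GB drive at $5/GB that ends up holding 500GB.
drive_gb = 100
raw_cost = 5.0
logical_gb = 500
ratio = logical_gb / drive_gb  # 5:1, implied by the example, not a guarantee

effective = effective_cost_per_gb(raw_cost, ratio)
saving = raw_cost - effective

# SLC in a vendor's system at $10/GB, assuming the same 5:1 ratio.
slc_effective = effective_cost_per_gb(10.0, ratio)

print(f"Drive price:        ${drive_gb * raw_cost:.0f}")
print(f"Effective cost:     ${effective:.2f} per GB (saves ${saving:.2f} per GB)")
print(f"SLC effective cost: ${slc_effective:.2f} per GB")
```

At the assumed ratio, the SLC figure lands below the roughly $3 per GB quoted for consumer MLC, which is the comparison the paragraph above makes.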
Deduplication of Write Restricted Devices Means Longer Life
Another benefit of deduplicating an SSD is that it can increase the reliability and lifespan of the drive. All Flash-based SSDs, regardless of type, can accept only a limited number of write operations. MLC can sustain about 10,000 write cycles, eMLC about 30,000 and SLC about 100,000. While vendors have gotten very good at correcting errors and spreading out writes so that the SSD wears evenly, none of these techniques removes the underlying limit on write cycles. The only way to extend the life of the drive beyond that is to reduce the number of writes it has to endure. This is a significant benefit of deduplicating the SSD.
If deduplication can be implemented in an inline fashion, meaning that data segments are examined for uniqueness before the write occurs, then the technology can reduce the number of writes to the solid state drive(s). All redundant writes can be eliminated. This not only helps compensate for any potential performance loss, as Storage Switzerland discussed in our recent article; in the case of SSDs it also means that less data is written to the actual drive, extending its life.
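To make the inline idea concrete, here is a minimal sketch in Python. It is not how Albireo or any particular product works; the class name, the fixed 4KB block size and the use of SHA-256 fingerprints are all assumptions chosen for illustration, and it omits reference counting and garbage collection for overwritten blocks. The point is simply that the fingerprint check happens before anything is written, so a duplicate block never reaches the drive.

```python
import hashlib


class InlineDedupStore:
    """Toy inline deduplicating block store (illustrative assumption only)."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.index = {}        # fingerprint -> physical block number
        self.blocks = []       # simulated physical blocks on the SSD
        self.logical_map = {}  # logical block address -> physical block number
        self.logical_writes = 0
        self.physical_writes = 0

    def write(self, lba: int, data: bytes) -> None:
        self.logical_writes += 1
        # Examine the block for uniqueness BEFORE writing it.
        fingerprint = hashlib.sha256(data).digest()
        physical = self.index.get(fingerprint)
        if physical is None:
            # New data: this is the only case that costs a flash write.
            physical = len(self.blocks)
            self.blocks.append(data)
            self.index[fingerprint] = physical
            self.physical_writes += 1
        # Duplicate or not, the logical address now points at the stored copy.
        self.logical_map[lba] = physical

    def read(self, lba: int) -> bytes:
        return self.blocks[self.logical_map[lba]]


store = InlineDedupStore()
block_a = b"A" * store.block_size
block_b = b"B" * store.block_size

# Five logical writes, but only two distinct blocks of data.
for lba, data in enumerate([block_a, block_b, block_a, block_a, block_b]):
    store.write(lba, data)

print(f"logical writes:  {store.logical_writes}")
print(f"physical writes: {store.physical_writes}")
```

In this toy run, three of the five writes never touch the simulated drive. How many writes are avoided in practice depends entirely on how redundant the workload is.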
SSD Benefits Deduplication Too
Deduplication is a serious processing workload. It requires that the system manage metadata tables, conduct seek operations to find redundant information and, of course, actually write new data. Many deduplication vendors try to do as much of this work as possible in RAM within the storage system. Flash will make all of these functions faster. In fact, it’s possible that some of the components stored in RAM could be moved to Flash-based storage too. This would maintain look-up performance while reducing the cost of implementing deduplication in the storage system, since less of the expensive DRAM would need to be allocated to metadata table management. It also means that the deduplication process could be expanded to provide storage efficiency to a larger storage environment. Combining RAM and SSD for the indexing can significantly scale out the overall deduplicated storage pool, since the typical limit on scaling out deduplication is the need to avoid index fetches from spinning disk.
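One way to picture a combined RAM and Flash index is a small in-memory cache sitting in front of a complete index held on Flash. The Python sketch below is an assumption made for illustration, not a description of any vendor's design: the class name is hypothetical, an ordinary dictionary stands in for the Flash-resident table, and least-recently-used eviction is just one plausible policy.

```python
from collections import OrderedDict


class TieredIndex:
    """Toy two-tier fingerprint index: small RAM cache over a full 'flash' table."""

    def __init__(self, ram_entries: int):
        self.ram_entries = ram_entries
        self.ram = OrderedDict()  # hot fingerprints, least recently used first
        self.flash = {}           # stands in for the full index stored on Flash
        self.ram_hits = 0
        self.flash_reads = 0

    def put(self, fingerprint: str, location: int) -> None:
        self.flash[fingerprint] = location  # Flash always holds the full index
        self._cache(fingerprint, location)

    def get(self, fingerprint: str):
        if fingerprint in self.ram:
            self.ram.move_to_end(fingerprint)  # mark as recently used
            self.ram_hits += 1
            return self.ram[fingerprint]
        # RAM miss: fall back to Flash rather than to spinning disk.
        self.flash_reads += 1
        location = self.flash.get(fingerprint)
        if location is not None:
            self._cache(fingerprint, location)
        return location

    def _cache(self, fingerprint: str, location: int) -> None:
        self.ram[fingerprint] = location
        self.ram.move_to_end(fingerprint)
        while len(self.ram) > self.ram_entries:
            self.ram.popitem(last=False)  # evict the least recently used entry


index = TieredIndex(ram_entries=2)
for n, fp in enumerate(["fp-a", "fp-b", "fp-c"]):
    index.put(fp, n)

hot = index.get("fp-c")         # still in RAM
cold = index.get("fp-a")        # evicted from RAM, served from Flash
again = index.get("fp-a")       # now cached in RAM
missing = index.get("fp-zzz")   # unknown fingerprint, so the data is new

print(f"RAM hits: {index.ram_hits}, Flash reads: {index.flash_reads}")
```

The RAM tier here holds only two entries while the index holds three, yet no lookup needs a mechanical seek. That is the scaling argument in miniature: the index can grow beyond what DRAM alone could hold economically.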
Summary
SSD and deduplication don’t serve competing interests after all. They are a perfect complement to each other. Deduplication brings cost effectiveness to a premium priced storage platform, and SSD brings greater responsiveness and scalability to the deduplication process. The combination of these technologies is driving the cost of solid state storage down while driving its effective capacity up, so that it’s now better able to meet the needs of the enterprise.
Permabit Technology is a client of Storage Switzerland
George Crump, Senior Analyst