George Crump, Senior Analyst

How MLC Flash Can Be Made to Last

Solid-state storage has quickly become the go-to choice for solving specific storage performance problems, but broad adoption of the technology has not yet occurred in enterprise storage systems. The simple fact is that the cost of solid-state storage requires a serious analysis of the applications involved in order to justify its deployment. Cost remains the primary inhibitor to broad solid-state drive (SSD) adoption.


Using solid-state storage to solve niche performance problems is not a new concept. This technology has been around in one form or another for well over 15 years. Initially, most of that storage was based on DRAM technology, which, besides being expensive, was also volatile, meaning that if power was lost, so was the data on the device unless it had been backed up to disk storage media.


Over the past few years, flash memory has been introduced to address two key shortcomings of early solid-state storage products, namely cost and volatility. Flash storage is significantly less expensive than DRAM-based systems and is able to retain data even if power is lost.


Flash memory, though, has a unique problem in that over time it loses the ability to store data reliably. The more often a given NAND flash cell is written to, the sooner it will “wear out” or become unstable, a situation where data errors can overwhelm the error correction code (ECC) capability of the device. The flash industry has developed different media types that endure different numbers of writes before the device wears out. The most common are Single-Level Cell (SLC) and Multi-Level Cell (MLC). SLC NAND flash media can typically sustain roughly 100,000 program/erase operations reliably, and MLC NAND flash media approximately 10,000.
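To make the endurance gap concrete, the following is a rough back-of-the-envelope sketch, not a vendor formula. It assumes ideal wear leveling, so that every cell absorbs an equal share of writes. The function name `drive_lifetime_years`, the 100 GB capacity, the 5,000 GB/day workload and the `write_amplification` parameter are all illustrative assumptions. Only the 100,000 and 10,000 program/erase figures come from the text above.

```python
def drive_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb,
                         write_amplification=1.0):
    """Estimate years until the rated P/E cycles are used up.

    Assumes ideal wear leveling (hypothetical simplification): the device
    can absorb capacity * pe_cycles of physical writes in total.
    """
    if daily_writes_gb <= 0:
        raise ValueError("daily_writes_gb must be positive")
    total_writable_gb = capacity_gb * pe_cycles
    # Physical writes can exceed host writes; the factor is an assumption.
    physical_writes_per_day_gb = daily_writes_gb * write_amplification
    return total_writable_gb / physical_writes_per_day_gb / 365.0


# P/E ratings quoted in the article.
SLC_PE_CYCLES = 100_000
MLC_PE_CYCLES = 10_000

# Hypothetical device and workload: 100 GB drive, 5,000 GB written per day.
slc_years = drive_lifetime_years(100, SLC_PE_CYCLES, 5_000)
mlc_years = drive_lifetime_years(100, MLC_PE_CYCLES, 5_000)

print(f"SLC: {slc_years:.2f} years, MLC: {mlc_years:.2f} years")
```

Under these assumptions the SLC device lasts ten times as long as the MLC device, which simply mirrors the ratio of the two P/E ratings. Real devices will differ depending on workload and controller behavior.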


NAND flash cells are written to via an SSD controller, which plays an important role in error correction as well as ‘wear leveling’ - a process that spreads out the write operations across the cells in a flash device to help it wear uniformly. Until now, generic flash controllers have not done much more than this very basic wear management and have relied on the quality of the NAND flash substrate to provide a reliable medium for storing data.
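The idea behind wear leveling can be illustrated with a deliberately simplified sketch. It is not how any particular controller works: real controllers also handle logical-to-physical mapping, garbage collection and static data. The class name `WearLeveler` and its methods are hypothetical. The sketch only shows the core policy of always writing to the least-worn block.

```python
import heapq


class WearLeveler:
    """Toy wear leveler: always hand out the block with the fewest erases."""

    def __init__(self, num_blocks):
        # Min-heap of (erase_count, block_id); the least-worn block is on top.
        self._heap = [(0, block_id) for block_id in range(num_blocks)]
        heapq.heapify(self._heap)

    def allocate_block(self):
        """Pick the least-worn block and charge one P/E cycle to it."""
        erase_count, block_id = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (erase_count + 1, block_id))
        return block_id

    def erase_counts(self):
        """Return a mapping of block_id -> erase count."""
        return {block_id: count for count, block_id in self._heap}


leveler = WearLeveler(num_blocks=4)
for _ in range(10):
    leveler.allocate_block()

counts = leveler.erase_counts()
print(counts)
```

Because each write goes to the block with the lowest erase count, no block ever gets more than one cycle ahead of any other, which is the uniform wear the paragraph above describes.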



SLC and the Enterprise


As a result of deficiencies in the flash controller, the higher program/erase specifications of SLC NAND flash media have become the de facto standard for early solid-state enterprise applications. The problem is that SLC flash is twice as expensive as MLC, and is anywhere from 10 to 15 times more expensive than mechanical hard disk drive storage. While there’s no questioning the potential performance gains that SLC flash can deliver, few applications can justify the cost of a complete changeover to solid-state drives in the enterprise.


The benefits that SLC flash media possess have led to unique implementations of the technology in an effort to keep the cost of that improved performance in check. The simple solution statically places specific components of an application onto a solid-state storage device, but not all of them. This works well for those applications that can justify the cost of the performance improvement and have the granularity to subdivide their data sets to spread across multiple storage types.


For broader applications within the data center, storage vendors have developed ‘data brokering’ types of solutions that either cache or tier the most active data onto solid-state drives. This allows the solid-state storage area to be smaller but leveraged across multiple platforms and doesn’t require manual intervention by the already overworked storage administrator.


These data brokering solutions are ideal for the broad application of a relatively expensive storage technology, like solid-state, so that performance improvements can be brought to a wider range of applications or servers. However, there are inherent problems with this approach. First, the cache or tier is write-intensive: it is automatically and continually refreshed, or written to, which almost always requires SLC-based flash media. So, even though the flash memory area is smaller, the data center has to use the most expensive flash option.


Secondly, data brokering does not solve the problem of a cache-miss or tier-miss. If data is not in the solid-state storage area then it must be fetched from the hard drive system. Most cache or tier vendors cost-justify the expense of the solid-state storage area by recommending that the customer use a higher capacity and lower performing hard disk technology in conjunction with solid-state drives. This, of course, is fine until data has to be retrieved from that slower tier.


While the efficiency of the caching algorithm is important, the most effective way of reducing cache-misses or tier-misses is to expand the physical capacity of the solid-state storage area itself. The problem with this approach is obviously cost, since these areas tend to be SLC-based because of their high refresh rate.
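The cost of a cache-miss or tier-miss can be sketched with a simple weighted average. This is a standard hit-ratio model, not something taken from a specific vendor. The latency figures (0.1 ms for solid-state, 10 ms for a high-capacity hard drive) and the function name `average_access_ms` are illustrative assumptions.

```python
def average_access_ms(hit_ratio, ssd_ms, hdd_ms):
    """Average access time when a fraction hit_ratio is served from SSD."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * hdd_ms


# Hypothetical latencies, chosen only to show the shape of the trade-off.
SSD_MS = 0.1
HDD_MS = 10.0

smaller_cache = average_access_ms(0.90, SSD_MS, HDD_MS)
larger_cache = average_access_ms(0.99, SSD_MS, HDD_MS)

print(f"90% hits: {smaller_cache:.3f} ms, 99% hits: {larger_cache:.3f} ms")
```

With these assumed numbers, raising the hit ratio from 90% to 99% cuts the average access time by more than a factor of five, because the slow hard drive tier dominates the average even when misses are rare. That is why a larger solid-state area matters so much.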



SLC Alternatives


The cost of reliable solid-state storage technology must come down even further than it has thus far. To this end, vendors are taking a closer look at MLC flash-based SSD technology, whether to make the static placement and automated data movement technologies more accurate (by increasing their size) or to drive towards a pure solid-state data center.


MLC media stores twice as many bits per cell as SLC does and so is roughly half the cost. The problem is that by crowding twice as many bits into each cell, MLC media will reach its end of life (the point where the data becomes unreadable) roughly 10 times sooner than SLC, unless something is done about it. Two approaches have been championed to increase the reliability of MLC so that its economics can be taken advantage of.



eMLC


eMLC (or enterprise MLC) increases the program/erase (P/E) cycles of MLC from 10,000 to approximately 30,000 while being only a little more expensive. This is accomplished by slowing down the write process to the device, which allows the device to wear out more slowly. For media designed to increase performance, this is an odd compromise. While it’s true that the data write speed of eMLC is still much faster than that of mechanical hard drives, the fact that eMLC is much slower than SLC media or ‘regular’ MLC makes for a challenging trade-off.


Additionally, the 30,000 P/E cycle potential of eMLC, while certainly better, still may not be up to enterprise standards, especially in write-intensive environments like caching and tiering. As a result, eMLC has been mostly applied in flash-only systems or in flash appliances designed for read-intensive applications. In both cases, the data turnover rate is lower, making the 30,000 write cycle capability of eMLC more palatable.



Advanced SSD Controller Technology


The other alternative to increase MLC reliability and endurance is to use an approach similar to what STEC, Inc. has developed in its SSD controller technology to provide near-SLC life while maintaining MLC costs. This intelligent SSD controller approach also maintains the full read and write performance of original MLC media.


Most NAND flash is built on similar technology. What makes the difference for SSD implementations is the controller technology that manages how the data is written to flash and how errors are managed or corrected. All storage technologies have to deal with read and write errors. Digital recording allows for sophisticated error correction schemes that can increase the usable life of the recording media, as is the case with NAND flash technology.


Most suppliers of SSDs purchase the SSD controller from a third-party source, which means the solid-state supplier does not have direct control over how the flash is managed, regardless of whether or not they are the flash manufacturer. Companies that develop their own SSD controller technology have control over the flash management and error correction processes, which enables better SSD endurance.


Another aspect of how SSD controller technology can improve the life expectancy of flash storage involves how data is written to a flash cell to begin with. Storing and erasing data requires that a high voltage be applied between the cell substrate and the gate. By carefully managing the intensity of that charge (different functions require different charges), an optimized SSD controller can reduce or ‘soften’ the impact of data writes and reduce the potential damage caused by programming and erasures.


Advanced SSD controllers use these flash management techniques to reduce the breakdown of the flash cells. Combined with improved error correction processes, the result is increased flash endurance for MLC flash-based SSDs.


In the final analysis, SSD controller optimized flash storage, such as STEC’s controller architecture, can offer two times the lifespan of eMLC media without the performance degradation. In other words, this technology can provide as much as 60,000 program/erase cycles at MLC cost points. This provides data center solid-state storage with 60% of the life expectancy of SLC media, at significantly reduced costs.
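The endurance figures quoted in this article can be lined up side by side with a few lines of arithmetic. The P/E values below are the ones stated in the text; the dictionary name `PE_CYCLES` and the helper `relative_to_slc` are only illustrative.

```python
# Approximate program/erase ratings as quoted in the article.
PE_CYCLES = {
    "MLC": 10_000,
    "eMLC": 30_000,
    "controller-optimized MLC": 60_000,
    "SLC": 100_000,
}


def relative_to_slc(cycles):
    """Express a P/E rating as a fraction of the SLC rating."""
    return cycles / PE_CYCLES["SLC"]


for name, cycles in PE_CYCLES.items():
    print(f"{name}: {cycles:,} P/E cycles, {relative_to_slc(cycles):.0%} of SLC")
```

The 60,000-cycle figure is twice the eMLC rating and 60% of the SLC rating, which matches the two claims made in the paragraph above.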



Impact of the SSD Controller on MLC


The impact of the SSD controller on MLC endurance is leading to a significant reduction in the cost of deploying solid-state storage in the enterprise. It will enable all of the current implementation methods - static placement, caching or tiering - to be even more valuable since the size of the implementation can be effectively doubled for less money. This means larger static placement areas and fewer cache or tier misses. It also means that flash-only systems that have been leveraging eMLC media can now be twice as reliable, with an improvement in write performance, for approximately one-third the cost.


Of course, advanced SSD controller technologies can be applied to any flash format and effectively double its life expectancy in the process. This can make all forms of flash media more reliable and more consistent over time.



MLC Flash-based SSDs are Key to the Enterprise


Cost remains the key roadblock to the widespread adoption of MLC flash-based SSDs in the enterprise. While there are many technologies that can be used to optimize the space efficiency of flash-based solid-state storage, the simplest answer is to make the product less expensive. MLC brings that price point today, and with advancements in SSD controller technologies, those economics can be delivered with the performance, endurance and reliability that the enterprise data center demands.



STEC is a client of Storage Switzerland

