SSD as a Tier

Legacy storage vendors have tried to deliver solid state solutions in a variety of ways. The most common has been to implement SSD as a tier of storage, using drive form-factor solid state devices. The advantage of this approach is that the SSDs plug into the manufacturer’s existing storage shelves, making broad distribution easy for the supplier. But it is not easy for the storage manager. The SSD tier must be managed separately, and data must be manually moved to and from it as needed. The typical result is that data is moved to the high speed tier once and the tier is then seldom refreshed, because the IT staff gets busy with other projects. As an application ages it may no longer need the benefits of the high speed tier, or another application may at times have a more pressing need. In other words, this premium storage resource is used inefficiently. Poor resource utilization may be acceptable on less expensive mechanical storage, but not on premium-priced solid state storage. As a result, many manufacturers have begun (or are planning) to offer the ability to automatically move data between the SSD tier and the mechanical tier.

Automating the SSD Tier

Automated tiering gives the storage system the ability to migrate data to the faster solid state tier as it is being accessed. These storage systems use a sophisticated algorithm that measures I/O intensity to promote data into SSD as needed. While this approach is an improvement over manually moving data back and forth between tiers, it does have some shortcomings.
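The promotion logic described above can be sketched in a few lines of Python. This is a toy model under assumed mechanics, not any vendor's actual algorithm: it simply counts accesses per chunk and promotes a chunk once its access count crosses a hypothetical threshold.

```python
from collections import defaultdict

class TierPromoter:
    """Toy sketch of heat-based tier promotion (illustrative only):
    track I/O intensity per chunk and promote hot chunks to SSD."""

    def __init__(self, promote_threshold=100):
        self.heat = defaultdict(int)          # access count per chunk
        self.on_ssd = set()                   # chunks currently on the SSD tier
        self.promote_threshold = promote_threshold

    def record_io(self, chunk_id):
        """Record one I/O and report which tier served it."""
        self.heat[chunk_id] += 1
        if chunk_id not in self.on_ssd and self.heat[chunk_id] >= self.promote_threshold:
            self.on_ssd.add(chunk_id)         # would trigger a background copy to SSD
            return "promote"
        return "ssd" if chunk_id in self.on_ssd else "hdd"
```

Note that in this model a chunk is only served from SSD after it has already been accessed repeatedly from the mechanical tier, which is exactly the warm-up weakness the next paragraphs describe.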

One set of weaknesses comes from the technology itself. Most, if not all, implementations of this method send all new data (writes) to the mechanical hard drive tier first. It is not until the written data becomes ‘popular’ that it is promoted to the solid state tier. This means that the most expensive I/Os (writes), from a resource and performance perspective, always go to the slowest tier first. Another challenge with this method is that the most expensive tier of storage is not even utilized initially. In some implementations it can take hours for the SSD tier to ‘warm up’, or accumulate enough information on data ‘popularity’ to make move decisions. Then there is still the time required to write that data to the tier.

Finally, automated tiering involves a lot of copying back and forth between the disk and SSD tiers. When a set of data blocks is promoted to the SSD tier, other data must first be moved out to make room. This means that every promotion requires at least two writes: one to move data into the SSD tier and another to move an old chunk of data out to the mechanical tier. And these writes are not small; many storage system suppliers recommend promoted chunks of data as large as 128MB in size. In a busy system, which most SSD-enhanced storage systems are, this can add up to thousands or millions of extra data transfers of measurable size, creating a performance problem of its own.
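The scale of this background data movement is easy to quantify. The arithmetic below uses the 128MB chunk size from the text but a purely hypothetical promotion rate, so the result is illustrative rather than measured data.

```python
# Illustrative arithmetic: each promotion moves one 128 MB chunk into
# the SSD tier and evicts one chunk back out, i.e. two full-chunk writes.
CHUNK_MB = 128
WRITES_PER_PROMOTION = 2            # copy-in plus copy-out
promotions_per_hour = 10_000        # hypothetical rate for a busy system

extra_mb_per_hour = promotions_per_hour * WRITES_PER_PROMOTION * CHUNK_MB
print(extra_mb_per_hour / 1024, "GB of extra data movement per hour")
# At these assumed numbers, that is 2,560,000 MB (~2.5 TB) per hour of
# traffic that exists only to shuffle data between tiers.
```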

In addition, each storage system will need its own automated tiering storage software. That means that in an environment with multiple storage systems, different automated systems will have to be purchased, upgraded or replaced.

There is also the core challenge of the hardware itself. First, automated tiering will more than likely mean a forklift upgrade of the storage system, since the current system probably does not support it. In many cases this may even mean a move to a new supplier, with the obvious capital outlay and the time investment of learning a new storage system.

Finally, because these systems are simply putting fast technology into older hardware, in many cases the shelves won’t be able to provide the I/O bandwidth the SSDs need to reach their full potential. There is also the concern of wasted space and power. Flash storage does not need to be the physical size of a disk drive, nor does it need the power that a typical disk drive shelf power supply delivers. In many cases, thanks to the need for RAID redundancy and the form factor of the SSD technology, the high speed tier ends up much larger in capacity than it needs to be. Again, this is poor resource utilization in an area where maximum utilization is critical.

Basic Caching Appliances

An alternative to automated tiering is in-line caching that leverages SSD. This is similar to the cache memory installed on every drive or created by most database applications, but much larger in size. Basic caching appliances are inserted, logically, in front of storage systems and use SSD for their cache storage. All data, or at least the data from certain assigned volumes, travels through the appliance. The appliance uses standard caching algorithms to decide what should be stored in its internal solid state cache, leaving the least active data on the original mechanical disk storage system.

Caching appliances have their advantages. First, they don’t need to copy data out of the cache, because the data is also on the mechanical drives. When data is promoted to the cache it is not removed from the mechanical tier; the cache simply holds a copy. This also means that as data ages out of the high speed tier it does not need to be moved back down to the mechanical tier. It is already there, so the high speed copy can simply be discarded. Second, in most cases the cache can be implemented with no change to application or user settings. It also requires very little change to the underlying hardware and certainly does not require a forklift upgrade. Also, caching systems will work across multiple storage systems in the environment.
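The copy-not-move behavior described above can be made concrete with a minimal read cache in Python. This is a sketch under assumed mechanics (a simple LRU policy, with a dictionary standing in for the mechanical array), not any appliance's actual implementation. The key point it demonstrates is that eviction is a pure discard: nothing is ever written back to the backing store.

```python
from collections import OrderedDict

class ReadCache:
    """Minimal LRU read-cache sketch: the cache holds copies, so
    eviction is a simple discard with no copy-out to the disk tier."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing = backing_store       # stands in for the mechanical array
        self.cache = OrderedDict()         # block -> cached copy, in LRU order

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)  # refresh LRU position
            return self.cache[block], "cache hit"
        data = self.backing[block]         # fetch from the mechanical tier
        self.cache[block] = data           # keep a copy in the SSD cache
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict = discard; no write-back needed
        return data, "cache miss"
```

Contrast this with the tiering case: here an eviction costs nothing, because the authoritative copy never left the mechanical drives.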

Finally, since the appliance is designed to do just this one function, it can be tuned for more efficient use of memory (physical size and capacity) and I/O capabilities. The result is often that less solid state storage needs to be installed; the cache has a smaller data center footprint and yet still achieves better performance.

The challenge with basic caching appliances is that they are file system (NAS) based, which is of little value if the environment does not have a file system performance issue. The reality is that most databases and virtualized environments are still block storage (SAN) based. Block based systems are more likely to have performance issues, and they also have the architecture to take advantage of a high speed accelerator. Most databases and email systems sit on block based storage systems attached to low latency, high speed, dedicated storage networks. While using file systems for databases and virtualization has gained some popularity, most file systems are still used primarily for storing office productivity documents and would not see a performance improvement from a solid state cache.

Finally, most of these basic caching appliances are read-only, or the vendor strongly recommends that they be run in read-only mode. Like the other techniques, they will not deliver a performance boost to the write operations in an environment. They will also have the same ‘warm up’ time issue, where it may be a few days after implementation until the full performance of the cache is realized.

Advanced Caching Systems

The next step in solid state deployment, and potentially the most logical first step for many environments, is the use of an advanced caching system like those offered by Dataram. Their XcelaSAN product is a caching device that sits logically in front of the storage system but, unlike basic caching appliances, can be implemented with block devices. Like basic caching, it avoids the data movement back and forth between tiers. And because it is block based, it supports the applications and infrastructures most likely to need, and most able to take advantage of, cache based systems. It also means that most file system based environments can still be supported, as long as the NAS device accesses its storage through the SAN. A NetApp vFiler is an example of a NAS that can benefit from a block based, advanced caching system.

The second big difference with advanced caching is that it provides a performance boost to write as well as read I/O. This means that the most expensive storage I/O task is now finally accelerated, and that the advanced cache provides a performance boost as soon as it is implemented; in other words, no warm up time. Write caching also means that writes can be made to disk in a more consistent fashion, which optimizes the use of the mechanical array for those operations.

With write caching, however, comes additional responsibility. Since there may be a period of time when new data exists only in the cache, advanced systems have to be designed with a mirrored cache capability between two physically separate systems. That way, if one of the systems fails, data is still protected.
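The mirrored write-back design can be sketched as follows. This is an illustrative model of the general idea, not XcelaSAN's actual architecture: a write is acknowledged only after it lands in both cache nodes, and dirty data is later destaged to the mechanical array in a batch.

```python
class MirroredWriteCache:
    """Sketch of a mirrored write-back cache (assumed design): two
    physically separate cache copies protect data that has not yet
    been flushed to disk."""

    def __init__(self):
        self.primary = {}   # cache on node A
        self.mirror = {}    # cache on physically separate node B
        self.disk = {}      # stands in for the mechanical array

    def write(self, block, data):
        self.primary[block] = data
        self.mirror[block] = data      # synchronous copy to the partner node
        return "ack"                   # acknowledge only once both copies exist

    def flush(self):
        """Destage dirty blocks to the mechanical array; once data is on
        disk, the cached copies can be safely dropped."""
        self.disk.update(self.primary)
        self.primary.clear()
        self.mirror.clear()
```

Because the acknowledgment waits for both copies, a single node failure between the write and the flush cannot lose data, which is the responsibility the paragraph above describes.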

Advanced caching systems should receive strong consideration from organizations that need to boost storage performance. The technology allows broad use of solid state storage across a wide range of applications and storage systems. Most importantly, it provides a less expensive option than replacing an existing storage system.

Dataram is a client of Storage Switzerland

George Crump, Senior Analyst

Related Content

   Dataram Briefing Note