Although relatively low in capacity, solid state storage provides extremely high input/output operations per second (IOPS) performance that can potentially solve most storage I/O related challenges in the modern data center. The sticking point is how best to tap into this performance resource. The options are to permanently place select data sets on the solid state device (fixed placement), to use automated tiering to promote data from hard drives to the SSD tier as needs warrant, or to use solid state storage as a large cache in front of the hard disk tier. Vendors like EMC, NetApp and Dell are all attempting to develop solutions that bridge their customers to SSD but, as this article will explain, each of those solutions has its limitations.



Fixed Placement


Fixed placement on solid state storage may be acceptable for certain workloads where specific subsets of data can be placed on SSDs; database application hot files (e.g. indexes, aggregate tables, materialized views) are good examples. It's more difficult to use these types of devices more broadly, though, because most do not support a full complement of storage services (snapshots, replication, etc.), many don't have complete high availability options, and where high availability is offered the cost to implement it is simply too high, since it effectively doubles the price.



Automated Tiering


The storage vendor response to fixed placement has been automated tiering, which moves sections of data to high performance storage as they become active and demotes them as they become less active. While this strategy is effective and allows for broad use of SSD, it has its own set of problems. First, the storage system must support automated tiering; for many customers this means an upgrade to a new system, which tends to sit in the upper price band for storage systems. Second, depending on the vendor, the sections of data to be promoted can be quite large, up to 100MB. The constant reading and writing of large chunks of data to the high performance tier aggravates flash SSD wear-out, causing drives to fail prematurely due to write amplification. It also creates a substantial increase in the internal I/O that the storage controller must move, often causing spikes in controller CPU and memory usage that could otherwise be dedicated to other activities. Third, the time it takes for the storage controller to analyze data access patterns and begin promoting data to the SSD tier can delay the time to ROI by days or weeks. Finally, all of these systems require that data first be written to the mechanical hard drive tier and only later be promoted to the solid state tier. Writes are never placed directly on SSD, so the most resource intensive storage operation gains nothing from automated tiering.
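To make the mechanism concrete, the sketch below shows the general shape of chunk-based tiering: access counts are tracked per fixed-size chunk and whole chunks are copied between tiers on a periodic scan. It is an illustration only, not any vendor's implementation; the chunk size, thresholds and scan window are assumptions.

```python
# Minimal sketch of chunk-based automated tiering (illustrative only;
# chunk size and thresholds are assumed values, not any vendor's).

CHUNK_SIZE = 100 * 1024 * 1024   # e.g. a 100MB promotion granularity

class TieringEngine:
    def __init__(self, promote_threshold=1000, demote_threshold=50):
        self.access_counts = {}      # chunk_id -> I/Os seen in current window
        self.on_ssd = set()          # chunk_ids currently on the SSD tier
        self.promote_threshold = promote_threshold
        self.demote_threshold = demote_threshold

    def record_io(self, byte_offset):
        chunk_id = byte_offset // CHUNK_SIZE
        self.access_counts[chunk_id] = self.access_counts.get(chunk_id, 0) + 1

    def rebalance(self):
        """Periodic scan: promote hot chunks, demote cold ones. Every move
        copies an entire chunk, which is the internal I/O and flash-wear
        cost described above."""
        for chunk_id, count in self.access_counts.items():
            if count >= self.promote_threshold and chunk_id not in self.on_ssd:
                self.on_ssd.add(chunk_id)        # copy HDD -> SSD (large write)
            elif count <= self.demote_threshold and chunk_id in self.on_ssd:
                self.on_ssd.discard(chunk_id)    # copy SSD -> HDD
        self.access_counts.clear()               # start a new observation window
```

Note that nothing in this flow touches a write on its way in; data only reaches the SSD tier after it has already landed on disk and been observed as active.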



Cache Appliances


To alleviate some of these issues, several third party manufacturers have created external caching appliances. These systems sit inline between the servers accessing storage and the storage itself; in other words, all traffic must flow through the devices. As data is read from the disk systems it is written to the SSD cache and then served to the accessing clients. Like any other read cache, the active data is left in the solid state storage area, and if it is accessed again by the same or another client, read performance is accelerated. This creates a broad caching tier for the environment, providing a performance boost to more of the storage, but it may be too broad, since all data passing through these devices must be examined for cache appropriateness.
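The behavior described above is essentially a read-through cache, sketched below in simplified form. The interfaces for the SSD store and the backing array are assumptions made for illustration.

```python
# Minimal sketch of the inline read-cache behavior described above
# (illustrative only; the ssd_store and backend interfaces are assumed).

class InlineReadCache:
    def __init__(self, ssd_store, backend):
        self.ssd = ssd_store      # dict-like: block address -> data
        self.backend = backend    # the NAS/array behind the appliance

    def read(self, block):
        data = self.ssd.get(block)
        if data is not None:               # cache hit: served from flash
            return data
        data = self.backend.read(block)    # cache miss: go to the array
        self.ssd[block] = data             # populate cache for later readers
        return data

    def write(self, block, data):
        # A pure read cache passes writes straight through to the array and
        # invalidates any stale copy, so writes get no acceleration.
        self.ssd.pop(block, None)
        self.backend.write(block, data)
```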


Inline caching like this is very difficult to do in a block I/O network, and as a result most cache systems on the market today are designed for network attached storage (NAS), specifically NFS based storage, which limits their usefulness to those environments. Because of their inline nature, solid state caches are also vulnerable to the performance limitations of the storage network and the storage controller. Finally, the inline caching appliance itself can become a limit to scale and be overrun when many application servers channel their storage I/O through it.



The Universal Bottleneck


All of these solutions (fixed placement, automated tiering and cache appliances) suffer from the universal bottleneck of the storage controller and the storage network. None of them improves the performance of the storage network or the storage controller; in fact they often expose their shortcomings and force an upgrade to achieve the maximum benefit from the SSD investment. They also ignore the fact that the device needing the storage I/O performance boost is the application server or virtual host. To solve this problem some vendors have begun to deliver server based solid state caching technology.



What Is Server Based Solid State Disk Caching?


Server based solid state disk caching takes the concepts of the cache appliance and moves them into the server, typically via a PCIe card. This provides several significant advantages. First, the problem is fixed closer to the source (the application or hypervisor), and cached I/O does not need to traverse the storage network. Second, it allows for a very selective application of the technology, since not every server in the environment needs, or can even take advantage of, the accelerated storage performance that SSD brings. Instead of deploying something universally to solve a specific problem, it is usually better and more cost-effective to deploy it discretely, only where the problem exists. This also gives storage I/O performance a scale out capability: each server gets its own SSD cache, so there is no downstream bottleneck as there is with the appliance model. Server based SSD caching closely follows the scale out model that is common in virtualized data centers and enterprise cloud environments.


The key limitation of server based solid state disk caching thus far is that it is very protocol specific and can only work with the storage it is directly connected to. Typically that means direct attached storage, which does not match where storage actually resides in the larger data center: at the far end of a storage network. To solve this problem, companies like Marvell with its DragonFly Virtual Storage Accelerator are developing solutions that are protocol agnostic and work with local storage, IP based storage and FC based storage.



Write Acceleration


Another key shortcoming of most of these approaches is their inability to accelerate random write performance, which, again, is the most expensive I/O, from a resource perspective, that the server and storage handle. There is good reason for this. Making sure that writes are safely protected requires a significant investment in resources, and flash based SSD is significantly slower at writes than at reads. The big problem is that if there is a failure while a write is still in the cache, the data is left in an inconsistent and potentially corrupt state. That is why main memory DRAM can only be used for application read caching: caching writes there would be incredibly dangerous, since a power loss event would result in instant data loss and corruption. Equally concerning is the "write cliff" limitation facing SSDs, whereby random write performance drops sharply (often by 30% to 80%) fairly quickly due to garbage collection. The only current method to alleviate this is to over-provision SSDs with significantly more capacity to absorb write amplification, but that is an expensive work-around. Since most real-world workloads are a sizeable mix of reads and writes, a balanced approach that delivers consistently fast write caching is needed to complement traditional read caching.
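To see why a read-only cache's upside is capped by the write mix, consider a simple back-of-the-envelope model; the figures are illustrative assumptions, not benchmark results.

```python
# Back-of-the-envelope model: how much a read-only cache can help when a
# fraction of the I/O mix is writes. Figures are illustrative assumptions.

def effective_speedup(write_fraction, read_speedup):
    """Amdahl-style bound: writes still run at the old speed,
    only the read fraction is accelerated."""
    reads = 1.0 - write_fraction
    return 1.0 / (write_fraction + reads / read_speedup)

# Even with a 10x faster read path, a 30% write mix caps the overall
# gain at roughly 2.7x; at 50% writes it drops to about 1.8x.
print(effective_speedup(0.3, 10))   # ~2.70
print(effective_speedup(0.5, 10))   # ~1.82
```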


Marvell's DragonFly VSA is one of the few devices that accelerates writes as well as reads, by leveraging up to 8GB of DDR3 DRAM as an L1 write-back cache that front-ends commodity SSDs acting as an L2 write cache. In the case of a power failure, the Marvell VSA uses a capacitor to keep the DRAM powered until its contents can be flushed to dedicated onboard flash memory, and additional layers of protection can be added to guard against motherboard failure. Marvell's implementation of intelligent write acceleration is also attractive because it not only speeds up writes, but allows those writes to be coalesced and pruned for better network utilization and more efficient disk operation. Finally, the boards provide an API that enables applications and operating systems to force a cache flush when they need a consistent data set on physical storage for a snapshot or backup process.
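The sketch below shows the general shape of a two-level (DRAM L1 / SSD L2) write-back cache with coalescing and a forced flush, as described above. It is a conceptual illustration only, not Marvell's actual API; all names, sizes and interfaces here are assumptions.

```python
# Conceptual sketch of a two-level write-back cache with write coalescing
# and a forced flush. Not Marvell's API; all names and sizes are assumed.

class WriteBackCache:
    def __init__(self, ssd, backend, l1_capacity=8 * 1024**3):
        self.l1 = {}                 # DRAM: block -> latest data (coalesced)
        self.l1_capacity = l1_capacity
        self.ssd = ssd               # dict-like L2 flash staging area
        self.backend = backend       # networked array or local disk

    def write(self, block, data):
        # Acknowledge immediately from DRAM; repeated writes to the same
        # block overwrite in place, so only the final version moves on
        # (this is the coalescing/pruning benefit).
        self.l1[block] = data
        if sum(len(d) for d in self.l1.values()) > self.l1_capacity:
            self._destage()

    def read(self, block):
        if block in self.l1:
            return self.l1[block]    # newest data is still in DRAM
        if block in self.ssd:
            return self.ssd[block]
        return self.backend.read(block)

    def _destage(self):
        # Move coalesced dirty blocks from DRAM down to the SSD stage;
        # they migrate to the backend lazily, in larger batches.
        for block, data in self.l1.items():
            self.ssd[block] = data
        self.l1.clear()

    def flush(self):
        # Analogous to a forced cache flush hook: push everything to
        # physical storage so a snapshot or backup sees consistent data.
        self._destage()
        for block, data in list(self.ssd.items()):
            self.backend.write(block, data)
        self.ssd.clear()
```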



Summary


Solid state storage answered the question of how to solve storage I/O performance problems, but it introduced another series of questions about how best to utilize it. Caching technology can be a good answer to those questions, but it raises yet another: where to locate the cache. For NFS heavy workloads supported by a proprietary NAS, a network based cache may make sense. But for block I/O applications, especially when a variety of protocols need to be supported, server based caching becomes more attractive. To really make an impact, the key is to address the safety of caching write I/O, as Marvell has done, because a write cache begins to deliver ROI on day one; there is no warm up time. Solutions without a write cache cut their potential performance gain by 30% to 90%, depending on the percentage of I/O that is writes versus reads. Marvell's DragonFly approach gets it right by addressing both the random read and write cache at the same time, aligning with the real-world workload challenges that IT data center managers deal with every day.

George Crump, Senior Analyst

Marvell is a client of Storage Switzerland

 

Related Content

  Server Based Storage Acceleration