Which Automated Tiering Solution Is Best?
Automated Tiering Solutions (ATS) dynamically move active data to a faster tier of storage, most commonly DRAM or Flash-based Solid State Disk (SSD). The concept is to provide faster performance across a broad range of applications regardless of the original back-end storage. This market is maturing quickly, with new organizations offering different solutions. Moreover, automated tiering is also being added as a capability to traditional manufacturers’ storage offerings. As IT professionals examine this market, they may struggle to determine which solution is best for their environments.
Thursday, February 11, 2010
Automated tiering solutions typically leverage cache-like algorithms to move data to and from high-speed DRAM storage and apply this performance boost across a wide variety of workloads. This differs from the traditional solid state disk system, where SSDs are captive to individual servers and their high performance is dedicated to specific workloads on those servers.
The first step is to decide when to deploy an ATS. An ATS is ideally suited to application and workload mixes that vary in performance requirements, to infrastructures distributed across multiple storage devices, and to environments with large numbers of clients. If a workload or application has a very specific dataset that consistently needs high performance and is localized to a single server, then a traditional SSD solution might be more applicable.
Today there are two types of dynamic tiering solutions emerging: caching ATS and persistent ATS. Caching ATS solutions are similar in architecture to traditional cache deployments. These solutions live in the network transparently between clients and back-end storage and extend caching capabilities by leveraging significantly more available capacity to store ‘hot’, or frequently accessed, data. They may also, as is the case with Storspeed, provide a level of intelligence and management capabilities heretofore unavailable in file-based storage infrastructures. Enhanced management software shows how the ATS is being utilized, provides recommendations on what application data should be using the ATS and allows manual overrides to lock out specific data sets. These systems are a cache in every sense of the word. Data remains in the cache area only as long as needed, and the actual mechanical drives in the back-end file-based storage are quickly updated as data changes.
Since the systems are caches to the mechanical storage, there are virtually no changes in storage access or storage management procedures. No changes are needed for user logins or application access, nor are any changes required to data protection procedures. Cache ATS solutions generally combine DRAM and Flash/SSD technologies with intelligence to optimize application performance regardless of the type of back-end storage. Essentially, this form of ATS is designed to enhance the current environment, not replace it.
Sitting in-line between clients and servers, cache ATS solutions must provide the same or better availability characteristics as the storage infrastructure. These solutions offer a broad range of options, including component redundancy, cluster-based resiliency, network redundancy and, in the case of Storspeed’s solution, the ability to bypass the cache ATS altogether, ensuring client-storage access at all times.
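The caching behavior described above can be pictured with a small model. The following Python sketch is illustrative only and does not represent any vendor's implementation: the class name CachingTier, the least-recently-used eviction policy, the bypass flag and the dictionary standing in for back-end storage are all assumptions made for the example. It shows hot data being held in a limited fast tier, the back end being updated immediately on every write, and a bypass mode that sends clients straight to the back end.

```python
from collections import OrderedDict


class CachingTier:
    """Toy write-through LRU cache in front of a back-end store (hypothetical model)."""

    def __init__(self, backend, capacity):
        self.backend = backend        # a dict standing in for back-end file storage
        self.capacity = capacity      # maximum number of entries held in the fast tier
        self.cache = OrderedDict()    # ordered from coldest to hottest entry
        self.bypass = False           # when True, requests go straight to the back end
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if self.bypass:
            return self.backend[key]
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)   # mark as most recently used
            return self.cache[key]
        self.misses += 1
        value = self.backend[key]         # fetch from the slower back end
        self._admit(key, value)
        return value

    def write(self, key, value):
        # Write-through: the back end is updated right away, so it never lags.
        self.backend[key] = value
        if self.bypass:
            self.cache.pop(key, None)     # drop any copy that would now be stale
        else:
            self._admit(key, value)

    def _admit(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the coldest entry
```

Because every write reaches the back end before the call returns, existing backup and data protection procedures see current data, which is the property the cache-based approach relies on.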
Persistent ATS differs from a network-resident cache ATS solution by providing a permanent performance tier. Persistent ATS systems appear as new mount points in the storage infrastructure that sit in front of the original back-end storage. They gradually trickle modified cached data to the original disk-based storage system as needed. Data resides in the ATS for a significantly longer period of time. These systems are designed to become a superset of the current storage system, relegating it to a less active role. Internally, the persistent ATS method uses a combination of tiers, often including DRAM, Flash SSD and SAS-based mechanical hard drives. The legacy NAS devices that it supersedes form a fourth tier.
With the persistent ATS method, data is migrated through these different tiers based on access history. As data changes, the original storage system is updated, but asynchronously, so as not to affect the overall performance of the ATS. This leads to some time lag between the latest data on the ATS and the data on the original storage system.
The persistent ATS method has one challenge: all data must be explicitly routed through it. Users and applications need to have their paths modified so that they speak directly to the ATS. Backup methods don’t change. The original storage tier is backed up as normal, and the ATS can be programmed to flush its cache before the backup window opens.
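The asynchronous update and the pre-backup flush described above amount to a write-back design. The Python sketch below is a simplified illustration under stated assumptions, not a description of any product: the class name PersistentTier, the trickle and flush method names and the oldest-change-first ordering are hypothetical, and a dictionary stands in for the original storage system. It shows why the original storage can lag behind the ATS and how a flush closes that gap.

```python
class PersistentTier:
    """Toy write-back tier: writes land here first and reach the origin later."""

    def __init__(self, origin):
        self.origin = origin   # a dict standing in for the original back-end storage
        self.data = {}         # data held in the performance tier
        self.dirty = {}        # keys changed here but not yet copied, oldest first

    def read(self, key):
        if key not in self.data:
            self.data[key] = self.origin[key]   # pull data in on first access
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value
        self.dirty[key] = None   # the origin is now stale for this key

    def trickle(self, max_items):
        """Copy up to max_items pending changes to the origin; return the count."""
        copied = 0
        for key in list(self.dirty)[:max_items]:
            self.origin[key] = self.data[key]
            del self.dirty[key]
            copied += 1
        return copied

    def flush(self):
        """Copy every pending change to the origin, e.g. before a backup window."""
        return self.trickle(len(self.dirty))
```

Between a write and the next trickle or flush, the origin still holds the old value. That window is the time lag noted above, and it is why a flush is scheduled before the original tier is backed up.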
This change requirement also means that if the persistent ATS fails, all the users and applications must be modified again to access data directly from the original storage, unless the ATS has some form of high availability option. The work involved in returning to operations after a failure makes high availability a critical component of the ATS offering and may drive up the cost of the system.
There are two decision points when selecting an ATS. The first is to determine whether the environment lends itself to the strengths of an ATS. These include a broad implementation across a variety of applications, large numbers of clients and varying times when peak I/O is needed per application. The second is to decide which type of ATS is needed. If the goal is greater performance and visibility into your current storage infrastructure without disruption, then the cache-based ATS should be examined. If the goal is to essentially replace your current storage subsystem, then persistent ATS should be considered as well.
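The two decision points can be written down as a simple checklist. The Python sketch below is only one reading of the guidance in this article, offered as an illustration: the Environment fields, the recommend function and the returned labels are hypothetical names, and the yes/no inputs deliberately avoid thresholds that the article does not define.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    varied_workloads: bool          # performance needs differ across applications
    multiple_storage_devices: bool  # infrastructure spans several storage devices
    many_clients: bool              # large numbers of clients
    single_server_hot_set: bool     # one fixed dataset on one server needs the speed
    replace_existing_storage: bool  # the goal is to supersede the current storage


def recommend(env: Environment) -> str:
    # Decision point 1: does the environment play to the strengths of an ATS?
    ats_fit = (env.varied_workloads
               or env.multiple_storage_devices
               or env.many_clients)
    if not ats_fit:
        if env.single_server_hot_set:
            return "traditional SSD"
        return "no clear ATS fit"
    # Decision point 2: enhance the current storage, or replace it?
    if env.replace_existing_storage:
        return "persistent ATS"
    return "cache-based ATS"
```

For example, an environment with varied workloads and many clients whose goal is to keep the existing storage in place would come back as "cache-based ATS" under these assumptions.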
George Crump, Senior Analyst
This Article Sponsored by Storspeed