Traditionally, the storage capacity problem has been addressed by increasing drive densities. While this has allowed storage managers to stay just ahead of demand, the quantum leap has come thanks to a new way of doing things. Advanced technologies like data deduplication and in-line compression have allowed data centers to see an overnight, measurable increase in their ability to store data.


The other problem, storage I/O performance, will similarly be addressed by quantum leaps in technology. In fact, the need is greater with storage I/O performance because, unlike drive capacity, the per-drive throughput of a mechanical disk has not improved at all in the last decade, nor is there anything on the horizon from the disk drive market that will change this.


Mechanical drive performance has been stagnant while everything else that surrounds the drive has become significantly faster. Storage controllers, storage interconnections, network interconnections and CPUs have all seen significant increases in performance. The performance improvement in these other areas has led to more demanding applications which, of course, has led to more demands on storage I/O.


This lack of performance improvement on the part of mechanical drives has required workarounds by storage system manufacturers and storage managers to keep up with the demands of the applications. The most common workaround is to create large array groups with many drives in them. The more drive heads there are, the more likely it is that performance will improve. For this to hold, the application or users have to generate enough I/O requests to keep the high drive count busy. If there are enough pending I/O requests, then the major challenge to the high drive count RAID group approach is cost, due to the high quantity of drives and the cost to operate the system.
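The dependence on pending I/O can be sketched with a deliberately simple model. The function name, the assumption that each outstanding request keeps at most one drive busy, and the figure of 180 IOPS per drive are all illustrative assumptions, not measurements.

```python
def achievable_iops(drive_count, iops_per_drive, outstanding_requests):
    """Estimate array-group IOPS under a simplified model.

    Assumption: each outstanding request keeps at most one drive busy,
    so drives beyond the number of pending requests sit idle.
    """
    busy_drives = min(drive_count, outstanding_requests)
    return busy_drives * iops_per_drive


# Hypothetical array group: 48 drives at an assumed 180 IOPS each.
for pending in (8, 48, 200):
    print(pending, "pending requests ->", achievable_iops(48, 180, pending), "IOPS")
```

Under this model, adding drives beyond the number of requests the application keeps in flight adds cost but no throughput.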


If the application or users cannot generate enough I/O requests, then the only way to gain performance is to improve response time, that is, to reduce the time it takes to handle each request. The problem with mechanical drives is that once the drive is at 15K RPM, faster mechanics aren’t available. The storage manager is left with short stroking the drives, which wastes capacity, or adding solid state disk (SSD), which can be expensive. The challenge with both options is cost. Short-stroked drives are the most expensive drives, formatted to one third of their normal capacity, and SSDs, while becoming affordable, are still more expensive than mechanical drives.
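The cost penalty of short stroking is simple arithmetic: the price paid per raw GB is spread over only the usable fraction of the drive. The sketch below assumes a hypothetical function name and a hypothetical price; only the one-third figure comes from the text.

```python
def cost_per_usable_gb(price_per_raw_gb, usable_fraction):
    """Effective cost per GB when only part of each drive is formatted."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    return price_per_raw_gb / usable_fraction


# Hypothetical price of $2.00 per raw GB, drive formatted to 1/3 capacity.
effective = cost_per_usable_gb(2.00, 1 / 3)
print(f"Effective cost: ${effective:.2f} per usable GB")
```

Formatting a drive to one third of its capacity triples the effective cost per GB, whatever the starting price.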


One of the economic realities of high performance and high cost storage is that it should be kept as close to full as possible. Wasted space on storage that costs $1 per GB is a problem, but wasted space on storage that costs $15 per GB is a much bigger problem.
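To make the comparison concrete, the sketch below prices the unused space on two tiers using the $1 and $15 per GB figures above. The function name, the 1,000 GB capacity, and the 600 GB of data are hypothetical values chosen for illustration.

```python
def wasted_space_cost(capacity_gb, used_gb, cost_per_gb):
    """Money tied up in capacity that holds no data."""
    if used_gb > capacity_gb:
        raise ValueError("used_gb cannot exceed capacity_gb")
    return (capacity_gb - used_gb) * cost_per_gb


# Hypothetical tier: 1,000 GB provisioned, 600 GB actually in use.
for cost_per_gb in (1, 15):
    idle = wasted_space_cost(1000, 600, cost_per_gb)
    print(f"${cost_per_gb}/GB tier: ${idle} of idle capacity")
```

The same 400 GB of idle space ties up fifteen times as much money on the high-performance tier.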


Additionally, the data on this storage should be the most active data in the environment. Typically, the problem is that data is moved to this storage one time and never touched again. The expense of this tier of storage demands the flexibility to move data in and out of it on an as-needed basis.


The other challenge with fixing application storage performance with these methods is that the fix is global. The entire storage tier, or a large part of it, must be upgraded. Essentially, all the applications on that tier get a performance boost. Some vendors will even claim that this is a good thing, since all these other applications are getting a "free" upgrade. In storage performance there is no such thing as free. And in most cases, the applications getting the "free" upgrade would never actually show a performance gain, because they would never be able to push the new storage configuration hard enough to use the improved throughput.


What is needed is a more granular approach to fixing application storage performance. This approach should focus on just the applications, and the subsets of data within those applications, that can actually benefit from a higher performance storage tier. Upgrading all the storage for the benefit of the few applications that can actually take advantage of it leads to runaway storage costs and may be one of the reasons that so many data centers have significantly more primary storage than they need.


How can a storage manager start fixing application storage performance problems without having to constantly, manually, move data to and from an SSD or similar device? The reality is that they have neither the time nor the resources to accomplish this type of detailed task and some level of automation is required.


The groundwork is already in place for optimizing application storage performance. It’s called the cache. Using technology similar to that used in servers and disk arrays, a layer could be added to the storage infrastructure that automatically moves a copy of the application data into a higher speed storage tier. That tier could be populated with RAM or Flash-based solid state disk drives.


This application accelerator needs to be more intelligent than a traditional cache, avoiding the pitfall of treating all data equally and simply dumping it into fast storage. An application storage performance accelerator needs to understand how the data for given applications is being accessed and which data can be best served by a higher performing tier of storage. The storage manager also needs to have override ability so they can control which applications reside in cache, regardless of the data’s profile or usage characteristics.
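The two behaviors described above, selective admission based on access patterns and a manual override, can be illustrated with a toy cache. This is a minimal sketch, not how any particular product works: the class name, the "admit after N accesses" rule, and the LRU eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict, defaultdict


class AcceleratorCache:
    """Toy read cache: admits only re-read blocks and honors manual pins."""

    def __init__(self, capacity, admit_after=2):
        self.capacity = capacity        # maximum number of cached blocks
        self.admit_after = admit_after  # misses required before a block is cached
        self.misses = defaultdict(int)  # miss count per (app, block) key
        self.cache = OrderedDict()      # cached keys, least recently used first
        self.pinned_apps = set()        # apps the storage manager forces into cache

    def pin(self, app):
        """Manual override: cache this app's blocks on first access."""
        self.pinned_apps.add(app)

    def read(self, app, block):
        """Return True on a cache hit, False when backing storage is read."""
        key = (app, block)
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return True
        self.misses[key] += 1
        if app in self.pinned_apps or self.misses[key] >= self.admit_after:
            self._admit(key)
        return False

    def _admit(self, key):
        # Make room by evicting the least recently used unpinned block.
        while len(self.cache) >= self.capacity:
            victim = next(
                (k for k in self.cache if k[0] not in self.pinned_apps), None
            )
            if victim is None:
                return  # every cached block is pinned; skip admission
            del self.cache[victim]
        self.cache[key] = True
```

A block read once is never cached, so one-time scans do not displace active data, while a pinned application's blocks are cached on first access and are never chosen as eviction victims.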


Companies like Storspeed are taking these capabilities a step further by presenting the end users with a GUI that provides detailed analytics of application data access patterns. This provides the admin with a visual representation of what data should be used by the application accelerator and how that profile should change over time.


These devices or appliances are placed in line, between the application server and the storage or NAS head that they are going to accelerate. Because their goal is to speed up storage access, they need the ability to adjust performance and scale as needed at a granular level. For example, the Storspeed appliances can be clustered together, offering near-linear performance gains as nodes are added to the environment.


When optimizing for performance, every environment is unique. As a result the application storage performance solution needs to scale granularly as well. For example, the system should support the addition of nodes to improve inspection and cache processing times, network connections to improve bandwidth performance between the storage and the application and RAM to increase the size of the cache itself. This kind of flexibility is critical when trying to design a solution that fits the organization’s performance demands as well as its budget. Finally, the solution should be implemented with little or no change to the current environment.


An application storage performance accelerator like that offered by Storspeed delivers an automated means to address application performance issues and to do so more cost effectively than the traditional workarounds available to mechanical drives. A clustered application storage performance accelerator allows existing drives to be reconfigured and formatted to their full capacity for maximum utilization. In essence, this returns capacity to the environment and at the same time, dramatically improves performance.

George Crump, Senior Analyst

 