Confirming a Storage I/O Problem


The first step is to validate that there really is a storage performance problem. When deciding if there is a storage bottleneck, it makes sense to first look at overall CPU utilization within the compute infrastructure. If utilization is relatively low (below 40%), then the compute environment is spending the majority of its time, 60% or more, waiting on something. What it is typically waiting on is storage I/O.


To confirm the existence of a storage bottleneck, system utilities like PERFMON provide metrics that offer insight into disk I/O bandwidth. If there is not much variance between peak and average bandwidth, then storage is likely the bottleneck. On the other hand, if there is a significant variance between peak and average disk bandwidth utilization, but CPU utilization is still low, as outlined above, then this is a classic sign of a network bottleneck.
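
A minimal sketch of this check is shown below, using the cross-platform psutil library rather than PERFMON itself (an assumption; any monitoring tool that exposes CPU utilization and disk throughput will do). The 40% threshold and the 30-second sampling window are illustrative values taken from the discussion above, not tuning recommendations.

```python
import psutil

SAMPLES = 30          # number of one-second samples (illustrative window)
cpu, disk_mbps = [], []

last = psutil.disk_io_counters()
for _ in range(SAMPLES):
    cpu.append(psutil.cpu_percent(interval=1))        # % CPU over the last second
    now = psutil.disk_io_counters()
    moved = (now.read_bytes - last.read_bytes) + (now.write_bytes - last.write_bytes)
    disk_mbps.append(moved / 1e6)                     # MB transferred in that second
    last = now

avg_cpu = sum(cpu) / len(cpu)
avg_bw, peak_bw = sum(disk_mbps) / len(disk_mbps), max(disk_mbps)

if avg_cpu < 40 and peak_bw > 0 and avg_bw / peak_bw > 0.8:
    print("Low CPU and disk bandwidth flat near its peak: storage is the likely bottleneck")
elif avg_cpu < 40:
    print("Low CPU but bursty disk bandwidth: suspect the network path instead")
else:
    print("CPU is busy; storage may not be the limiting factor")
```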


In the legacy model of a single application running on a single server, the first step is to add disk drives to the array and build RAID groups with a high population of drive spindles. The problem is that a lone application on a single server can only generate so many simultaneous disk I/O requests, and to perform optimally, each drive needs to be actively servicing a request. In fact, most manufacturers recommend that an application generate two requests per drive in the RAID group to ensure effective spindle utilization. As long as there are more simultaneous requests than there are drives, adding more drives will scale performance until the storage system controller or NAS head is saturated.
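
The back-of-the-envelope sketch below illustrates the "two outstanding requests per drive" rule of thumb. The queue depth and RAID group size are illustrative assumptions, not measurements from any particular array.

```python
def drives_effectively_used(concurrent_requests: int, requests_per_drive: int = 2) -> int:
    """How many spindles a workload can keep busy at the recommended queue depth."""
    return concurrent_requests // requests_per_drive

# A lone application that sustains 32 simultaneous I/O requests...
app_queue_depth = 32
print(drives_effectively_used(app_queue_depth))   # -> 16 drives kept busy

# ...gains little from a 60-drive RAID group: the remaining spindles sit idle
# until more concurrent requests arrive, long before the controller saturates.
```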


The legacy model, in most cases, can't generate enough I/O requests to feed drive mechanisms and saturate the performance capabilities of the controller or NAS head. In fact, the traditional storage manufacturers are counting on the legacy model, because any other scenario exposes a tremendous performance-scaling problem with their systems.



Storage Performance Demands of the Modern Data Center


The modern data center no longer resembles the legacy model. In general, it consists mainly of high performance virtualization servers that participate in what is effectively a large compute cluster for applications. Each of the hosts within these clusters has the potential to house multiple virtual servers, each with its own storage I/O demands. Servers that initially hosted just 5 to 10 virtual machines now have the potential to hold 20 to 30 virtual machines each.


That is 30 highly random workloads per host in the virtualized cluster; such environments easily scale to 300+ workloads, and 1,000-plus virtual machines are not uncommon, each driving its own storage I/O requests. Consequently, current VM environments can easily demand high drive counts within storage arrays. This significantly heightens the risk of saturating the controllers or heads of current storage platforms.
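
A rough sizing sketch of that aggregate demand follows. All figures here (hosts, VMs per host, IOPS per VM, IOPS per spindle) are illustrative assumptions chosen only to show the arithmetic, not vendor specifications.

```python
hosts            = 10      # virtualization hosts in the cluster
vms_per_host     = 30      # dense modern hosts, per the discussion above
iops_per_vm      = 50      # modest random I/O per virtual machine
iops_per_spindle = 150     # a typical fast drive under random load

total_vms   = hosts * vms_per_host                 # 300 workloads
demand_iops = total_vms * iops_per_vm              # 15,000 random IOPS
spindles    = -(-demand_iops // iops_per_spindle)  # ceiling division -> 100 drives

print(total_vms, demand_iops, spindles)
```

Even with these conservative per-VM numbers, the drive count needed to service the cluster quickly approaches the point where the controller or NAS head, not the spindles, becomes the limit.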


Furthermore, in many environments there are specialized applications that are the inverse of virtualization: a single application is run across multiple servers in parallel. These applications are not limited to the commonly cited examples of simulation jobs in chip design or SEG-Y data processing in the energy sector. There are many others: DNA sequencing in bioinformatics, engine and propulsion testing in manufacturing, surveillance image processing in government, high-definition video in media and entertainment, as well as many Web 2.0 projects.


Many if not most companies now have one or more applications that fall into one of the above categories. As is the case with server virtualization, these applications can create hundreds if not thousands of storage I/O requests. Once a high enough concentration of disk drives is configured in an array, the limiting performance factor shifts from drive resource availability to the number of I/O requests that can be effectively managed at the controller level.



The Key Issues


As a result, the modern data center now faces two key issues. First, while adding drives to the system does scale performance in these high I/O-request environments, there is often a limitation on the size of the file system. This limitation forces the use of very low capacity drives, which increases the expense of the system considerably. Alternatively, if drive capacities that strike a better price-per-capacity ratio are used, the file system size limitation caps the number of drives that can be added to the file system assigned to a particular workload, thus limiting performance potential.
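
The sketch below illustrates that trade-off. The 16TB file system cap and the drive capacities are assumptions used only to make the arithmetic concrete.

```python
FS_LIMIT_TB = 16  # assumed file system size limit

def max_drives_per_file_system(drive_capacity_tb: float) -> int:
    """How many spindles fit under the file system cap at a given drive size."""
    return int(FS_LIMIT_TB // drive_capacity_tb)

# Small, expensive drives keep spindle counts (and performance) up...
print(max_drives_per_file_system(0.3))   # 300GB drives -> 53 spindles
# ...while cost-efficient large drives cap out at a handful of spindles.
print(max_drives_per_file_system(2.0))   # 2TB drives   -> 8 spindles
```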


Neither option is ideal. The challenge is that storage systems have to continually evolve to provide ever higher drive counts and capacities to achieve efficiencies that make sense. Likewise, file systems must be able to leverage increasingly higher disk drive counts in order to benefit from the efficiencies of that model. Regardless, higher drive counts will eventually lead to saturation of the storage controller or NAS head. This is the reason there is typically a performance bell curve in storage system specifications: while a given storage system may support 100 drives, it may reach its peak performance at only half its published capacity, 50 drives.



Using Legacy Storage to Address a Modern Challenge


As discussed, throwing drive mechanisms at the problem quickly exposes a more difficult bottleneck to address: the storage controller or NAS head. The traditional solution to this bottleneck was to add increasingly more horsepower in a single chassis. If it was eventually going to require a V12 engine to fix the storage controller, buy the V12. With the equivalent of a V12 storage controller or NAS head engine in place, disk drive count would have a chance to scale to keep up with storage I/O requests. This V12 storage engine often has additional storage network ports connecting the SAN or NAS to the rest of the infrastructure.


There are several flaws with this technique. First, especially in a new environment or for a new project, there may not be a need for that much processing power upfront. If all that is required is a V4, why pay for a V12 now? This is especially true in technology, where the cost of additional compute power will decrease dramatically over the next couple of years. In essence, purchasing tomorrow's compute capacity at today's prices results in a dramatic waste of capital resources.


Second, it is likely that as the business grows and the benefits to revenue creation and CAPEX controls of server virtualization or compute clustering are realized, there will be a need to scale well beyond the current limitations of the V12 example. The problem is that there is no way to simply add two more cylinders to a V12. Instead, a whole new engine must be purchased.


This requirement may come from the need to support additional virtualization hosts with even denser virtual machine populations, or from increased parallelism in a compute grid. It could also come from the storage controller or NAS head being saturated by the number of drives to which it must send requests and from which it must receive responses. Unfortunately, the upgrade path for most storage systems is not granular. If all that is needed is more inbound storage connectivity, the whole engine must be thrown out.


This has a dramatic impact well beyond the additional cost of a new engine. Now work must all but stop while decisions are made about what data to migrate and when to start the migration, and then while waiting for the migration to complete. In some of the simulation environments, where data sets can be measured in the hundreds of TBs, this could take weeks if not longer.


Finally, even if the organization could cost-justify the purchase of a V12 storage engine, rationalize the eventual need to buy a V16 in the future and deal with the associated migration challenges, there is still the underlying problem of file system size. Most file systems are limited to a range of 8 to 16TB. While some have expanded beyond that, they do so at the risk of lower performance expectations.


As stated earlier, the impact of a limited file system size is felt both in the inherent capacity limit and in the limited number of spindles that can be configured per file system. Again, if higher capacity but similarly performing drives are purchased, it no longer takes many drives to reach the capacity limit of a file system.


File system limitations also impact the speed at which new capacity can be brought online and made available to users. While many storage solutions feature live, non-disruptive disk storage capacity upgrades, and some even allow for uninterrupted expansion of existing LUNs or file systems, these conveniences break down when a file system is at its limits.


When a file system is at its maximum size, the new capacity has to be assigned to a new file system. Then time has to be spent deciding which data to migrate to the new file system. In this instance, downtime is often incurred while the move takes place.


One potential workaround for limited file system size is virtualized file systems that logically sew two disparate file systems together, even when the components of the file systems are on different storage controller heads. While these solutions work well to help with file system and storage management, they do little to address storage performance challenges. This is because the level of granularity is typically at a folder level or lower. As a result, the individual heads or storage controllers cannot simultaneously provide assistance to a hot area of a particular file system, and once again the single head or controller becomes the bottleneck.


As stated in our earlier article, NAS is a preferred platform for these applications, but many customers look to a SAN to address some of the performance problem. The reality is that both storage types share a similar limitation: they can be bottlenecked either at the data path going into the storage controller or NAS head, or at the processing capability of the controller/head itself.



The Answer Is in Front of Us


The answer to the storage I/O problem is to leverage the same technology that moved the bottleneck to storage in the first place: scale out the storage environment the same way the compute layer has been scaled out. By developing a clustered approach similar to Isilon's Scale-out NAS, drive spindles can be added to match the number of requests generated by the compute platform on a pay-as-you-grow basis, without worrying about hitting the performance wall of legacy storage systems.


In our next article we will discuss how scale-out NAS may be the ideal storage solution for the modern scale-out data center.