What Your SAN Fabric Manager Software Isn't Telling You
Modern SAN switches all come with SAN fabric manager software. Most of these products include a basic monitoring capability that’s free with the switch and then, for an extra charge, offer a more advanced module that provides greater reporting on what is going on inside the switch. Beyond that, there are software and hardware solutions available from third-party developers that can provide greater insight than these built-in tools. When does it make sense to leverage one of these solutions instead of counting on your SAN switch software?
The Shortfalls of Switch Fabric Management Software
The key component of any monitoring software is how often it gathers data about the environment that it’s managing. This process is called “polling”, in which the software collects details from the components (or “elements”) in the environment and then presents those details in a form that is easily readable by an administrator. Polling takes processing power to reach out to these components, gather the details, copy the data back to where the monitoring software is located and then assemble and store that data. Quite simply, the more often the monitoring software polls for data the more horsepower is needed to process all that data. This monitoring data is typically polled from the switch every 5 to 20 minutes.
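To put rough numbers on that relationship, here is a minimal sketch of how the polling workload grows as the interval shrinks. The function name and the 96-port element count are illustrative assumptions, not figures from any vendor, and the calculation counts only poll requests, not the processing cost of each one.

```python
def polls_per_day(interval_minutes: int, elements: int = 1) -> int:
    """Number of poll requests issued in 24 hours.

    Assumes every element is polled once per interval; the cost of each
    individual poll is not modeled here.
    """
    minutes_per_day = 24 * 60
    return (minutes_per_day // interval_minutes) * elements


# Hypothetical 96-port switch, treating each port as one polled element.
PORTS = 96
for interval in (20, 5, 1):
    print(f"{interval:>2}-minute interval: "
          f"{polls_per_day(interval, PORTS)} polls per day")
```

Moving from a 20-minute interval to a 5-minute interval quadruples the number of polls, which is why the interval is the first thing a vendor trades away when processor cycles are scarce.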
The monitoring software that comes from the SAN switch vendors is no different. Like most dedicated monitoring solutions, switch vendors’ software has to poll at specific intervals to capture the information about what is going on inside their switches. But that’s where the similarities end. The storage switch is different from almost any other element to be managed in the environment. A switch handles a continuous flow of data and, as we have seen with solid state storage and server virtualization, microseconds and milliseconds can make a difference.
A switch is essentially a purpose-built appliance that has a lot of I/O ports and is designed to move traffic through those ports as quickly as possible. Like any other appliance, it has a processor that manages those transactions. This same processor also handles other routines, such as monitoring the switch and reporting on its findings, as well as provisioning and zoning. As the rate of this monitoring increases, more processing power must be allocated to it, which means that fewer cycles are available for processing I/O traffic.
No switch vendor is going to allow I/O bandwidth to suffer, so vendors sacrifice polling frequency in order to maintain high I/O performance levels. If an error occurs in the storage network cabling or on the storage network itself and causes an I/O storm, for example, the switch has to compensate by applying additional processing power to the I/O ports. This leaves fewer cycles for the monitoring process, which means less frequent polling and, ironically, cripples the very thing that could highlight the problem.
Finally, the storage network switch is not the ‘be-all and end-all’ of the storage network. While it’s true that the switch could be considered the center of the universe, servers, applications, HBA I/O cards and of course the attached storage can all be factors in causing storage problems, as well as degrading performance. Troubleshooting these problems really requires a holistic view of the entire storage infrastructure, something that SAN switch management software is unable to provide.
Dedicated SAN monitoring appliances
The alternative to using the SAN monitoring software that’s provided by the switch vendor is to use a third-party monitoring application. These include software-only solutions and appliance-based solutions with hardware network taps that provide in-line, real-time analysis, like those from Virtual Instruments. By taking a network perspective, these third-party solutions provide a more holistic view of the storage environment.
Purpose-built hardware solutions are not dependent on the processing power of the SAN switch which may, as described above, be too busy to provide any effective analysis. Such solutions also tend to provide a more comprehensive view of the storage environment, again, something that SAN switch monitoring software typically lacks. The best solutions are ones that add no latency at all to the performance of applications.
Purpose-built SAN monitoring hardware solutions, like those from Virtual Instruments, can go further still. Because they are non-intrusive and provide out-of-band measurements and analysis, they capture information in real time without affecting storage network performance. Monitoring products based on discrete polling intervals simply can’t do this.
The value of this approach is that it supports a more efficient and detailed analysis by being able to look at every transaction between every element in the storage infrastructure stack. Also, because these solutions are real-time in nature, they won’t miss performance spikes and/or bottlenecks that SAN switch software can easily miss.
For example, an application may experience a spike every three minutes that temporarily brings the storage network to a halt. That spike may last only 30 seconds, but that can be enough to take production applications offline or at least interrupt critical processes. SAN monitoring software that polls only every five or 20 minutes may never take a sample at the right moment to detect this problem.
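The scenario above can be checked with a small simulation. This is a sketch under simplifying assumptions that are not in the original example: the spike recurs exactly every 180 seconds, each poll is an instantaneous sample that sees the spike only if it lands inside the 30-second window, and `offset` (a hypothetical parameter) is how many seconds after the first poll the first spike begins.

```python
def spike_active(t: int, period: int = 180, duration: int = 30,
                 offset: int = 0) -> bool:
    """True if second t falls inside one of the recurring spike windows."""
    return (t - offset) % period < duration


def polls_that_see_spike(poll_interval: int, horizon: int, offset: int) -> int:
    """Count polls, taken every poll_interval seconds, that land in a spike."""
    return sum(spike_active(t, offset=offset)
               for t in range(0, horizon, poll_interval))


HORIZON = 24 * 3600  # simulate one day, in seconds
for offset in (0, 10, 45):
    for interval in (300, 1200):  # 5-minute and 20-minute polling
        total = len(range(0, HORIZON, interval))
        hits = polls_that_see_spike(interval, HORIZON, offset)
        print(f"offset={offset:>2}s interval={interval:>4}s: "
              f"{hits} of {total} polls saw a spike")
```

Under these assumptions, a day contains 480 spikes. When the first spike starts 10 seconds after the first poll, neither the 5-minute nor the 20-minute schedule ever lands inside a spike window, so the problem goes unseen all day. With other offsets some polls do catch it, which means detection depends on the accident of timing rather than on the severity of the problem.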
A real-time product that’s examining each Fibre Channel frame would detect the problem as soon as it occurs. In fact, in most cases, it will be able to forewarn the storage administrator of an impending issue before it interrupts critical applications or processes.
Summary
Monitoring software that comes with the SAN switch is typically priced to be cost-effective for basic switch diagnosis and troubleshooting. A third-party tool, especially one that’s hardware-based, can provide a broader and richer set of analytics about the entire storage infrastructure, in real time. Both have a role to play in making sure the SAN infrastructure is operating at peak levels.
As we have discussed, a third-party hardware-based tool has the performance to maintain real-time analytical processing. As we will cover in future articles, this provides the foundation that allows the storage manager to improve performance, improve availability, and make better use of the existing infrastructure than would be possible with component-specific tools like those provided by SAN switch vendors.
Virtual Instruments is a client of Storage Switzerland
Thursday, December 8, 2011
George Crump, Senior Analyst