Why Quality of Service is Even More Important in a Virtual Environment

 

Tim Anderson


Quality of Service in the Virtual World

As companies move more and more business-critical applications into a virtual world, one area that tends to be overlooked is the storage I/O stack. Many system administrators and architects focus mostly on the CPU, memory, and network portions of the virtual infrastructure. These areas definitely need attention, but as business-critical applications move into the virtual world, the I/O stack deserves just as much thought. These business-critical applications will almost certainly require a much more stringent Service Level Agreement (SLA).


In a typical virtual environment there are tools built to ensure that the CPU, memory, and network components stay within the parameters a system administrator prescribes, or at a minimum within the default settings that come with the virtual infrastructure. DRS from VMware is a perfect example. Tools such as these provide a layer of QoS (defined here as a way to reserve/allocate compute resources) for a given virtual environment. DRS gives the sys-admin the ability to reserve/allocate a given pool of resources (memory, CPU, network), ensuring that when the ESX server begins to become saturated, the VMs with a higher DRS priority receive the lion’s share of the resources allotted to them.


The missing aspect of QoS in a virtual environment is down in the I/O stack, where the more virtual machines (VMs) are created, the more abstracted those VMs are from the actual Host Bus Adapters (HBAs). As time passes and more VMs are placed on a given physical server, performance contention on the storage layer becomes more likely, with little to no way to detect which VMs are consuming all the I/O. In the past a system administrator would then develop workarounds, migrating VMs to other physical machines in an attempt to alleviate the problem.


For virtual environments to achieve greater scale and increased ROI, a more elegant and manageable solution is required. The HBAs deployed into a virtual infrastructure need to be more intelligent and give system administrators the ability to provide VM-level storage QoS. Giving a VM a virtual (HBA) world-wide-name and a separate virtual channel to process and monitor all I/O requests ensures that both the physical and the virtual environment have visibility into the I/O layer. It also provides the integration needed to meet the SLA of a critical application ahead of another application that may be development or test. Companies like Brocade are providing this functionality today in their 4/8 Gbps Host Bus Adapters.


In a typical Fibre Channel environment there is a possibility of head-of-line blocking. VMs reading and writing data to slower disk subsystems could potentially bottleneck the VMs using higher-performing disk subsystems, especially if those VMs share a common transport path. With the implementation of QoS and its isolation capabilities, the VMs can be tied to different QoS levels to ensure they don’t impact each other’s I/O streams.
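
Head-of-line blocking can be illustrated with a toy model. The sketch below is not a Fibre Channel simulator: the VM names and per-frame service times are hypothetical, and the two functions are invented for illustration. It compares when a fast VM finishes if its frames share one FIFO path with a slow VM, versus when each VM has its own isolated queue.

```python
from collections import defaultdict


def completion_times_shared(frames):
    """One FIFO path: each frame waits for every frame queued ahead of it."""
    clock = 0.0
    finished = {}
    for vm, service_time in frames:
        clock += service_time
        finished[vm] = clock  # time at which this VM's latest frame completed
    return finished


def completion_times_isolated(frames):
    """One queue per VM: a frame waits only for that VM's own earlier frames."""
    clocks = defaultdict(float)
    for vm, service_time in frames:
        clocks[vm] += service_time
    return dict(clocks)


# Hypothetical workload: frames alternate between a slow and a fast target.
frames = [("vm_slow", 10.0), ("vm_fast", 1.0)] * 3

shared = completion_times_shared(frames)
isolated = completion_times_isolated(frames)
print(shared["vm_fast"], isolated["vm_fast"])  # prints: 33.0 3.0
```

In this simplified model the fast VM needs only 3 time units of service, yet on the shared path it finishes at 33 because it queues behind the slow VM's frames.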

To reach these levels of integration, an HBA and SAN fabric should provide the following QoS features and associated benefits:


Hardware Integration (in the ASIC)


The process should be done in hardware to ensure that the storage layer QoS isn’t competing for any of the valuable resources on the physical server hosting the VMs. This ensures a much quicker and more adequate response to any QoS condition that may arise.

Enabling the QoS process inside the ASIC ensures that all of the QoS functionality runs at line speed and that a customer is not wasting any resources on the physical server itself. Overall, this allows the server resources to be used for what they were intended: running the applications. QoS processes that run in software typically need a server agent of some sort, which in itself causes a management headache, especially when many servers need this functionality.


Virtual Channel Buffer Credit Allocation


Providing the functionality inside the virtual channels via the dedicated allocation of buffer-to-buffer credits enables segregation between high, medium, and low levels of service. This isolation of frames across the various levels of service ensures that a given service level can maintain its overall performance without impacting the other virtual channels, allowing the SLA to be not only met but in most cases exceeded.
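
The idea of dedicated credits per virtual channel can be sketched in a few lines. This is a simplified model under stated assumptions, not any vendor's implementation: the `VirtualChannel` class, the 8/5/3 credit split, and the channel names are all hypothetical. Each frame sent consumes one credit, and a credit returns only when the receiver signals that it has freed a buffer.

```python
class VirtualChannel:
    """Toy model of a channel with its own buffer-to-buffer credit pool."""

    def __init__(self, name, credits):
        self.name = name
        self.credits = credits  # frames that may still be sent
        self.in_flight = 0      # frames sent but not yet acknowledged

    def send_frame(self):
        """Consume one credit; refuse to send when none remain."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.in_flight += 1
        return True

    def receiver_ready(self):
        """The receiver freed a buffer, so one credit comes back."""
        if self.in_flight > 0:
            self.in_flight -= 1
            self.credits += 1


# Hypothetical split of 16 credits across three service levels.
channels = {
    "high": VirtualChannel("high", 8),
    "medium": VirtualChannel("medium", 5),
    "low": VirtualChannel("low", 3),
}

# The low-priority channel tries to send 10 frames but stalls after 3.
low_sent = sum(channels["low"].send_frame() for _ in range(10))

# The high-priority channel is unaffected: it still has its own credits.
high_can_send = channels["high"].send_frame()
print(low_sent, high_can_send)  # prints: 3 True
```

Because each level draws from its own pool, a stalled low-priority channel cannot exhaust the credits that the high-priority channel depends on.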


Traffic Isolation


Monitoring the traffic patterns of every virtual HBA on every virtual machine ensures that when it is necessary to move VMs from one physical server to another, the amount of cross-talk on the storage layer is minimized. This provides a more successful transition from one machine to another and limits unnecessary downtime. Another benefit of traffic isolation is that it enables mixed-workload servers: critical and non-critical VMs can run on the same physical machine without impacting one another’s I/O capabilities.


Simple and Centrally Managed


For any QoS measure to be effective, it should be straightforward and easy to deal with. Once QoS has been put into place, a default setup should be deployed across all the virtual HBAs in the SAN. From this point forward a customer can determine which applications/VMs require more or less performance and assign priorities accordingly, all from a central management point. Central management also makes it possible to deploy firmware upgrades from a central repository instead of moving from one server to another to apply patches and upgrades.


Increased Server Consolidation


With QoS deployed across a virtual landscape, a customer can drive more physical-to-virtual server conversion into their environment. This is mainly a result of the features listed above: being able to more accurately track all aspects of the virtual environment gives the customer a much better ability to deliver the SLA for any given virtual machine.


Increased Security


Security and isolation are other key benefits of deploying these next-generation HBAs. Since they have virtual world-wide-names (WWNs), everything that a normal physical machine can take advantage of, a virtual machine can too: zoning and fabric isolation, as well as role-based security for user authentication. Securing the fabric through management access controls, device connections, and secure initiator-to-target communications are some of the extra benefits that can be realized with these features.


Significant Increase in ROI Opportunity


Increased budget efficiency is another key benefit of deploying QoS in a virtual environment, because it can reduce the number of HBAs in a customer’s environment. Where today some may have four to six 4 Gbps HBAs, these could be replaced with just two 8 Gbps HBAs. Doing so in turn reduces the number of Fibre Channel ports needed on the SAN switches, as well as the number of cables, SFPs, and other physical layer infrastructure components. These actions can lead to decreased power and cooling requirements.
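
The consolidation arithmetic is easy to check. The sketch below assumes single-port HBAs (one switch port, cable, and SFP per HBA) and uses nominal link speeds only; the helper name `hba_footprint` is made up for this example.

```python
def hba_footprint(count, gbps_each):
    """Return (switch ports used, aggregate nominal bandwidth in Gbps)."""
    return count, count * gbps_each


before_small = hba_footprint(4, 4)  # four 4 Gbps HBAs
before_large = hba_footprint(6, 4)  # six 4 Gbps HBAs
after = hba_footprint(2, 8)         # two 8 Gbps HBAs

for label, (ports, gbps) in [
    ("4 x 4 Gbps", before_small),
    ("6 x 4 Gbps", before_large),
    ("2 x 8 Gbps", after),
]:
    print(f"{label}: {ports} ports, {gbps} Gbps nominal")

# Switch ports freed per server in the two starting configurations.
ports_saved = (before_small[0] - after[0], before_large[0] - after[0])
print(ports_saved)  # prints: (2, 4)
```

Two 8 Gbps HBAs match the 16 Gbps nominal bandwidth of four 4 Gbps HBAs while using half the ports. Against six 4 Gbps HBAs (24 Gbps nominal) the aggregate is lower, so that consolidation only makes sense when the existing links are not fully utilized.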


Another notable benefit of QoS is the ability for a customer to leverage even more of their virtual infrastructure. Applying these QoS practices to the storage layer enables not only the continued virtualization of non-critical applications, but also the increased virtualization of critical applications that might not otherwise be added to the virtual environment without this QoS ability. Taking advantage of this functionality enables companies to continue to deepen their server consolidation efforts and cost savings.


Adding these savings up over a number of virtual infrastructures deployed at a customer location, it’s simple to see how the cost and management savings will have a significant positive impact on the customer’s environment.

As these QoS-enabled HBAs begin to proliferate into the market, it’s very important to keep in mind that most of the manufacturers developing these technologies need a switch or switches between the initiator and the target that support these advanced QoS features. This ensures that a customer can take full advantage of the QoS feature set through the fabric, providing an end-to-end solution.


Assume for a moment a customer is happily running their existing VMware environment with normal HBAs that have no QoS ability. When a problem occurs in the storage layer, such as one VM taking most of the storage performance away from the others on that server, it would be extremely difficult for a sys-admin to determine which VM is actually the troublemaker. In this case system administrators will typically start shuffling the VMs around from server to server to alleviate the problem. If this same customer had QoS-enabled HBAs, they would be able to determine very quickly and proactively (via configured notifications and/or thresholds) which VM was causing the performance drain and take appropriate action.
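
A threshold check of the kind described here might look like the following sketch. It assumes that per-VM I/O counters are available (for example, keyed by each VM's virtual WWN); the function name, the 50% threshold, and the sample numbers are all hypothetical and do not reflect any particular vendor's tooling.

```python
def flag_heavy_vms(iops_by_vm, share_threshold=0.5):
    """Return VMs whose share of total I/O exceeds the threshold, heaviest first."""
    total = sum(iops_by_vm.values())
    if total == 0:
        return []
    flagged = [
        (vm, iops / total)
        for vm, iops in iops_by_vm.items()
        if iops / total > share_threshold
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical per-VM counters for one physical server.
sample = {"vm-db": 1200, "vm-web": 300, "vm-test": 6500, "vm-mail": 2000}

for vm, share in flag_heavy_vms(sample):
    print(f"{vm} is using {share:.0%} of the I/O on this server")
# prints: vm-test is using 65% of the I/O on this server
```

With per-VM visibility, the administrator can act on the one VM that crossed the threshold instead of shuffling VMs between servers by trial and error.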


As companies and technologies continue to push the envelope of the virtual world, it’s clear that all the basic infrastructure components need to continue to advance to meet these challenges and open up additional opportunities for cost savings. Capabilities like QoS in the storage layers of a virtual machine are an excellent step in the right direction toward an open virtualized world that any application can take advantage of, regardless of its performance characteristics or uptime requirements.

Tuesday, March 3, 2009

 
 