Server virtualization tends to win early adoption and acceptance because, in the initial phases, the challenges of storage management and data protection are far less visible than they become later, when the environment begins to scale and more applications depend on the virtual infrastructure. As the environment scales, more hosts are added, virtual machine density per host increases, the overall VM population grows and the storage environment becomes more complex. The issues around managing the shared storage for this environment force many IT managers to slow down the server virtualization rollout and take another look at their storage infrastructures.



The Complexity of ‘Shared Everything’


The first problem most IT managers encounter is that sharing storage makes any environment more complex, yet server virtualization needs shared storage to enable key capabilities. As a result, most server virtualization infrastructures move to shared storage, if not immediately then soon after production rollout. Most advanced capabilities offered by today’s virtualization software, like server migration, resource management and automated disaster recovery, require a shared storage infrastructure.


Shared storage is of course nothing new; storage area networking (SAN) has been around for over a decade and Network Attached Storage (NAS) even longer. The complexity comes from the way a server virtualization environment uses its shared infrastructure. In a traditional SAN, where each connecting server has only one application installed, servers are not required to share the volumes assigned to them with any other server. In fact, that’s the preferred configuration. A server that is going to act as a virtualization host, on the other hand, will have multiple applications (usually in multiple VMs) on it and be required to share access to its volumes with other physical hosts within the infrastructure. In a fibre channel SAN environment this is managed by a clustered file system, which in the VMware case is VMFS. While VMware has made working with a clustered SAN file system a task ‘attainable by mere mortals’, it does require that the fairly sophisticated SAN infrastructure be properly set up.


Initially, fibre channel SANs were the only supported method to build the shared storage infrastructure that VMware requires. With release 3.0, VMware added support for iSCSI and, potentially more interesting, NAS via an NFS share to hold VM images. iSCSI drives out some costs compared to fibre channel and puts block storage onto IP, an infrastructure that many more IT administrators are familiar with. However, it also introduces much of the same shared storage infrastructure complexity when it comes to supporting a clustered file system.


Like iSCSI, NAS via NFS drives out much of the expense associated with creating a shared storage environment and may potentially be even more cost effective. With iSCSI, improving performance and stability may require a move to specialized iSCSI cards in the virtualization hosts. NAS may, at most, require a dedicated but standard Ethernet NIC to enhance performance and to provide boot-from-SAN functionality. The specialized iSCSI card may also cost 25% more than a standard Ethernet NIC.


While NAS and iSCSI both maintain the simplicity and ‘comfort level’ many get from the more familiar IP environment, NAS extends this advantage even further. Most IT administrators have significantly more experience setting up a shared NAS or file server environment than they do setting up a shared, clustered, block-storage environment. In the NAS implementation, VMware images are shared between server hosts just like any other files would be shared between those hosts. A virtual machine is simply all the components of a traditional server instance encapsulated into those files. Essentially, the shared storage in this virtualized environment is a file server, something that NAS has provided for decades.



The Complexity of the I/O Blender


In legacy, single-application / single-server environments, one application communicates through its server for all storage I/O requests. Efforts to measure, monitor and maintain performance can all focus on that single application. If more performance is needed, one simply adds faster or more NICs or faster storage to that server. If the current storage is not fast enough, adding storage to just that server is often acceptable. Rarely does an application overreach the capabilities of that particular server. In a virtual environment, each host can have dozens of VMs running unique applications, and each of these VMs can make storage I/O requests at any time. It’s an inherently random environment, hence the term I/O blender.
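
The blender effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: the VM count, region spacing, request counts and the random interleaving are all hypothetical, and the function names are made up for illustration. Each simulated VM reads its own region of the disk in order, yet once the host interleaves the requests, the storage system sees very few back-to-back sequential blocks.

```python
import random

def vm_stream(base, length):
    """One VM reading its own region sequentially: base, base+1, ..."""
    return [base + i for i in range(length)]

def blend(streams, seed=0):
    """Interleave per-VM streams at random, preserving each VM's own order."""
    rng = random.Random(seed)
    queues = [list(s) for s in streams]
    merged = []
    while any(queues):
        # Pick any VM that still has requests pending (hypothetical scheduling).
        q = rng.choice([q for q in queues if q])
        merged.append(q.pop(0))
    return merged

def sequential_fraction(addresses):
    """Share of requests that directly follow the previous block address."""
    if len(addresses) < 2:
        return 1.0
    hits = sum(1 for a, b in zip(addresses, addresses[1:]) if b == a + 1)
    return hits / (len(addresses) - 1)

# One application on one server: the storage sees a purely sequential stream.
single = vm_stream(0, 1000)

# Twenty VMs on one host, each sequential within its own (assumed) region.
blended = blend([vm_stream(vm * 1_000_000, 1000) for vm in range(20)])

print("single server :", sequential_fraction(single))
print("20 VMs blended:", round(sequential_fraction(blended), 3))
```

In this toy model the single-server stream is fully sequential, while the blended stream is mostly made up of jumps between VM regions, which is the random pattern the storage system has to absorb.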


While both fibre channel and NAS environments have plenty of fine-tuning capabilities to help make sure that individual VMs get the performance they require, using them means constant monitoring and adjustment. In very large virtual infrastructures this customization of performance at the VM level may be an eventuality, but it should ideally be put off as long as possible and done as infrequently as possible. The simplest way to accomplish this is a fast storage system that can handle the inbound performance requests without special customization. The challenge is that as virtual infrastructures grow, they demand more and more capacity. In traditional dual-controller storage systems, each addition of capacity puts more performance drain on those controllers. Gradually, performance decays to the point that either a larger system is needed or an additional system needs to be purchased and somehow integrated into the storage environment. In both situations complexity and costs increase. This eventually leads to multiple stand-alone SANs or NAS controller heads, all having to be individually managed yet set up to be shared between the hosts. It makes something that is already complex even more so.


The simplest way to address this is with a storage system that scales linearly as capacity is added to the system. Implementing a virtualized environment may present an ideal time to adopt clustered storage. As capacity is added to a scale-out storage system, additional processing power and storage I/O come along with it, meaning performance improves as the environment scales. Potentially more important, since there continues to be a single storage system with a single file system, complexity does not increase. Managing one large storage system is always easier than managing five or six smaller ones. This is especially true in the ‘shared everything’ world of server virtualization.
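
The difference between the two growth paths can be put into rough numbers. The following is a back-of-the-envelope model, not a benchmark: every figure (IOPS per controller pair, IOPS per node, terabytes per shelf or node) is a hypothetical placeholder, and the dual-controller case is simplified to a fixed performance ceiling shared across a growing capacity.

```python
def dual_controller(shelves, controller_iops=50_000, tb_per_shelf=10):
    """Capacity grows with each shelf; the controller pair's IOPS stay fixed."""
    capacity_tb = shelves * tb_per_shelf
    return capacity_tb, controller_iops, controller_iops / capacity_tb

def scale_out(nodes, iops_per_node=12_500, tb_per_node=10):
    """Each node adds both capacity and I/O processing to the cluster."""
    capacity_tb = nodes * tb_per_node
    total_iops = nodes * iops_per_node
    return capacity_tb, total_iops, total_iops / capacity_tb

print("units  design            TB   total IOPS   IOPS per TB")
for units in (4, 8, 16):
    for name, model in (("dual-controller", dual_controller), ("scale-out", scale_out)):
        tb, iops, per_tb = model(units)
        print(f"{units:<6} {name:<16} {tb:>4} {iops:>12} {per_tb:>13.1f}")
```

Under these assumed figures both designs start level at 40 TB, but the dual-controller system has half the IOPS per terabyte each time its capacity doubles, while the scale-out system holds its ratio constant as aggregate IOPS grow with node count.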


The complexity of VMware storage can be reduced. Doing so requires an understanding of the dynamic nature of the virtual environment and the reality of how it shares everything. Storage systems that simplify either of these challenges are a good way to begin addressing the problem, but the ideal solution is to address them both in one storage system. A top consideration should be scale-out storage solutions like those offered by Isilon, which bring the simplicity of NAS to a scale-out clustered architecture.

George Crump, Senior Analyst

Isilon Systems is a client of Storage Switzerland