Managing VM Sprawl
Server virtualization has been extremely successful. It has reduced physical server counts, increased business flexibility and made DR planning simpler. But server virtualization has also brought its own set of challenges, one of which is virtual machine (VM) sprawl. VM sprawl is the ‘weed-like’ growth in VMs that, similar to ‘NT server sprawl’ a decade ago, has become a management problem for IT administrators everywhere.
Friday, February 5, 2010
VM sprawl is caused by the ease with which new servers can be deployed under server virtualization. In a virtualized server environment, thanks to templates, new VMs can be created and fully configured in a few minutes. Physical servers, by contrast, faced a real-world logistical limit on how fast they could be implemented: they had to be purchased, delivered, manually installed into racks, configured and connected to the data center networks before the implementation process was complete.
Because of this ease of deployment, virtual servers are routinely ‘stood up’ almost as soon as they’re requested, often without much thought given to how important the application is or how long the VM needs to be deployed. There are cases of VM growth rates approaching 125% per year, with the majority of those VMs being servers that never existed before the switch to virtualization.
Identifying orphaned VMs should be the first step in getting VM sprawl under control. These are server instances that had been set up for a specific purpose, but outlived their usefulness quickly and so were abandoned. For example, a request may come in for a VM to test a new version of an application. The server is only needed for about 30 days, but after the testing is done, it just sits idly - another orphaned server, with no task to perform.
What is the cost of an orphaned VM?
The truth is that orphaned VMs are not really idle. They still consume memory and CPU cycles, and the hypervisor must continually check whether they need additional resources. They also consume disk capacity, which can be substantial thanks to the practice of using templates to make setup easier. Most administrators set a “safe” file size in their VM templates to make sure there’s always enough disk capacity. It’s very likely that idle VMs are tying up terabytes of excess disk space in a typical environment. Most server virtualization environments have made the extra investment and deployed shared storage for VM flexibility, which means this wasted capacity comes at a premium price.
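As a rough illustration of how template sizing adds up, the arithmetic can be sketched in a few lines of Python. The function name and the example figures (50 orphaned VMs, a 40 GB template) are hypothetical, and the sketch assumes each orphaned VM holds its full template allocation on disk.

```python
def reclaimable_tb(orphan_count, template_disk_gb):
    """Estimate disk tied up by orphaned VMs, in decimal terabytes.

    Assumes every orphaned VM holds its full template allocation
    (no thin provisioning); both inputs are illustrative.
    """
    return orphan_count * template_disk_gb / 1000


# Hypothetical environment: 50 orphaned VMs built from a 40 GB template.
print(reclaimable_tb(50, 40))
```

Even modest template sizes reach the terabyte range once a few dozen orphaned VMs accumulate.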
Orphaned VMs also add unnecessary cost and complexity to data protection processes. The data associated with these orphaned systems is often included in a default replication strategy, which takes up disk space at a DR site. These orphaned VMs also consume backup resources, as they’re saved when full backups are executed and examined during each incremental backup to confirm that no changes have occurred since the last backup. Orphaned VMs can also affect the performance of other VMs on the same server, so it is critical that administrators keep track of all the VMs sharing and drawing on the same resources.
Compared to a physical server, the effort required to identify, turn off and archive an orphaned virtual machine is minimal. Physical machines need to be powered off, de-racked and physically stored or securely discarded. If an application suddenly needs to be brought back, the physical deployment has to occur all over again. A virtual system can be turned off and the virtual machine image can be archived to less expensive storage. Returning it to operation requires only a few clicks and the time to do a disk-to-disk transfer.
Identification
The first step is to identify these virtual machines and archive them out of the environment, or at least turn them off. Monitoring tools like Vizioncore’s vFoglight can provide data on resource utilization, template efficiency and deployment strategies. These tools can monitor from a virtual machine view, a vCenter view or a data center view, which is essential to detecting inactive virtual machines. They also allow close monitoring of specific resources, which can provide additional clues for identifying orphaned VMs. The ability to examine a VM over time is critical. Low memory and CPU utilization for one night does not justify decommissioning a VM, but sustained low utilization over a few weeks likely does.
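The "weeks, not one night" rule can be sketched as a simple filter over daily utilization samples. This is a minimal Python sketch, not how vFoglight or any other monitoring product works: the `Sample` type, the function name, the 21-day window and the 5% CPU and 10% memory thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    day: int        # day index within the observation period
    cpu_pct: float  # average CPU utilization that day, 0-100
    mem_pct: float  # average memory utilization that day, 0-100


def is_orphan_candidate(samples, min_days=21, cpu_limit=5.0, mem_limit=10.0):
    """Flag a VM only if it stayed quiet for the whole recent window.

    The window length and thresholds are hypothetical defaults.
    """
    if len(samples) < min_days:
        return False  # too little history to judge, e.g. one quiet night
    recent = sorted(samples, key=lambda s: s.day)[-min_days:]
    return all(s.cpu_pct < cpu_limit and s.mem_pct < mem_limit for s in recent)


# Hypothetical data: one VM quiet for 30 days, another with a single sample.
quiet_for_a_month = [Sample(d, 1.0, 4.0) for d in range(30)]
one_quiet_night = [Sample(0, 1.0, 4.0)]
print(is_orphan_candidate(quiet_for_a_month))
print(is_orphan_candidate(one_quiet_night))
```

Requiring every day in the window to fall below the thresholds is deliberately conservative: a single busy day, such as a quarterly job, keeps the VM off the candidate list.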
Archive
Once the orphaned VMs are identified, they can be dealt with. For VMs that will likely see resumed use, simply tag them and turn them off. This is a key advantage over physical systems. If there are applications in the environment that are only run quarterly, for example, it’s easy to turn them off and on as needed. Physical systems require physical interaction and typically are not used on an as-needed basis like this.
VMs that are deemed highly unlikely to be needed in the future can be archived to a secondary disk tier that’s lower in cost per GB and more power efficient. Using tools like Vizioncore’s vRanger, the archived VMs can be recalled with a few clicks. This frees up all the disk resources discussed earlier and stores the server in a secure state, in case there’s a need to show chain of custody in a legal action.
Control and Automation
Once the existing environment has been cleared of orphaned systems, the next step is to put procedures in place to keep VM sprawl from happening in the future. With products like Vizioncore’s vControl and the public domain scripting capabilities of VESI, the whole process can be automated. For example, more granular use of templates can be instituted. When a VM is created, the administrator can be prompted for the needed disk size to keep utilization efficient, along with an expiration date and the name of the requester. This information can be embedded in the notes section of the VM. A subsequent task could then check for expired VMs and email each requester for authorization to turn the VM off. A final task could then turn off all VMs that are both expired and confirmed. Especially in server virtualization, this kind of broad automation is critical to increasing the number of VMs that each system administrator can manage.
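The expiration check described above could look something like the following Python sketch. It is not vControl or VESI code. The `key=value; key=value` notes format, the function names and the sample inventory are assumptions made for illustration, and the inventory is a plain dictionary rather than a call to any real virtualization API.

```python
from datetime import date


def parse_notes(notes):
    """Parse 'key=value; key=value' pairs from a VM notes string.

    The notes format is a hypothetical convention, not a product standard.
    """
    fields = {}
    for part in notes.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip().lower()] = value.strip()
    return fields


def expired_vms(inventory, today):
    """Return (vm_name, requester) for each VM whose expiry date has passed."""
    result = []
    for name, notes in inventory.items():
        fields = parse_notes(notes)
        expires = fields.get("expires")
        if expires is None:
            continue  # no expiry recorded; leave for manual review
        if date.fromisoformat(expires) < today:
            result.append((name, fields.get("requester", "unknown")))
    return result


# Hypothetical inventory mapping VM name to its notes field.
inventory = {
    "test-app-01": "expires=2010-01-15; requester=alice@example.com",
    "erp-prod": "requester=bob@example.com",
    "qa-02": "expires=2010-06-01; requester=carol@example.com",
}
for vm, requester in expired_vms(inventory, date(2010, 2, 5)):
    print(f"{vm}: ask {requester} for authorization to power off")
```

The sketch only reports expired VMs and their requesters. Sending the email and powering off confirmed VMs are left as separate steps, matching the staged tasks described above, so that nothing is shut down without the requester's approval.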
Effective management of VM sprawl depends on having the right tools. Some of these capabilities exist within the server virtualization software, but need the help of automation tools to allow administrators to take full advantage of them. For exacting control, however, third-party programs that can monitor and archive these virtual machines are required. Through this combination of internal utilities and external software tools, the ‘great VM sprawl challenge’ can be managed and the ROI on server virtualization projects increased.
George Crump, Senior Analyst
This Article Sponsored by Vizioncore