Managing VM Sprawl With Disk Archive
Virtual machine (VM) sprawl is a top challenge for many data centers that are starting to see their virtualization projects bear fruit. Virtualization really does work, and administrators can now deploy VMs in a matter of minutes. VM sprawl is a byproduct of that success. How can it be controlled? Managing VM sprawl with a disk archive is a logical first step in curtailing out-of-control VM growth.
VM sprawl happens innocently enough. An application owner goes to the VM administrator and asks for a new server for 30 days of testing. The VM administrator, eager to show the flexibility of the infrastructure, quickly clicks through the template to create the virtual machine. Thirty days come and go. The application owner is done with the system but never notifies the VM administrator to delete it, or would rather not have it deleted "just in case" it is needed again. The VM administrator, busy with other tasks, has forgotten about the temporary VM. Meanwhile, the storage administrator, who has not been part of any of these conversations, can't figure out why the VM administrator keeps going through disk space so fast.
Wednesday, October 28, 2009
The result of all the speed, flexibility and lack of communication is that, even in medium-sized data centers, hundreds of virtual machines sit idle while chewing up compute and disk resources. An idle virtual machine is not really idle: it still consumes attention from the virtualization software's hypervisor, a time slice of the CPU, storage I/O and network I/O.
It also consumes the physical disk capacity that the creation template allocated to it. That includes the actual data the VM uses plus the free space that has been hard allocated to it. The result can be tens of terabytes of wasted tier-one, top-dollar primary storage.
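To make the capacity point concrete, here is a minimal Python sketch of how the waste might be tallied. The `VmDisk` record and its field names are hypothetical, not taken from any hypervisor API; in practice the figures would come from whatever inventory report the virtualization platform provides.

```python
from dataclasses import dataclass


@dataclass
class VmDisk:
    """Hypothetical inventory record for one VM's storage footprint."""
    name: str
    allocated_gb: float  # capacity hard allocated by the creation template
    used_gb: float       # data the VM has actually written
    idle: bool           # flagged as inactive by the administrator


def reclaimable_gb(disks):
    """Primary capacity returned if every idle VM is moved to the archive."""
    return sum(d.allocated_gb for d in disks if d.idle)


def allocated_but_unused_gb(disks):
    """Free space inside idle VMs that is reserved but holds no data."""
    return sum(d.allocated_gb - d.used_gb for d in disks if d.idle)
```

An idle VM ties up its full allocation on primary storage, which is why the first function counts allocated rather than used capacity.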
When physical servers were the order of the day, there was not much that could be done to manage the sprawl, other than try to make the servers smaller, 1U or blades. There was a hesitancy to repurpose the server for another task, because if the application owner needed that server it would have to be reloaded with the operating system, application and data. There was also the reality that the server had been charged back to the application owner in its entirety and it was, in effect, not the admin’s resource. It was, however, the admin’s problem to store, cool and protect it. But they had little influence over redeployment.
With virtual machines the scenario changes, since a virtual machine is a server encapsulated in a disk file. If that server image can be moved to a storage platform that is less expensive, more secure and more scalable than primary storage, then moving it is worth considering. A disk-based archive, like those offered by Permabit Technology, does exactly that. It allows the server images that virtual infrastructures create to be moved in and out of the virtual infrastructure as needed.
With an archive storage foundation in place, the VM administrator can look for VMs with limited activity and then check those VMs for a lack of login activity. Once candidates are found, the VM admin can migrate them to the disk-based archive using utilities like vRanger from Vizioncore. Additionally, if automated provisioning tools like Vizioncore's vControl are used, the VM can be tagged with an expiration date, making the scan for archive candidates even easier. Since the disk archive provides rapid access to those VMs, they can also be migrated or restored back into the environment as needed.
This makes the test VM example discussed earlier significantly easier to manage. The VM could be provisioned and tagged with a check-in or expiration date. Then, when that date approaches, emails could be sent to the VM's owners, automatically informing them that the VM is about to be archived. The VM admin could then move the machine to the disk archive platform, while keeping it readily available for rapid restore when the need for testing occurs again.
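The selection logic described above can be sketched in a few lines of Python. This is an illustration only: the `VmRecord` fields, the 30-day login window, the 2 percent CPU threshold and the 7-day warning period are all assumptions, and the sketch does not use the vRanger or vControl interfaces.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class VmRecord:
    """Hypothetical inventory entry; field names are illustrative."""
    name: str
    owner_email: str
    expires_on: Optional[date]   # expiration tag set at provisioning, if any
    last_login: Optional[date]   # most recent login seen, if any
    avg_cpu_percent: float       # recent average CPU utilization


def is_archive_candidate(vm, today, idle_days=30, cpu_threshold=2.0):
    """A VM qualifies if its expiration date has passed, or if it shows
    both low activity and no recent logins."""
    if vm.expires_on is not None and vm.expires_on <= today:
        return True
    no_recent_login = (
        vm.last_login is None
        or today - vm.last_login >= timedelta(days=idle_days)
    )
    return no_recent_login and vm.avg_cpu_percent < cpu_threshold


def notices_due(vms, today, warn_days=7):
    """Return (owner_email, vm_name) pairs for VMs that expire within
    warn_days, so owners can be warned before the VM is archived."""
    horizon = today + timedelta(days=warn_days)
    return [
        (vm.owner_email, vm.name)
        for vm in vms
        if vm.expires_on is not None and today <= vm.expires_on <= horizon
    ]
```

Checking the expiration tag first means a tagged VM needs no activity analysis at all, which is the shortcut the provisioning-time tag is meant to provide.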
The positive impact of implementing an archive storage tier for the virtual infrastructure is significant. First, expensive primary storage that is being consumed by inactive VMs is returned to the infrastructure. This can delay the hard cost of purchasing additional storage, as well as the soft costs of implementing the additional primary storage and repositioning VM images. Additionally, a VM archive strategy may improve compute, storage and network I/O performance, since each VM, idle or not, requires attention from the hypervisor and consumes the resources under its control.
The archive tier is typically made up of higher capacity, lower cost SATA drives, making the price delta between primary storage and archive storage significant and compelling. The archive tier would also compress and deduplicate the images it receives. This is especially critical in VM archiving since there is a high level of redundant data between server images. Deduplication and compression reduce the effective cost of the archive and widen the price delta between primary and archive storage even further, making the ROI on the project higher and its achievement sooner.
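The arithmetic behind that price delta can be sketched as follows. The function names are illustrative, and any prices or reduction ratios passed in are the reader's own estimates; none of the figures below come from a vendor.

```python
def effective_cost_per_tb(raw_cost_per_tb, reduction_ratio):
    """Cost per logical TB after deduplication and compression.

    reduction_ratio is logical data divided by physical data stored,
    for example 5.0 for a 5:1 reduction.
    """
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_tb / reduction_ratio


def archive_savings(logical_tb, primary_cost_per_tb,
                    archive_cost_per_tb, reduction_ratio):
    """Difference between keeping idle VM images on primary storage
    and holding the same logical data on the archive tier."""
    primary_cost = logical_tb * primary_cost_per_tb
    archive_cost = logical_tb * effective_cost_per_tb(
        archive_cost_per_tb, reduction_ratio
    )
    return primary_cost - archive_cost
```

With purely hypothetical inputs of 20 TB of idle images, 5,000 per TB for primary, 1,000 per TB for archive and a 5:1 reduction, `archive_savings(20, 5000, 1000, 5)` works out to 96,000. Without data reduction (a ratio of 1) the same inputs give 80,000, which shows how deduplication widens the gap.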
The archive storage tier also provides a secondary backup of the VMs. In fact, using block-level incremental backup of the VMs, the archive target typically has more than enough network I/O performance to function as the primary backup target. In addition, the archive tier should have the ability to replicate data, leveraging deduplication and providing WAN optimized data movement, ideal for disaster recovery sites.
Disk archive technologies like Permabit's can scale to multiple petabytes. Combined with the storage efficiencies of deduplication and compression, this means there is little concern about scaling to meet the rapid growth of information that ends up in the archive, or in the backup process as a whole.
Since the archive tier is often presented as a standard network mount point, interaction with the system is simple and something that admins already know how to do. The software applications that provide archiving and block level backups also support network attached disk. This simplicity provides the system administrator with a wide range of options when choosing how they want to archive their environment.
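Because the archive appears as an ordinary mount point, even a plain file copy can illustrate the idea. The following Python sketch uses only the standard library and assumes the VM is powered off and represented by a single image file, which is a simplification; real VMs usually consist of several files, and a purpose-built tool would handle that. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_image(image_path, archive_dir):
    """Copy an image to the archive mount, verify it, then free primary space.

    archive_dir is assumed to be a directory on the mounted archive tier.
    The source is removed only after the checksums match.
    """
    source = Path(image_path)
    target = Path(archive_dir) / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise IOError("archive copy does not match source: " + str(source))
    source.unlink()
    return target
```

Restoring is the same copy in the opposite direction, which is what makes a network-mounted archive straightforward to script.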
VM sprawl can be easily addressed with the right storage platform and software to move data to that platform. The advantage of implementing a disk archive is that it not only solves much of the VM sprawl issue but also adds data protection capabilities that are often sorely lacking in the virtual infrastructure.
George Crump, Senior Analyst
Permabit is a client of Storage Switzerland