Using Clones To Manage VMware Storage Growth
VMware server virtualization, through its ability to consolidate servers, increases operational efficiency and improves an organization’s ability to prepare for a disaster, delivering an unprecedented return on investment. However, these gains can be quickly eaten away by uncontrolled storage infrastructure expenses. While other concerns include performance and data protection, as discussed in the article “IOPs are More Important Than Air”, front and center for most organizations is curtailing capacity growth.
Friday, February 4, 2011
When discussing ways to minimize capacity growth, elaborate technologies like compression and deduplication come up. While these technologies are important in their own right, they operate after the storage growth has occurred and can therefore be less effective at controlling it. The first step in any storage optimization strategy should be to eliminate capacity consumption before it actually takes place, when the virtual machine is created. This is the role of cloning, a new feature being offered by storage system manufacturers.
What are Clones?
Cloning typically leverages a storage system’s snapshot technology. Snapshots are a very common feature in today’s storage systems that allow users to create a point-in-time image of a volume without actually creating a full, stand-alone copy. A snapshot sets the blocks associated with that data to read-only; then, as changes are made to the original, the changed blocks are branched off and stored separately. This allows an active image of that data to be maintained alongside the point-in-time copy. Clones allow multiple branches off of the original read-only snapshot, each of which can be altered. Hence they are also known as writable snapshots.
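The pointer-based behavior described above can be illustrated with a toy model. This is a minimal sketch, not how any particular storage system implements snapshots or clones; the `BlockStore` and `Image` classes, their methods, and the sample block contents are all hypothetical names chosen for illustration.

```python
class BlockStore:
    """Toy physical storage: a map from block id to block contents."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def put(self, data):
        """Store one new physical block and return its id."""
        block_id = self.next_id
        self.blocks[block_id] = data
        self.next_id += 1
        return block_id


class Image:
    """A logical image: an ordered list of pointers into a BlockStore."""

    def __init__(self, store, pointers):
        self.store = store
        self.pointers = list(pointers)

    @classmethod
    def create(cls, store, data_blocks):
        """Write every block once and build the initial pointer list."""
        return cls(store, [store.put(data) for data in data_blocks])

    def clone(self):
        """Writable clone: copy the pointers only, never the data."""
        return Image(self.store, self.pointers)

    def write(self, index, data):
        """Branch off on write: store the change in a new block and
        repoint this image, leaving the shared block untouched."""
        self.pointers[index] = self.store.put(data)

    def read(self, index):
        return self.store.blocks[self.pointers[index]]


store = BlockStore()
master = Image.create(store, ["windows-os", "drivers", "base-config"])

clone_a = master.clone()
clone_b = master.clone()
blocks_after_cloning = len(store.blocks)   # still only the master's blocks

clone_a.write(2, "exchange-config")        # only clone_a diverges
blocks_after_write = len(store.blocks)
```

In this model two clones add no physical blocks at all, and a write to one clone adds exactly one block while the master and the other clone continue to read the original data.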
The Value of Clones with Virtualized Servers
The value of clones in a server virtualization infrastructure is that master images of virtual machines (VMs) or vmdk (virtual machine disk) files can be created from an original source and used to create subsequent VMs. For example, a Windows VM can be configured with all the parameters that the virtual server administrator feels should be in each Windows implementation. When a new Windows VM is needed, the administrator creates a clone of that master, and only the components that are unique to that server are added to that instance. If the administrator needs to create an Exchange environment, they can add the various Exchange software components to a clone of the Windows VM master and use the result as an Exchange VM master.
The advantage of cloning is that each new instance doesn’t require that all the data be copied from the master, resulting in storage space efficiencies. The clone can share existing data with the original virtual machine. Since virtual machines often have a lot of identical core data between them, hundreds can be deployed, sharing these common data objects and saving a significant amount of storage capacity. In the example described above, where a Windows master is created and used to create an Exchange master (or a master for any other application), storage capacity savings can be even more impressive.
A second advantage of cloning is that the new VM can be created in seconds, since essentially no data is copied in the cloning process. All that happens is that a new set of pointers to the original data set is created. This also means that the size of the virtual machine master is largely irrelevant, since it takes about as long to create pointers to a 500GB master as to a 50GB master.
The result of cloning is that data growth can be limited at its source. There’s no need, initially, for deduplication or compression because no data is actually created; only pointers are established. Hundreds, if not thousands, of virtual machines could theoretically be created instantly, without having to purchase additional capacity.
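The capacity argument can be made concrete with some simple arithmetic. The sketch below is illustrative only: the function names and the sample figures (a 50GB master, 100 VMs, 2GB of unique data per VM) are assumptions chosen for the example, not measurements from any product, and the model ignores pointer metadata overhead.

```python
def full_copy_capacity_gb(master_gb, vm_count, unique_gb_per_vm):
    """Capacity if every VM gets its own full copy of the master."""
    return vm_count * (master_gb + unique_gb_per_vm)


def cloned_capacity_gb(master_gb, vm_count, unique_gb_per_vm):
    """Capacity if every VM is a clone: the master is stored once and
    each clone consumes space only for the data unique to it."""
    return master_gb + vm_count * unique_gb_per_vm


# Hypothetical deployment: 100 VMs from a 50GB master, 2GB unique per VM.
full = full_copy_capacity_gb(50, 100, 2)
cloned = cloned_capacity_gb(50, 100, 2)
saved = full - cloned

print(f"full copies: {full} GB, clones: {cloned} GB, saved: {saved} GB")
```

At the moment of creation the unique data per VM is zero, so the cloned total is simply the size of the master; consumption then grows only as each clone diverges.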
The Limitations Of Cloning
The single biggest limitation of cloning is that in almost all cases clones are volume based, meaning that the entire volume must be cloned, not just certain virtual machines in that volume. This limitation leads either to ignoring virtual machine storage best practices or to orchestrating a very complex process of aligning golden masters with the right volumes.
In the first situation, one virtual machine per volume, each master is placed on its own volume. The problem with this is that for many storage systems, and even hypervisors, there is a practical limit on how many volumes can be supported at any given time, which in turn limits the number of VMs that can be created. Additionally, each volume adds its own overhead on the system as well as on the storage administrator, who now has to manage and track potentially hundreds of volumes.
The second situation, orchestrating the right masters on the right volumes, is an attempt to work around the problems above. In this situation, master VMs are grouped together so that fewer volumes are needed. The problem is that each subsequent clone will not need all the masters, which again creates wasted overhead. But the real challenge is the burden placed on administrators trying to strike this balance.
Both approaches tie up specific disk resources to manage the cloning process, and there is always the risk of running out of space for volumes. The net impact is that the challenges of cloning and managing those clones may not be worth the effort unless an intelligent alternative is developed.
Granular, Scalable Clones
The good news is that companies like BlueArc with their new JetClone Technology are able to eliminate most of the above challenges by creating clones at the VMDK (virtual machine disk) level. This means that a single volume can be used for all VM images and only the image files used for the masters need to be cloned. This greatly simplifies the overall storage management process, while gaining all the disk capacity savings. The only limitation is on the size of the file system itself, which in the case of BlueArc is 256TB.
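The difference between volume-level and file-level granularity can be sketched with a small model. This is a hypothetical illustration, not BlueArc's implementation or API: the `clone_volume` and `clone_file` functions, the image names, and the pointer lists are all assumptions made for the example.

```python
def clone_volume(volume, wanted):
    """Volume-level clone: every image in the volume is cloned,
    whether or not the new VM needs it."""
    cloned = {name: list(pointers) for name, pointers in volume.items()}
    unneeded = [name for name in cloned if name != wanted]
    return cloned, unneeded


def clone_file(volume, wanted):
    """File-level clone: only the requested image is cloned."""
    return {wanted: list(volume[wanted])}


# One volume holding several master images (values are block pointers).
volume = {
    "windows-master": [0, 1, 2],
    "exchange-master": [0, 1, 3],
    "linux-master": [4, 5],
}

volume_clone, unneeded = clone_volume(volume, "exchange-master")
file_clone = clone_file(volume, "exchange-master")

print("volume-level clone carries:", sorted(volume_clone))
print("images the new VM did not need:", unneeded)
print("file-level clone carries:", sorted(file_clone))
```

In this model the volume-level clone drags along two masters the new VM never uses, which is the overhead described in the previous section, while the file-level clone references only the image that was asked for.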
Most cloning, and even snapshot, technologies place a limit on how many clones or snapshots can be made. The granularity at which a clone can be made also affects its scalability, that is, how many offspring and generations of the clone can be maintained without noticeably impacting performance. In server and desktop virtual environments, the number of clones that can be created per file system becomes a key consideration, as some solutions can quickly run out of available storage for clones or run into file system constraints. JetClones, when combined with BlueArc’s vaunted performance and scalable file system, enable BlueArc to claim a maximum of 16 times more clones than the leading competitor.
Cloning is an ideal way to improve the efficient use of storage capacity, as well as the ability to respond to requests for new virtual machines. However, the lack of granularity of those clones, the lack of integration with the VMware platform and the inherent performance limitations that already plague traditional NAS solutions can make their actual deployment both complex and less effective in the virtualized environment. Solutions like BlueArc JetClone with its JetCenter integration solve many of the challenges that VMware administrators face when looking to use cloning.
George Crump, Senior Analyst
BlueArc is a client of Storage Switzerland