Maximizing Your Storage Cost Cutting Efforts
The problem with the ‘archive first and ask questions later’ approach is that no analysis is done to determine what data is where and what data can be relocated elsewhere in the system. Tools exist today from companies like Tek-Tools that provide independent third-party analysis of the storage environment and how applications interact with that environment. In short, to get the maximum benefit when driving out costs, you first need to drive in efficiency through a proper analysis of the environment.
Inventory of Data Set
The first step in maximizing your storage cost cutting efforts is to understand what those storage assets are and what they contain. This includes not only a simple file view of data that has not been accessed and can be moved to secondary storage but also an analysis of what that data is and what type of optimization should be applied to it during movement.
For example, is that data an ideal candidate for some form of compression or deduplication? If so, which should be used? Some solutions focus on optimizing standard files like VMware images, while others specialize in optimizing JPEG and PDF files. An analysis tool will indicate what type of data is in the archival candidate set. Most data reduction technologies cost about the same, so the ROI that an analysis tool brings to this step lies in selecting the technique that will provide the maximum reduction for your investment.
A third and critical parameter is how fast data will be added to the archive and how fast new data is coming in. Again, with a proper analysis tool those factors can be examined and even trended for future growth. The ROI here is knowing how large an archive to invest in upfront and being able to budget properly for not only archive but also primary storage investments in the future.
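To make the idea of an archival candidate set concrete, here is a minimal Python sketch of the kind of inventory an analysis tool performs. It is not how Tek-Tools or any other product works. The extension-to-category table and the 180-day idle cutoff are assumptions chosen for illustration, and last-access times are only meaningful on file systems that actually record them.

```python
import os
import time
from collections import defaultdict

# Hypothetical mapping from file extension to a data category.
# These categories are illustrative assumptions, not vendor guidance.
CATEGORY_BY_EXT = {
    ".jpg": "already-compressed media",
    ".pdf": "already-compressed media",
    ".vmdk": "virtual machine image",
    ".txt": "general compressible",
    ".log": "general compressible",
}


def archive_candidates(root, min_idle_days=180, now=None):
    """Sum the bytes of files not accessed for min_idle_days, per category."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_atime < cutoff:
                ext = os.path.splitext(name)[1].lower()
                totals[CATEGORY_BY_EXT.get(ext, "other")] += info.st_size
    return dict(totals)
```

Calling archive_candidates("/some/share") returns a dictionary of bytes per category, which gives a rough sense of which reduction technique would cover the most data.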
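Trending growth can be as simple as fitting a straight line through past measurements. The sketch below assumes monthly size samples and linear growth, which is a simplification; real environments may grow in steps or exponentially. The function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def projected_size(months, sizes_tb, months_ahead):
    """Extrapolate the fitted line months_ahead past the last sample."""
    slope, intercept = linear_fit(months, sizes_tb)
    return slope * (months[-1] + months_ahead) + intercept
```

With samples of 10, 12, 14 and 16 TB over four consecutive months, the fitted growth is 2 TB per month, and the projection a year past the last sample is 40 TB.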
Mapping the Remains
Cutting storage costs can be as simple as eliminating shelves of unused capacity. The challenge is that in most data centers this capacity is not represented by racks of storage shelves in an array that have no data on them. More likely it is available capacity scattered throughout the environment, none of it in a contiguous block and certainly not on one contiguous shelf.
The next step in maximizing storage cost cutting efforts is to understand the data sets that are left on primary storage. That 20% may be small, but it is active, likely critical and, as stated earlier, scattered all over the available disk shelves. Ideally the goal is to consolidate this data onto a smaller number of storage platforms and disk shelves, being careful to balance cost savings against data vulnerability. A data analysis tool like Tek-Tools will provide a complete inventory of where the remaining data set actually is. From there the storage administrator can use the tools often provided by the array vendor to consolidate LUNs and reduce the number of shelves required. Eventually this results in turning off shelves and possibly some storage controllers, leading to dramatic hard dollar savings in power consumption and soft dollar savings from reduced management overhead.
Consolidating this data requires knowing not only that there is excess capacity but also where to place the actual data for maximum utilization.
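Deciding where to place the remaining data is, at its core, a bin-packing problem. The sketch below uses the well-known first-fit decreasing heuristic as a stand-in; it is not the method used by any array vendor's tools. The LUN names, the shelf capacity and the 80% fill ceiling (a crude way to leave a safety margin) are all assumptions.

```python
def consolidate(lun_sizes_tb, shelf_capacity_tb, max_fill=0.8):
    """Pack LUNs onto shelves with first-fit decreasing.

    Returns a list of shelves, each a list of LUN names. max_fill caps how
    full a shelf may get; the 0.8 default is an assumed safety margin.
    """
    usable = shelf_capacity_tb * max_fill
    shelves = []  # each entry tracks remaining free space and assigned LUNs
    by_size = sorted(lun_sizes_tb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        if size > usable:
            raise ValueError(f"LUN {name} does not fit on a single shelf")
        for shelf in shelves:
            if shelf["free"] >= size:
                shelf["luns"].append(name)
                shelf["free"] -= size
                break
        else:
            # No existing shelf has room, so bring another shelf into use.
            shelves.append({"free": usable - size, "luns": [name]})
    return [shelf["luns"] for shelf in shelves]
```

For example, consolidate({"a": 5, "b": 3, "c": 3, "d": 2, "e": 2, "f": 1}, 10) places six LUNs on two shelves: [["a", "b"], ["c", "d", "e", "f"]]. First-fit decreasing is a heuristic and does not always find the minimum number of shelves, and capacity is only one constraint; performance matters too.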
Inventory of Performance
During the process of LUN consolidation it is also valuable to understand what performance the storage systems can deliver and what performance the applications need. This represents another ROI point for analysis: making sure that applications are on performance-correct, and thereby cost-correct, storage. An application that can’t or does not need to drive data at the full speed of the storage architecture will gain nothing from being on that storage architecture.
Most storage arrays provide similar if not identical data protection features, like snapshots and replication, across all supported performance tiers. The remaining differentiators are the performance those tiers provide and the cost to acquire and power them. For example, faster drives that are smaller in capacity require more physical units, often consume more power on a per-unit basis and as a result require more shelves for the same capacity than larger-capacity, slower tiers. Shelves, again, cost money to acquire and to power.
A tool that can analyze both the performance aspects of the storage and the performance requirements of the application will allow the storage administrator to correctly match workloads with performance capabilities and optimize the cost reduction.
This step goes hand in hand with the consolidation of LUNs, because relocating data onto fewer storage shelves requires understanding not only the performance capabilities of the storage that is receiving the relocated data but also the impact on that performance when the new workload is moved.
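The arithmetic behind that comparison is easy to sketch in Python. The drive sizes and the 14-drive shelf below are hypothetical numbers picked for illustration, and the calculation ignores RAID overhead and hot spares.

```python
import math


def shelves_needed(capacity_gb, drive_gb, drives_per_shelf):
    """Return (drives, shelves) required to reach capacity_gb of raw space."""
    drives = math.ceil(capacity_gb / drive_gb)
    shelves = math.ceil(drives / drives_per_shelf)
    return drives, shelves


# Assumed figures: 20 TB of data, 14 drives per shelf.
fast_tier = shelves_needed(20000, 300, 14)    # small, fast drives
slow_tier = shelves_needed(20000, 1000, 14)   # large, slower drives
```

Under these assumptions the fast tier needs 67 drives across 5 shelves, while the slower tier needs 20 drives across 2 shelves for the same raw capacity.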
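A minimal sketch of that check in Python follows. It uses peak IOPS as the only performance measure, which is a simplification, and the tier names, figures and 20% headroom are assumptions rather than recommendations.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    max_iops: int       # what the tier can deliver
    current_iops: int   # peak load already placed on it


def can_accept(tier, workload_peak_iops, headroom=0.2):
    """True if the tier absorbs the workload and still keeps headroom spare."""
    budget = tier.max_iops * (1 - headroom)
    return tier.current_iops + workload_peak_iops <= budget


def cheapest_fit(tiers_by_cost, workload_peak_iops, headroom=0.2):
    """Return the name of the first (cheapest) tier that can take the workload."""
    for tier in tiers_by_cost:
        if can_accept(tier, workload_peak_iops, headroom):
            return tier.name
    return None  # nothing fits without eating into the safety margin
```

Given a list of tiers ordered from cheapest to most expensive, cheapest_fit returns the least costly home for a workload that still leaves room for spikes, or None when the move should not be made.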
Maximum Resource Utilization Requires Constant Monitoring
Manual analysis of just the inventory components of this process could overwhelm the IT staff, especially one whose staffing levels have more than likely been stretched thinner than ever. Automating this process is a key requirement.
Once the commitment is made to use every IT resource to its safe maximum, the environment needs to be actively monitored and trended. Manually inventorying and analyzing the above parameters would take weeks if not months, and the time and resources to do so are unlikely to be available. Even if they were, in a data center where all the components are being used to their full potential without excess, a single spike in utilization needs to be quickly identified, alerted on or even predicted so that adjustments to the environment can be made.
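The difference between alerting on a spike and predicting one can be shown with a small Python sketch. It assumes utilization samples taken at equal intervals and extrapolates the average rate of change, which is far cruder than what a real monitoring product would do. The 90% threshold and three-interval horizon are assumed values.

```python
def check_utilization(samples, threshold=0.9, horizon=3):
    """Classify a utilization history as "alert", "warning" or "ok".

    samples are fractions between 0 and 1, oldest first, equally spaced.
    "alert" means the threshold is already crossed; "warning" means the
    average trend would cross it within `horizon` further intervals.
    """
    latest = samples[-1]
    if latest >= threshold:
        return "alert"
    if len(samples) >= 2:
        # Average change per interval across the whole window.
        rate = (samples[-1] - samples[0]) / (len(samples) - 1)
        if rate > 0 and latest + rate * horizon >= threshold:
            return "warning"
    return "ok"
```

A history of 70%, 75% and 80% is still under the threshold, but the trend reaches it within three intervals, so the function returns "warning" and the administrator gains time to react.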
Don’t Forget about Server Virtualization
Of course, as part of any cost savings effort, the subject of server virtualization, or of expanding the server virtualization effort, comes up. The tools used to analyze and provide real-time monitoring of the environment as described above can also provide real-world analysis of servers that are ideal candidates for virtualization. For example, Tek-Tools uses real-world statistics and then maps those into a simulator so that the performance impact on both the virtual host and the newly virtualized application can be measured. For more details on this, see our article on Maximizing your Virtualization Project.
In the current economic situation, excess capacity, performance, compute power or even archive areas are all expenses that cannot be tolerated. Most likely, even as the economy turns around, data centers will not be allowed to return to wasting budget dollars on excess. The optimized IT environment is a permanent condition. Real-time analysis tools not only make that an achievable goal, they do so while increasing the overall efficiency of the IT staff, returning to them the most valuable asset of all: time. Time to do their job and time to go home on time.
Wednesday, January 21, 2009
In 2009 an obvious theme is cutting costs. In IT, and in storage in particular, that cost cutting will be popularized by data archiving and data reduction vendors. Both are very legitimate steps to take.
In the typical data center a good rule of thumb is that 80% of the data on primary storage can be migrated to an archive platform. The user buys the archive solution of their choice, then migrates all this data, but where is the ROI? Clearly there are now acres of free disk space on the primary storage platforms, and some ROI can be derived if this process delays the purchase of additional storage, but the storage administrator cannot start turning off empty drives. There aren’t any. The way most storage arrays work, the remaining data is still scattered across all the drives in each LUN.