Solving the Data Protection Puzzle

 

Data protection historically has had some conflicting directives that you, as an IT professional, have had to juggle. On the one hand, you need to capture incremental changes as frequently and speedily as possible so that if you need to restore, you will have the most recent data, suffer minimal data loss and be able to bring the application back online as rapidly as possible. On the other hand, increasing demands on data retention are requiring that not only must you keep data much longer than before, you must also be able to find and identify information within that data more rapidly.


The Over-Protection Problem

These opposing forces have naturally created the need for multiple solutions. Backups, mirrors, replication, snapshots and now Continuous Data Protection are all strategies that you can deploy to protect your data assets. Most data centers have more than one of these strategies in place – and some have them all. Yet, despite all these efforts, you still don't have absolute, complete confidence in your ability to recover data. Why? There are too many players in the game, and nobody is sure who is protecting what. In addition, all of these protection processes are typically separate from each other; many involve multiple vendors, all requiring separate training and monitoring. All this protection costs too much money, especially if you are still not 100% sure that you can recover the data when you need to.


This abundance of point solutions has stemmed from the fact that most of the traditional suppliers have expanded their data protection offerings via acquisition or OEM agreements. Some vendors promise integration into their current solution, but most never deliver on it. Therefore, users end up with a hodgepodge of incompatible, non-integrated solutions causing them to get completely entangled in their own safety nets. What's needed is for developers to step up, do the work themselves and integrate all the different parts of the data protection puzzle.


Continuous or Near-Continuous Data Protection

This year we have seen the emergence of Continuous Data Protection (CDP): the ability to capture changes to your data in real time or near-real time. This is similar to most replication products in that changes made to the source are captured on the target. Unlike replication, which typically keeps only the current state of the data, CDP retains those changes, giving you the ability to dial back in time to a specific point to recover data. Like backup, this provides you with the ability to find a version of your file that is old enough not to have been exposed to the corruption or whatever else caused the need for the recovery. Unlike backup, this ability to dial back in time is typically limited to a few days.


CDP has an advantage over backup in that it is more granular than backup -- meaning the points of protection happen more frequently. Clearly, you could run your backup process over the same data set repeatedly, but the performance impact on the servers, applications and CPU would be unacceptably large. CDP works because the impact on the server being backed up is minimized. Only changed blocks of information are transferred to the target. As a result, the amount of time it takes to move that data and the amount of data being sent are both kept to a minimum, meaning that there is minimal performance impact on the source server.
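The changed-block idea can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product works: the `CdpTarget` class, the `protect` function and the four-byte block size are all made up for the example, and it assumes the volume stays the same size and that changes are recorded in time order.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def split_blocks(volume):
    """Cut a volume (bytes) into fixed-size blocks keyed by block index."""
    return {i // BLOCK_SIZE: volume[i:i + BLOCK_SIZE]
            for i in range(0, len(volume), BLOCK_SIZE)}


@dataclass
class CdpTarget:
    """Hypothetical CDP target: a base copy plus a journal of changed blocks."""
    base: dict = field(default_factory=dict)     # block index -> bytes
    journal: list = field(default_factory=list)  # (timestamp, index, bytes)

    def record(self, timestamp, index, data):
        # Assumes calls arrive in increasing timestamp order.
        self.journal.append((timestamp, index, data))

    def restore(self, as_of):
        """Rebuild the volume as it looked at time `as_of`."""
        blocks = dict(self.base)
        for timestamp, index, data in self.journal:
            if timestamp > as_of:
                break  # everything after this point is newer than requested
            blocks[index] = data
        return b"".join(blocks[i] for i in sorted(blocks))


def protect(target, previous, current, timestamp):
    """Send only the blocks that changed; return how many were sent."""
    old, new = split_blocks(previous), split_blocks(current)
    sent = 0
    for index, data in new.items():
        if old.get(index) != data:
            target.record(timestamp, index, data)
            sent += 1
    return sent
```

With a twelve-byte volume, changing one block sends one block rather than three, and calling `restore` with a time that falls before a bad write returns the volume as it was before the corruption.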


What's the difference between CDP and Near-CDP?

The primary difference between CDP and Near-CDP is the level of granularity at which changes are captured. Most CDP solutions capture changes as they occur, essentially acting as asynchronous mirrors. This typically raises concerns about the quality of the data being sent to the target. Since the servers being protected are often live databases or email servers, you would prefer to put them in a backup mode to ensure a clean snapshot of the data. There is often an uneasiness about performance impact as well. Because of these concerns, many customers will deactivate the constant protection feature, instead choosing specific points at which to perform the protection task. This allows them to snapshot the server when they know the environment is “in sync” and to pick times when the server is not too busy doing other tasks. At that point, they have made their CDP solution into a near-CDP solution. Near-CDP involves scheduling the protection event, similar to backup, but occurring far more frequently (such as every hour, or as required, as opposed to once per night or week).
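To make the granularity difference concrete, here is a small Python sketch that compares how much work you stand to lose under different schedules. The function names and the times used (minutes since midnight) are assumptions chosen for illustration only.

```python
from bisect import bisect_right


def scheduled_points(start, end, interval):
    """Protection events every `interval` minutes from `start` through `end`."""
    return list(range(start, end + 1, interval))


def latest_recovery_point(points, failure_time):
    """Newest protection point at or before the failure, or None if there is none."""
    i = bisect_right(points, failure_time)
    return points[i - 1] if i else None


def data_loss_window(points, failure_time):
    """Minutes of changes lost if you restore to the latest recovery point."""
    point = latest_recovery_point(points, failure_time)
    return None if point is None else failure_time - point
```

For a failure at 14:47 (minute 887), an hourly near-CDP schedule gives `data_loss_window(scheduled_points(0, 1440, 60), 887) == 47`, while a single nightly backup at midnight gives `data_loss_window(scheduled_points(0, 1440, 1440), 887) == 887`.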


With CDP, however, the ability to recover data quickly is available but not necessarily enabled. Because of the lack of integration, the recovery process requires manual interaction with the file system; there is no “point and click” walk-through. You can, however, leverage the ability of most CDP snapshots to roll back to a specific point in time, and then move that saved version of the data back over to the source. Again, this is still a manual copy-back process with no guided walk-through. With some solutions you can even map directly to the target disk and be back up and running without having to actually move data. The data can then be reverse replicated to the original server while you are back in production on your secondary copy. One solution includes a boot disc for a bare metal recovery of your downed server, which can restore the server image from the target to the source server.


Storage Area Networks (SANs) add another wrinkle to this picture. Unless all your data (including boot data) is on the SAN, I believe that this type of protection should NOT be SAN-array based. This is because in most data centers, a majority of servers do not boot from the SAN; they boot off a local drive. In that case, changes to those systems will not be captured with SAN-based snapshots or CDP. If this is the case in your data center, it strengthens the value of being able to do a bare metal recovery from your CDP target. Many array manufacturers have tried to address this challenge by acquiring third-party replication technology to capture this local data, but in most cases this technology has not been fully integrated into the array technology.


SAN-based CDP and snapshots depend on the SAN array staying up and running: if your SAN array has a major hardware failure, you are down and all your data protection is gone as well. Directing this protection at a different target altogether gives you greater redundancy and resiliency. Also, as stated before, SAN-based solutions do not capture any data that resides locally on the attached servers.


CDP's Missing Link

A major issue with CDP is the amount of storage needed. A copy of the original data set, plus all the updates and changes, needs to be stored on disk. Typically, the more disk storage you have, the greater the range of points in time you can access for recovery. Even with the lower cost of SATA disk drives, this is an expensive proposition for anything more than a few days. In addition, I don't recommend that you use the absolute cheapest arrays you can get your hands on; you want to be able to count on the array for recoveries and have confidence that it will hold up and perform well, especially if you are going to boot from it. The cost of reliable, well-performing disk arrays adds up, so you typically will not want to use the space to retain this information for weeks and weeks; a few days is more likely.
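The arithmetic behind this is simple enough to sketch in Python. The model below is a rough assumption, not a vendor sizing formula: it treats every changed gigabyte as kept for the full retention window and ignores compression, deduplication and array overhead. The function names and the figures in the example are hypothetical.

```python
def cdp_capacity_gb(base_gb, daily_change_gb, retention_days):
    """Disk needed for one full copy plus every day's changes in the window."""
    return base_gb + daily_change_gb * retention_days


def affordable_retention_days(capacity_gb, base_gb, daily_change_gb):
    """Whole days of history that fit once the base copy is on disk."""
    if capacity_gb < base_gb:
        return 0  # not even the base copy fits
    return (capacity_gb - base_gb) // daily_change_gb
```

For a hypothetical 2,000 GB data set that changes by 100 GB a day, three days of history needs `cdp_capacity_gb(2000, 100, 3) == 2300` GB, while thirty days needs 5,000 GB, two and a half times the size of the data being protected.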


So, at some point you need to get a copy of this data moved to an archival device, either disk or tape. Most data that is more than a few days old has lost enough of its value that it is no longer cost-effective to store it on a standard array. Movement to some sort of nearline device makes more sense. This is where most CDP and near-CDP solutions fall short: they are not set up for an integrated move from the CDP storage to archive storage.


Data Protection Over-Protection, Part II

How do you move your data to long-term archive? The most common solution is to continue to run your backup process on the same servers on which you are doing CDP. The problem with this is that you are not leveraging the benefit of the CDP solution at all. With CDP, you have already moved all the data across the network to a single repository. Why move all that data again? In many cases, this backup process will still be directed at a disk target (not the same as the CDP) causing yet another disk expenditure, before being moved to an archive. In addition, you have the hard costs of two sets of software licenses, most likely two sets of agents for your applications (an Exchange module for example) and two maintenance contracts with different companies. At this point, the cost effectiveness of CDP begins to unravel and so the deployment becomes limited to just a few servers, rather than every server as was probably originally intended.


Another option is to just back up the CDP store with your backup application. This at least gives the advantage of moving the data to the archive without re-running it across the network. The problem with this is that you lose the original relevance and pointers from where the data was actually stored. Without these markers, finding and recovering long-term data is very difficult and in many cases requires two steps.


The ideal solution to this is to integrate the backup and CDP or near-CDP processes into a single task. In fact, it is really not integration of two tasks; it is simply consolidation of tasks so that one task -- Data Protection -- is controlled by one interface. This interface should be able to direct data near-continuously to a near-CDP target and then, at defined intervals, move the data as a backup to an archive target. Having this type of solution in place gives you benefits including:


1.) The ability to protect your changing data at very frequent intervals


2.) The ability to quickly recover that protected data from the backup via either a GUI or manual process


3.) The ability to bring your application back online quickly via an iSCSI map for near-instant availability and business continuity


4.) The ability to perform a server image recovery


5.) The ability to keep the storage needs of the CDP disk target to a minimum by frequent, easy transfer to archival medium


6.) The ability to use this same GUI to recover older data from the archive as well as the short-term storage
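One way to picture such a consolidated interface is a single catalog that tracks every protected version, whichever tier it currently lives on. The Python sketch below is only an illustration under assumed names (`ProtectionCatalog`, `Version`, the "cdp" and "archive" tier labels, times in hours); it does not describe any specific product.

```python
from dataclasses import dataclass


@dataclass
class Version:
    path: str       # original location on the source server
    timestamp: int  # capture time, in hours
    tier: str       # "cdp" (short-term disk) or "archive"


class ProtectionCatalog:
    """Hypothetical single catalog covering the near-CDP disk and the archive."""

    def __init__(self, cdp_retention_hours):
        self.cdp_retention_hours = cdp_retention_hours
        self.versions = []

    def capture(self, path, timestamp):
        """Record a newly protected version on the near-CDP tier."""
        self.versions.append(Version(path, timestamp, "cdp"))

    def migrate(self, now):
        """Age old versions off to the archive; their catalog entries survive."""
        moved = 0
        for version in self.versions:
            expired = now - version.timestamp > self.cdp_retention_hours
            if version.tier == "cdp" and expired:
                version.tier = "archive"
                moved += 1
        return moved

    def find(self, path, as_of):
        """Newest version of `path` at or before `as_of`, on whichever tier."""
        candidates = [v for v in self.versions
                      if v.path == path and v.timestamp <= as_of]
        return max(candidates, key=lambda v: v.timestamp, default=None)
```

Because `migrate` changes only the tier and keeps the original path and timestamp, the same `find` call locates a version whether it is still on the near-CDP disk or has moved to the archive, which is the one-step recovery the list above describes.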



While this may seem a simple, practical and obvious solution, only a few suppliers offer anything along these lines. I also don't expect this to be a common offering any time soon. Too many suppliers have taken the easy way out by purchasing, or OEMing, a product to try to fulfill the rest of the equation. It will take them far too long to integrate these pieces into this type of complete solution, if they ever do. The few suppliers that have perfected this solution are in a unique position to solve a real customer problem and make many IT professionals very happy in the process.

 

Monday, October 29, 2007

 
 