How To Judge Purpose Built Backup Appliance Performance
Purpose Built Backup Appliances (PBBAs) are designed to improve the backup process by decreasing backup and recovery windows across multiple applications and operating platforms. Because they use disk-based technology, they are less sensitive to network latency than tape and are far faster at locating data to be restored. Just how quickly these systems can ingest data is often the subject of debate between vendors and a source of confusion among users.
The problem is that most vendors and publications that look at PBBAs don’t really make a fair comparison. Vendors usually don’t take into account architectural differences between products that are integral to how performance is delivered. This makes an apples-to-apples comparison especially challenging with PBBAs, because they are architected in dramatically different ways and have different performance-impacting features. Publications typically do not know how a vendor’s system should be configured for optimal performance, or they test systems of different cost and capacity ratings, which gives a skewed picture of the performance achieved.
The Reality of PBBA Performance
Another PBBA challenge is that both the vendor tests and the publication tests often have little basis in reality. Since they’re not actually buying the equipment, the systems they test are frequently the largest or fastest configurations available. Most backup managers, whether they work for a large or small organization, don’t usually have the budget required to get this kind of maximum performance. They don’t care how fast the unit will go, only if it will be fast enough to solve their backup performance issues. In most cases they have a fixed budget and are going to try to get the best performance possible within the confines of that budget.
The fact that vendor A can deliver 10 TB per hour in a system that costs $200K means nothing if the customer evaluating that system has only $50K in the backup upgrade budget. A more realistic performance analysis would measure how much performance each PBBA can deliver at a given price point. A better question to ask might be “how many TB per hour will the PBBA be able to provide for less than $50K?” ExaGrid, as an example, can configure a system that delivers 4.8 TB per hour for that same $50K.
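The budget-first question can be made concrete with a small sketch. The Python below is only an illustration: the `Config` type and `best_within_budget` helper are hypothetical names, and the only data points are the two price and throughput figures quoted above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Config:
    """One PBBA configuration (hypothetical record type for this sketch)."""
    name: str
    ingest_tb_per_hour: float
    price_usd: int


def best_within_budget(configs: Iterable[Config], budget_usd: int) -> Optional[Config]:
    """Return the fastest configuration priced at or under the budget, or None."""
    affordable = [c for c in configs if c.price_usd <= budget_usd]
    return max(affordable, key=lambda c: c.ingest_tb_per_hour, default=None)


# The two figures quoted in the text; a real evaluation would list every quote received.
configs = [
    Config("Vendor A, large configuration", 10.0, 200_000),
    Config("ExaGrid, example configuration", 4.8, 50_000),
]

choice = best_within_budget(configs, budget_usd=50_000)
print(choice.name if choice else "No configuration fits the budget")
```

The point of filtering first and ranking second is that the fastest system overall never enters the comparison unless it is actually affordable.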
The other fact that’s not taken into account when PBBAs are compared is that performance is only one variable in the decision making process. The cost to add capacity and performance, as well as how difficult the device is to use on a day-to-day basis, are equally important to users. Each of these figures into a customer’s buying decision. Scalability, for example, is a key criterion for PBBAs. An infrastructure like that available from ExaGrid has been proven, through hundreds of customer testimonials, to incrementally scale performance without having to endure a fork-lift upgrade.
While performance is just one variable in the decision making process, it is of course an important one. Most users will cite performance as a top motivation for considering a PBBA. Understanding how vendors arrive at the performance numbers they claim is important in understanding how the system will actually perform in the user’s environment. The numbers are also an important guide because, while it’s always recommended that the test environment emulate the production environment, in reality that is seldom possible.
There are many factors that impact a PBBA’s ability to hit a performance number. Some that are tough to examine are the quality of the software engineering and how efficiently it ingests data. Clearly, some vendors are more advanced at this than others. Quality of code is something that’s hard to determine from a spec sheet.
There are three areas that need to be analyzed when considering a PBBA: ease of connectivity, deduplication and the architecture of the system. The first two can impact immediate performance, and the third may impact future performance as well as cost.
Architecture Matters
When comparing products, the architecture of the systems needs to be taken into account. There are two basic architectures that need to be understood: scale up and scale out. ExaGrid is an example of a scale out architecture, while most of the other vendors in the mid-range PBBA market use a scale up, single controller architecture. In a scale out architecture, performance and capacity increase as nodes are added to the system. Essentially, the scale out design’s worst performing day is its first, and it becomes faster as more capacity is added to the system, since each node also contains processing power and I/O.
It is important to understand how the scale out software balances that performance across the available nodes. Typically a fair amount of parallelism is needed to maximize scale out storage performance. The good news is that in backup there are plenty of potential parallel data generators in the form of clients, which typically contain very similar information.
A scale up architecture is the opposite. Its best performing day is the first day it’s installed, assuming it was not bought completely full. As drives and drive shelves are added to the system the controller has to work harder as it communicates to more and more drives. It also typically has a limited amount of I/O bandwidth per controller.
Scale up architectures may have an advantage when backing up low numbers of large clients where parallelism is not needed. But for the broader backup use case, they may struggle when dozens of clients write through their single set of controllers simultaneously.
Once the capacity or bandwidth limits of a scale up system’s controller are reached, a second system must be added, which also adds administrative complexity. Alternatively, the current system must be replaced with one that has a newer, more powerful set of controllers. In some cases, but not all, the disks can be moved to the new controllers. If this can’t be done, all the backup history is lost and the learning that makes deduplication so efficient must start over.
The biggest mistake in comparing a scale out architecture to the more common scale up architectures typically occurs when someone compares a single scale out node, or even a few nodes, to a controller capable of handling significant disk expansion. For example, a single scale out node that is only capable of ingesting 2.4 TB per hour may be compared to a single controller scale up system achieving 5.4 TB per hour. On the surface the scale up system seems faster, but as nodes are added to the scale out storage system its performance scales, often in a near linear fashion. In this example, at two nodes the scale out system will ingest 4.8 TB per hour and at three it will ingest 7.2 TB per hour, eclipsing the scale up, single controller system.
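The arithmetic in this example can be written down as a short sketch. It assumes perfectly linear scaling, which is an idealization of the "near linear" behavior described above, and the function names are made up for illustration.

```python
import math


def scale_out_ingest(nodes: int, per_node_tb_per_hour: float = 2.4) -> float:
    """Aggregate ingest rate, assuming each added node contributes its full rate."""
    return nodes * per_node_tb_per_hour


def nodes_to_exceed(target_tb_per_hour: float, per_node_tb_per_hour: float = 2.4) -> int:
    """Smallest node count whose aggregate ingest is strictly above the target."""
    return math.floor(target_tb_per_hour / per_node_tb_per_hour) + 1


SCALE_UP_TB_PER_HOUR = 5.4  # single controller figure from the example

for n in range(1, 4):
    print(f"{n} node(s): {scale_out_ingest(n):.1f} TB per hour")

print("Nodes needed to pass the scale up system:", nodes_to_exceed(SCALE_UP_TB_PER_HOUR))
```

Real clusters lose some throughput to coordination overhead, so a measured per-node figure at the intended cluster size is a better input than a single-node spec sheet number.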
When evaluating the performance of a system end-users need to look at what their performance needs are right now and then decide how far those needs may grow in the near future. The cost to upgrade to the next level should be factored into the overall economics of the storage system.
Another way to evaluate PBBAs is to factor in the cost per TB per hour of ingest rate. It is likely that the one node and two node configurations, and potentially the three node configuration, mentioned above will be less expensive than the scale up system. Combining both of these methods provides the most realistic price/performance analysis.
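As a rough illustration of the cost-per-ingest-rate metric, the sketch below divides price by throughput. The article does not give per-node prices, so the only inputs used are the two price points quoted earlier ($200K for 10 TB per hour and $50K for 4.8 TB per hour); the function name is hypothetical.

```python
def cost_per_tb_per_hour(price_usd: float, ingest_tb_per_hour: float) -> float:
    """Dollars paid for each TB per hour of ingest capability."""
    if ingest_tb_per_hour <= 0:
        raise ValueError("ingest rate must be positive")
    return price_usd / ingest_tb_per_hour


# (price in USD, ingest rate in TB per hour), taken from the figures quoted earlier.
systems = {
    "10 TB per hour system at $200K": (200_000, 10.0),
    "4.8 TB per hour system at $50K": (50_000, 4.8),
}

for name, (price, rate) in systems.items():
    print(f"{name}: ${cost_per_tb_per_hour(price, rate):,.0f} per TB per hour")
```

A lower figure means more ingest capability per dollar, but the metric only matters for systems that already fit the budget and meet the required ingest rate.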
In theory scale up storage systems should be less expensive than scale out storage systems from day one. This is often because several scale out nodes need to be purchased upfront to meet the performance demands of the environment and to build the initial cluster. Theory does not always translate to reality though.
It is doubtful that a single controller PBBA will be bought to just barely meet today’s performance and capacity demands. More likely, it will be bought to meet the demands of the next few years. This means that extra expense is incurred upfront to put off a future upgrade. In addition, these PBBAs need to account for deduplication processing, which adds to the initial expense as well. It comes as no surprise to IT professionals, but it is worth a reminder, that processing and bandwidth get less expensive over time. Paying upfront for something that you know will be less expensive in a few months is not a good use of budget dollars.
In most cases the initial performance of the two architectures, if designed correctly, should be about the same, with scale up being potentially a little better upfront to compensate for future performance needs. The architectures diverge when a scale up system needs to be upgraded because it has reached its performance maximum. Whether it is a replacement or just an add-on scenario, the expense can be significant. A properly designed scale out architecture should never, or at least far less often, need a complete fork-lift upgrade.
Custom Connectivity
The ease of connecting a PBBA comes from the almost universal choice of Ethernet as the physical connection. In the early stages of PBBA development this meant that the system had to be addressed via a network share point, making it essentially as easy to set up as a NAS. As long as the backup application was able to back up to a NAS device, the chances of a quick and successful implementation were high.
Unfortunately, standard file sharing protocols like NFS and CIFS are not optimized for backup traffic workloads, and performance is sacrificed in favor of this simplicity. Recently Symantec began shipping its OpenStorage API, which includes the ability to produce a more optimized transmission across Ethernet networks. This is the configuration that most PBBA vendors use in their highest performing Ethernet test results. Using these numbers as part of the basis for a PBBA comparison is only valid if the backup application is actually Symantec NetBackup or Backup Exec.
If Symantec is not a key backup vendor for the organization then it makes more sense to compare the backup numbers with standard NFS and CIFS. These are often more chatty protocols and can potentially cause performance issues on single head appliances. Non-optimal data transmission is a real-world condition and testing how well the PBBA handles it is critical to understanding how well the PBBA will do in a user’s environment.
Deduplication
Deduplication is potentially the primary reason for the popularity of PBBAs. While not matching the dollar per GB of tape the technology certainly narrowed the gap. Deduplication works by segmenting data and comparing it to data already stored on the device. If redundant data is found at some point it is eliminated, either as it’s being received or at a later time, as part of a post-process comparison.
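The segment-and-compare idea can be sketched in a few lines. This is a toy illustration, not how any particular PBBA implements deduplication: the fixed 4 KB segment size, the SHA-256 fingerprint and the in-memory dictionary used as the segment store are all assumptions chosen for brevity.

```python
import hashlib
from typing import Dict, List


def deduplicate(data: bytes, store: Dict[str, bytes], segment_size: int = 4096) -> List[str]:
    """Split data into fixed-size segments and keep each unique segment once.

    Returns the list of fingerprints (the "recipe") needed to rebuild the data.
    """
    recipe = []
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]
        fingerprint = hashlib.sha256(segment).hexdigest()
        store.setdefault(fingerprint, segment)  # only stored if not seen before
        recipe.append(fingerprint)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


store: Dict[str, bytes] = {}
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096  # only the last segment changed

monday_recipe = deduplicate(monday, store)
tuesday_recipe = deduplicate(tuesday, store)

stored = sum(len(s) for s in store.values())
print(f"Received {len(monday) + len(tuesday)} bytes, stored {stored} bytes")
assert restore(tuesday_recipe, store) == tuesday
```

Whether the comparison runs as data arrives or later as a post-process changes when `deduplicate` is called, not what it does.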
While there has been much debate over where and when deduplication should occur, many of the downsides to both techniques have been addressed by each vendor. The overhead of deduplicating data as it is being received can be offset by ‘super-sizing’ the RAM and processing power in the PBBA. The overhead of deduplicating data as a post process can be offset by providing additional storage capacity to hold the data until the deduplication process is complete.
In most cases when and where deduplication is done should now be performance neutral. However, there is a cost associated with that neutrality, and it’s reasonable to assume that the additional storage needed for the landing area is less expensive than the processing power and RAM required to check data inflight.
It is important to note that backup deduplication is different than primary storage deduplication. PBBAs always have the specter of beating the cost of tape, so pricing is important. Most primary storage controllers already have processing power to spare and are compensating for data redundancy in the form of snapshots and thin provisioning. This is not the case with a PBBA and the cost to compensate for inline deduplication performance should be factored in.
Summary
The key to a successful long term PBBA investment is to overlay the performance needed on the available budget. In most cases the budget will dictate how much performance you will get more so than the performance you want. Finally, compare the performance available at that budget level to expected growth in the environment over the next 3-5 years. Make sure that the system purchased can either scale to meet that growth or that there is budget set aside at some mid-point to upgrade the system.
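One way to check a candidate system against expected growth is to project the required ingest rate forward. The sketch below is a simplification under stated assumptions: it treats the nightly backup as a single volume that must fit in a fixed window, it ignores incremental backups and deduplication effects, and every input value (20 TB, an 8 hour window, 25% annual growth) is a placeholder rather than a figure from this article.

```python
def required_ingest_tb_per_hour(backup_tb: float, window_hours: float,
                                annual_growth: float, years: int) -> float:
    """Ingest rate needed to finish the backup inside the window after `years` of growth."""
    if window_hours <= 0:
        raise ValueError("backup window must be positive")
    projected_tb = backup_tb * (1 + annual_growth) ** years
    return projected_tb / window_hours


def meets_growth(system_tb_per_hour: float, backup_tb: float, window_hours: float,
                 annual_growth: float, years: int) -> bool:
    """True if the system can still complete the backup in the window after `years`."""
    return system_tb_per_hour >= required_ingest_tb_per_hour(
        backup_tb, window_hours, annual_growth, years)


# Placeholder inputs for illustration only.
BACKUP_TB, WINDOW_HOURS, GROWTH = 20.0, 8.0, 0.25

for year in (0, 3, 5):
    need = required_ingest_tb_per_hour(BACKUP_TB, WINDOW_HOURS, GROWTH, year)
    print(f"Year {year}: need about {need:.2f} TB per hour")
```

If the system that fits today's budget fails `meets_growth` at year three, that is the mid-point where either an upgrade budget or a scalable architecture needs to be in place.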
Judging PBBA performance can be a confusing task, but it doesn’t have to be if the focus is placed on the needs of the organization. As stated above, narrow the criteria by understanding exactly what performance is needed in your environment and what is available given your budget, not by asking what the fastest system on earth is.
ExaGrid is a client of Storage Switzerland
Wednesday, November 30, 2011
George Crump, Senior Analyst