For example, we never published a head-to-head test when looking at backup deduplication, because there were too many variables we could not control. What we learned in that round of backup deduplication lab work made its way into a report entitled "Buyers Beware - All Data deduplication is not created equal," published in October of 2007, when backup deduplication was front and center.


A primary goal for this project, then, was to keep the scope very tight, resisting the temptation to test everything and every vendor. This was deliberate, and it meant we could build a baseline against which we could do future testing. We kept it to two vendors, and we tested on a specific type of primary storage, one that I believe is very common in the enterprise and that we call near-active.


Near-active data is data that resides on primary storage and is active, but not live at that moment in time. It is the Word document you created yesterday, not the Word document you are working on right now. The document you are working on right now, along with databases and VMware images, is what we refer to as the very active data set. Clearly this is a different data set and needs to be handled differently, possibly with different tools. We are in the process of devising a testing mechanism for that as well.


Other responses have asked why we did not test various other data sets and capabilities that were not included in the report. Popular requests so far have centered on performance, VMware images, Oracle Database and compliance capabilities like WORM. As I stated above, the goal was to keep the testing scope very tight so we could produce something of value in a timely fashion.


On the performance request specifically: while we did not explicitly test for performance differences, there did not seem to be any dramatic difference that we could see. Indeed, it would take a stopwatch to measure the time differences between the two, indicating they are likely to be closely matched.


The other criterion we were testing in this report was the ability to deduplicate that storage in place, as well as the ability to move that optimized data to a different tier of storage. I know that certain vendors could work together to deliver this functionality, but for this initial test we wanted to use products that were self-contained.


Now that the baseline is set, we will be glad to test other vendors and combined solutions going forward. As I stated in the report, this was a for-fee activity; our policy is that we don’t do such tests at our own expense. Any other vendors that would like to have their product tested against this baseline will have to agree to the same conditions that Ocarina did. That is, Storage Switzerland will have full control of the test. We use our own data set, and we report what we find regardless of whether the results are favorable or unfavorable to the company that has requested and funded the test. We will give you the courtesy of an advance review.


The last point I would like to make is that the results of the lab test do not change my overall opinion of NetApp. As many people within NetApp will tell you, I like their products, and they should certainly be considered for a wide range of primary storage offerings. Further, I don't really have an issue with their dedupe functionality. It is, after all, free, and it works very well on active VMware images in addition to near-active data.


You as the customer, as always, have the ultimate vote. If the base deduplication functionality that comes with the NetApp appliance is adequate to meet your needs, then you may very well decide to look no further. It is safe, reliable and seemed to perform relatively well. If you need to extend NetApp's dedupe functionality, as any customer may need to do with any storage system, there are viable options, Ocarina Networks being one of them. As we cited in the report, Ocarina extends the basic dedupe functionality with content-aware compression, deduplication and the ability to move data to a different tier of storage. This is the type of functionality that will serve a number of customers' needs.