Cloud NAS as a First Step


The cloud NAS gateway that many storage providers offer translates between data center network storage protocols (CIFS/NFS) and a less chatty, more internet-optimized protocol. Its advantage is that it is a viable quick fix for getting data moving to the cloud, and it supports those ‘one-off’ data movements that an application can account for.


A cloud NAS is an ideal solution for an end customer looking to start using the cloud for archiving, as discussed in Storage Switzerland's Using Cloud Archive white paper. It enables a customer to begin archiving data to the cloud almost as easily as sending data to a local network-mounted disk. If data can be pointed at a share, it can be stored in the cloud. This allows for rapid adoption of cloud storage as an archive destination.


The challenge with a cloud NAS gateway is that this simple, general-purpose deployment comes at the expense of control, customization and application functionality. The generic approach makes it harder for ISVs to support the specific cloud storage capabilities that their software product needs or that their market requires. This has led cloud storage providers to begin offering ISVs an enhanced option in the form of an API set.



The Cloud Storage API


What does a cloud storage API do for an ISV? First and most importantly, it allows for native communication between the software application and the cloud storage provider. This eliminates the need for the ISV's application to depend on the appliance for data movement. In addition, it gives the ISV greater control over how and where data is stored, and may even save development time because the ISV does not have to reinvent a wheel that is already available. Finally, API sets like Iron Mountain’s Archive Services Platform API can provide the application with functionality beyond simple cloud storage.


An integrated API set also gives the end user a simpler means of interacting with the secondary storage tier and greater control over how data is stored or retrieved. Because the user interacts with the data from within the application, separate external data management procedures should no longer be needed. Adherence to storage and retention policies should also become more consistent, because those policies can be set while the data is live instead of going back and trying to classify data years after it was created.


Communicating efficiently with the cloud is just the bare minimum of what an API should do for the ISV. If all the API does is provide an additional storage point, it has little more value than support for the latest disk-to-tape device. Cloud storage APIs like Iron Mountain's Archive Services Platform API do much more. For example, the Archive Services Platform API uses either SOAP or RESTful Web services to expose its interfaces, which include ingestion, search, retrieval, retention, and destruction capabilities.


APIs like this can add capabilities to an ISV's application without much additional development effort. This is a critical advantage for the ISV, because an advanced cloud storage API set can speed time to market for new features and reduce both the investment and the time needed to develop them. Finally, it enables the ISV to keep its customers happy in a very competitive marketplace.



API-enabled Capabilities


Content Search Capability - Directly from the application's programming language, the API set should provide the ability to search the content of the data stored through it. Once the API command is given, the processing required to chew through the data and create the index should all be done on the back end, by the cloud storage system. Through the API, the ISV sends content to the cloud storage service. The content is then (optionally) indexed during ingestion, and subsequent commands allow the ISV to query the index and retrieve results.


These queries can be run against the core content itself or against metadata specified during ingestion. The query command can return the full set of results, or results can be requested a “page” at a time. Paging is most useful for applications that let the user browse through very large result sets. The results returned from a query can also include (some) content. This is ideal for an ISV that wants to display the result set to the user without having to issue subsequent retrieval requests to paint the search results screen.
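
The ingest, index, and paged-query flow described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class `InMemoryArchive`, its `ingest` and `query` methods, and all parameter names are hypothetical and are not taken from Iron Mountain's or any other vendor's API. A real service would build the index on its back end rather than in the client process.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ArchiveItem:
    item_id: str
    content: str
    metadata: dict = field(default_factory=dict)


class InMemoryArchive:
    """Hypothetical stand-in for a cloud archive service; not a vendor API."""

    def __init__(self):
        self._items = {}   # item_id -> ArchiveItem
        self._index = {}   # lowercase word -> set of item_ids

    def ingest(self, item_id, content, metadata=None, index=True):
        """Store an item with its metadata; content indexing is optional."""
        self._items[item_id] = ArchiveItem(item_id, content, dict(metadata or {}))
        if index:
            for word in set(re.findall(r"\w+", content.lower())):
                self._index.setdefault(word, set()).add(item_id)

    def query(self, term=None, metadata=None, page=1, page_size=10, snippet_len=40):
        """Return one page of matches by content term and/or metadata values."""
        if term is not None:
            ids = set(self._index.get(term.lower(), set()))
        else:
            ids = set(self._items)
        for key, value in (metadata or {}).items():
            ids = {i for i in ids if self._items[i].metadata.get(key) == value}
        ordered = sorted(ids)
        start = (page - 1) * page_size
        return {
            "total": len(ordered),
            "page": page,
            "results": [
                {
                    "id": i,
                    "metadata": self._items[i].metadata,
                    # A short content excerpt, so a results screen can be
                    # painted without a second retrieval request.
                    "snippet": self._items[i].content[:snippet_len],
                }
                for i in ordered[start:start + page_size]
            ],
        }


archive = InMemoryArchive()
archive.ingest("msg-001", "Quarterly invoice for the storage contract", {"dept": "finance"})
archive.ingest("msg-002", "Invoice dispute escalated to legal", {"dept": "legal"})
archive.ingest("msg-003", "Lunch menu for Friday", {"dept": "finance"})

first_page = archive.query(term="invoice", page=1, page_size=1)
finance_only = archive.query(term="invoice", metadata={"dept": "finance"})
```

Here `first_page` carries the total match count plus a single result, which is what a browsing interface needs to render page one, while `finance_only` narrows the same content query by a metadata value supplied at ingestion.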


Optimized Retrieval - Advanced API sets should also provide commands that tell the cloud storage system to store data in a parsed format, so that the organization can retrieve just the components of the data that are needed. Making these items individually addressable is a vital workaround for the latency of the cloud.


For example, if the ISV is archiving email, they may want to command the API to parse the data by header, body and attachments. Then, when a search is performed, only the header and/or body, which are essentially text, need to be pulled across the internet connection. This is helpful because a range of emails often needs to be searched to find the exact message desired. When that one email is found, its attachments can be brought across the internet with it, if needed.
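
As a rough illustration of the email case, the sketch below splits a message into header, body, and attachments using Python's standard `email` package. The function names `split_email` and `build_sample`, the choice of header fields, and the sample message are assumptions made for this example. In the scenario described above the parsing would be performed by the cloud storage system on command from the API, not by client code like this.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def split_email(raw: bytes) -> dict:
    """Parse a raw message into separately addressable components."""
    msg = message_from_bytes(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    return {
        # Small, text-only parts that a search would typically need.
        "header": {
            name: str(msg[name])
            for name in ("From", "To", "Subject")
            if msg[name] is not None
        },
        "body": body_part.get_content() if body_part is not None else "",
        # Potentially large parts, fetched only once the message is found.
        "attachments": {
            part.get_filename(): part.get_content()
            for part in msg.iter_attachments()
        },
    }


def build_sample() -> bytes:
    """Build a sample message with a 50,000-byte attachment (illustrative data)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Q3 report"
    msg.set_content("Please see the attached report.")
    msg.add_attachment(
        b"\x00" * 50_000,
        maintype="application",
        subtype="octet-stream",
        filename="report.bin",
    )
    return msg.as_bytes()


parts = split_email(build_sample())

# Only these components would need to cross the wire during a search.
searchable = {"header": parts["header"], "body": parts["body"]}
```

The point of the split is the size difference: the `searchable` components are a few dozen bytes of text, while the attachment is tens of kilobytes that stay in the cloud until the user actually asks for them.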


Retention and Compliance Controls - The API set should also enable the ISV to programmatically set retention and WORM levels within the application itself. The ISV could have these options set by an administrator or by the user from the GUI, or set automatically based on an internal analysis of the data. Retention commands should include the ability to set the length of time data is to be stored, the number of copies to be kept and whether the data can be modified. Finally, the API set should allow for the assured destruction of data, confirming that the data is no longer recoverable.
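
The retention settings listed above (retention period, copy count, and whether data can be modified) can be modeled in a few lines. This is a minimal sketch under stated assumptions: `RetentionPolicy`, `RetainedStore`, and `RetentionError` are hypothetical names rather than part of any real API, and the `destroy` method only models the policy check and confirmation. Assured destruction in a real service also involves making every stored copy unrecoverable, which this sketch does not attempt.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    retain_days: int      # how long the record must be kept
    copies: int = 2       # number of copies the service should hold
    worm: bool = True     # write once, read many: no modification allowed


class RetentionError(Exception):
    """Raised when an operation would violate a record's retention policy."""


class RetainedStore:
    """Hypothetical in-memory model of policy enforcement; not a vendor API."""

    def __init__(self):
        self._records = {}  # record_id -> [content, policy, stored_on]

    def store(self, record_id, content, policy, today):
        if record_id in self._records:
            raise RetentionError(f"{record_id} already exists")
        self._records[record_id] = [content, policy, today]

    def modify(self, record_id, content):
        record = self._records[record_id]
        if record[1].worm:
            raise RetentionError(f"{record_id} is WORM-protected")
        record[0] = content

    def destroy(self, record_id, today):
        """Delete a record once its retention period has passed."""
        _, policy, stored_on = self._records[record_id]
        expires = stored_on + timedelta(days=policy.retain_days)
        if today < expires:
            raise RetentionError(f"{record_id} is retained until {expires}")
        del self._records[record_id]
        return record_id not in self._records  # confirmation of removal


store = RetainedStore()
seven_years = RetentionPolicy(retain_days=365 * 7)
store.store("contract-42", "signed contract", seven_years, today=date(2020, 1, 1))
```

Passing the current date in explicitly keeps the sketch deterministic. The application sets the policy at the moment of archiving, which is the behavior the surrounding text argues for.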


Metadata Tagging - As part of the movement to the cloud storage device, the API set should also allow metadata tagging. Metadata tagging allows for keywords or reference points to be set on files. This metadata can be established to help with search criteria, retention criteria or compliance criteria.


As stated earlier, being able to build in search optimization, pre-parse data for retrieval, and set retention, compliance and metadata up front at the point of archive goes a long way toward those feature sets actually being used. Traditionally, classification projects are very large and involve processing data long after it has been written and the specifics of that data have been forgotten. A feature-rich API set allows you to build classification into the process, so that maintaining an interactive archive becomes a quick, simple, upfront task that is performed consistently. Classifying up front is also likely to increase the accuracy of the classification, because the application usually has the most context at that point (and could even interact with a user).



The Company Behind the API


Any relationship is about more than just the technology, especially one where there is a development effort and associated cost to support the API set. This ongoing development effort and cost is even more in the spotlight when the data that will be hosted in that environment is designed or required to be stored for years. When it comes to cloud storage, the ISV should look for a partner, not just a technology.


First, ISVs should consider the track record and brand value of the storage provider they choose to work with. Will the track record and longevity of the API and cloud storage provider give the ISV's customers additional confidence? Iron Mountain, for example, has been delivering cloud storage solutions for over 10 years and has been the leader in securely storing information since 1951.


Second, ISVs should look at how securely their cloud storage partner will store their customers’ data. The first step is to make sure that data sent through the API can be encrypted during transmission and remain encrypted when stored at the cloud storage provider’s facility. Not only does this prevent the interception of data in flight, it also ensures that only that customer can read their data.


The ISV should also understand that security is more than just encryption, which is essentially software protection. The ISV should look at the hard assets of the cloud storage provider they choose to partner with and be confident that it has invested in redundant power, connectivity and even data centers. The partner should be able to store multiple copies of the data, not only in geographically dispersed locations but also locally. This gives the ISV the comfort of being able to recover from a failure or data corruption locally as well as globally.


Finally, each of the physical storage facilities needs to have high levels of secure access. Iron Mountain, as an example, goes so far as to house its facilities in underground bunkers with their own security teams and fire departments. It has been providing these capabilities for decades.




Cloud Storage APIs, like Iron Mountain's, provide ISVs with the ability to leverage the credentials of their partners to help attract new customers, as well as expand the value of their solutions to existing customers. The ISV needs to look for a model that allows them to increase customer satisfaction by solving customer problems, while at the same time potentially providing an additional recurring revenue stream for the ISV.


Cloud storage can no longer be ignored by ISVs. The importance of a cloud storage API is that it allows the ISV to answer customer demand for alternative ways to store and manage unprecedented volumes of data without incurring significant OPEX or CAPEX costs.

 

George Crump, Senior Analyst