Thin Aware File Systems vs. Thin Aware Storage Systems
Thin provisioning is quickly becoming a required feature when looking for a new storage system. Ideally, a thin storage environment should consume only the amount of physical storage strictly required to support the application data on the hosts and nothing more. The challenge is that most file systems are not thin aware. As a result, thin volumes don't 'stay thin' over time, and in many cases never even start out thin because of how the initial migration from legacy storage to thin storage is performed. Thin aware file systems can address the major challenges that thin storage environments face, allowing users to get the full benefits of thin provisioning in terms of operational simplicity and storage cost savings.
Wednesday, March 31, 2010
The ideal world for a thinly provisioned volume is one where the volume is brand new and only new data is going to be placed on it. In other words, data is not going to be migrated to it from another volume on a different storage system. Ideally, data growth and deletion on the volume are also relatively slow and continuous; in other words, there is no sudden ingest of TBs of data and no mass deletion of data. Unfortunately, ideal is not reality, and in the real world thinly provisioned volumes sometimes become 'chunky' and end up not being as thin as they could be. While they are still more space efficient than legacy thick volumes, it would be better if thin volumes could more effectively address the issues surrounding migration and reclamation.
The problem stems from the fact that all thin storage by default follows the “write once, allocate forever” principle. When a file system writes data to blocks of a thin LUN, the storage system allocates pages of physical storage to support that data. But when that same data is deleted by a traditional file system, the file system just updates its own metadata. As a result, the data is deleted from the file system, but it remains on disk and continues to consume physical storage (this is essentially what enables you to ‘undelete’ something in Windows). The next time the file system needs to write data, it may decide to place that data on a brand new set of blocks on the thin LUN and trigger additional allocation of pages of physical storage. In fact, some file systems, NTFS being a good example, are particularly lazy about reusing the space of deleted areas. They instead progressively write across the full range of the file system, thereby causing the thin LUN to essentially become thick.
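The effect described above can be illustrated with a small simulation. This is only a toy model, not any vendor's implementation: the class names, the page size of 4 blocks, and the "always use fresh blocks" allocation policy are all assumptions made for the illustration.

```python
PAGE_SIZE = 4  # blocks per physical page (illustrative value, not from any real array)


class ThinLun:
    """Toy thin LUN: a physical page is allocated on first write and never freed."""

    def __init__(self):
        self.allocated_pages = set()

    def write(self, block):
        # Any write to a block allocates the page that contains it.
        self.allocated_pages.add(block // PAGE_SIZE)


class NaiveFileSystem:
    """Toy file system that always writes to fresh blocks instead of reusing deleted ones."""

    def __init__(self, lun):
        self.lun = lun
        self.files = {}      # file name -> list of block numbers
        self.next_block = 0  # only ever advances (assumed worst-case policy)

    def create(self, name, nblocks):
        blocks = list(range(self.next_block, self.next_block + nblocks))
        self.next_block += nblocks
        for block in blocks:
            self.lun.write(block)
        self.files[name] = blocks

    def delete(self, name):
        # Only file system metadata changes; the LUN is never told.
        del self.files[name]

    def used_blocks(self):
        return sum(len(blocks) for blocks in self.files.values())


lun = ThinLun()
fs = NaiveFileSystem(lun)
for i in range(5):
    fs.create(f"file{i}", 8)
    fs.delete(f"file{i}")

print(fs.used_blocks())          # 0
print(len(lun.allocated_pages))  # 10
```

After five create-and-delete cycles the file system holds no data at all, yet the LUN still has ten pages allocated, which is the "write once, allocate forever" behavior in miniature.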
All file systems keep track of used and unused space. While traditional file systems can only manage used and unused space internally, thin aware file systems can automatically coordinate with the underlying thin storage array and ensure that the unused space in the file system does not consume any physical space in the array. This requires some integration between the host file system and the thin storage array, so that the host can efficiently and nondisruptively inform the storage array of the location of all the unused space in the file system. Veritas Storage Foundation by Symantec is the first host storage management solution to deliver this level of integration with thin storage arrays, delivering a solution that can automatically ensure that a thin storage system is optimized and consumes only the amount of physical storage required to support actual application data.
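As a rough sketch of that coordination, the host can turn its free-space map into extents and pass them to the array, which then releases any page that is completely unused. Everything here is hypothetical: `ThinArray`, its `unmap` method, `free_extents`, and the page size are invented for the illustration and do not represent the Storage Foundation interface or any array's API.

```python
PAGE_SIZE = 4  # blocks per physical page (illustrative)


class ThinArray:
    """Toy thin array that allocates pages on write and can release them on request."""

    def __init__(self):
        self.allocated_pages = set()

    def write(self, block):
        self.allocated_pages.add(block // PAGE_SIZE)

    def unmap(self, start, length):
        """Hypothetical reclaim call: free pages fully inside [start, start + length)."""
        first_full = -(-start // PAGE_SIZE)        # round start up to a page boundary
        end_full = (start + length) // PAGE_SIZE   # round end down (exclusive)
        for page in range(first_full, end_full):
            self.allocated_pages.discard(page)


def free_extents(used_blocks, total_blocks):
    """Return (start, length) runs of blocks that the file system is not using."""
    extents = []
    start = None
    for block in range(total_blocks):
        if block not in used_blocks:
            if start is None:
                start = block
        elif start is not None:
            extents.append((start, block - start))
            start = None
    if start is not None:
        extents.append((start, total_blocks - start))
    return extents


array = ThinArray()
total_blocks = 32
for block in range(total_blocks):  # the volume has been written end to end
    array.write(block)

used = set(range(10))              # the file system only needs blocks 0..9
extents = free_extents(used, total_blocks)
for start, length in extents:
    array.unmap(start, length)

print(extents)                        # [(10, 22)]
print(sorted(array.allocated_pages))  # [0, 1, 2]
```

Page 2 stays allocated because it still holds two used blocks; only pages that contain no live data are returned to the pool.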
Migration from an old system to a new system is another good example of how beneficial thin-aware file systems can be. Most migrations from one SAN array to another are essentially block-level copies from the old volume to the new volume. Block-based copies are fast and don't require as much local server processing power, which makes them an ideal method to quickly migrate data to a new system. But a block-level copy also means that ALL the blocks are copied, including blocks that are not supporting any application data. By writing to every block of a thin LUN, these copies invariably cause the new thin volume to become 'fat' (remember ‘write once, allocate forever’). When a thin aware file system is used to migrate data from a traditional thick volume to a thin volume, only the blocks that the file system knows to be supporting actual application data are copied to the new storage system. The new thin volume stays thin. This is exactly what Veritas Storage Foundation SmartMove does for online thick to thin migrations.
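The difference between the two migration styles comes down to which blocks get written on the target. The comparison below is a simplified sketch: the function names, the volume size, the used-block layout, and the page size are assumptions chosen for the example and are not how SmartMove is implemented.

```python
PAGE_SIZE = 4  # blocks per physical page on the target thin volume (illustrative)


def pages_touched(blocks):
    """Pages the thin target must allocate if these blocks are written."""
    return {block // PAGE_SIZE for block in blocks}


def block_level_copy(source_size):
    """Copy every block of the source volume, used or not."""
    return pages_touched(range(source_size))


def used_only_copy(used_blocks):
    """Copy only the blocks the file system reports as holding data."""
    return pages_touched(used_blocks)


source_size = 1000                                    # blocks on the old thick volume
used = set(range(0, 100)) | set(range(500, 520))      # blocks that hold real data

full = block_level_copy(source_size)
smart = used_only_copy(used)

print(len(full))   # 250
print(len(smart))  # 30
```

In this example the block-level copy allocates every page of the target, while the file-system-aware copy allocates only the 30 pages that back real data.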
Storage hardware vendors that offer thin provisioning are aware of this challenge. Many have no way to address it, but a few are working on solutions. Most of these involve something called "zero detect technology". Basically, these systems will have you "zero out" data segments marked for deletion by using common utilities. They will then scan and find the zeroed-out segments and reclaim that space for the global storage pool. While this technique addresses the challenges of getting a volume thin upon migration as well as keeping the volume thin as it is used, there are a few caveats.
First, most thin aware storage systems require a separate and, at this point, manual process by the storage administrator to zero-out the blocks marked for deletion. Storage managers really don't need another task added to their list of things to do.
Second, zeroing out deleted space needs to be scheduled and applied carefully, since it must be executed from the server on which the file system or volume is mounted. This means that the server has to devote some processor and disk I/O resources to the task. In a large SAN this could mean manual interactions with potentially hundreds of servers. Also, depending on the server, its disk I/O, and the amount of deleted space to be zeroed out, the task could take a significant period of time.
Third, once the zero-out task has been completed, the storage system must scan the volume to find these segments. This of course consumes processing power and I/O bandwidth on the storage controller. While some vendors have created dedicated ASICs to address this potential storage resource shortfall, most have not. Additionally, the zeroed-out segments have to be consecutively grouped so that there is enough space to actually reclaim, as some vendors have a very small segment size with which they operate.
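A minimal sketch of such a scan is shown below, assuming the array reclaims space only in whole, aligned segments. The function name and the 8-byte segment size are invented for the illustration; real arrays use far larger units and their own detection logic.

```python
def reclaimable_segments(volume: bytes, segment_size: int):
    """Return the indexes of aligned segments that contain only zero bytes."""
    zero_segment = bytes(segment_size)
    found = []
    for offset in range(0, len(volume) - segment_size + 1, segment_size):
        if volume[offset:offset + segment_size] == zero_segment:
            found.append(offset // segment_size)
    return found


# Four segments of 8 bytes each, initially full of non-zero data.
volume = bytearray(b"\x01" * 32)
volume[8:16] = bytes(8)   # segment 1 is zeroed in full
volume[20:28] = bytes(8)  # 8 zero bytes that straddle segments 2 and 3

print(reclaimable_segments(bytes(volume), 8))  # [1]
```

Both zeroed regions are the same size, but only the one that covers a complete segment can be reclaimed. The scan also has to read the entire volume, which is where the controller processing and I/O cost comes from.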
These steps are basically the same whether there is a migration action or a reclamation action. Migration may be a little worse, since much more data is coming to the new system at much faster transfer speeds.
While ‘zero detect technology’ can be used to reclaim unused space, there is little argument that thin aware file systems that can automatically coordinate with thin storage arrays provide a more efficient solution to the problem. The specific SCSI protocol enhancements required for this communication are currently making their way through ratification in T10. But for now the file system vendors need to work with the specific hardware vendors to achieve compatibility. Veritas Storage Foundation by Symantec implements a standards-based thin aware solution that has been certified with EMC Clariion, HDS's USP-V, USP-VM and AMS, IBM XIV, HP’s XP20K and 24K, 3PAR and NetApp. Storage Foundation can be used to provide thin aware host storage management on Solaris, AIX, HP-UX, Linux and Windows.
While the zero-detect process used by storage hardware manufacturers does get the job done, it is a brute-force method of accomplishing that task. By comparison, the thin aware file system approach off-loads this work from the storage processor and, in the case of thin reclamation, happens in near real-time without any manual intervention by the storage administrator.
George Crump, Senior Analyst
This Article Sponsored by Symantec