Legacy Storage in the Modern Data Center
Storage is in the midst of a transition. The legacy model serviced the fixed and somewhat static data centers of the 1990s, where excess capacity, bandwidth and other resources abounded. In the new dynamic data center model, servers consume only the resources they need at the present moment, and storage has to scale rapidly as the demands placed on it increase. Predictability is gone, and planning beyond the next few months is almost impossible. The modern data center must be able to respond rapidly to the needs of the organization, and the storage infrastructure must evolve to meet the challenges of this new reality.
Wednesday, August 18, 2010
In this series, Storage Switzerland will explore the challenges of the legacy (current) storage environment. In the second part of the series, we will explore modern storage architectures and how they address the challenges the dynamic data center faces today and will face going forward.
What is Legacy Storage?
The legacy storage system was typically a dual-controller storage array or single NAS head designed to service a database-heavy environment with some file sharing. Although classified as legacy, it can still be considered the default storage system choice. When legacy storage was new, processors were expensive and storage software wasn’t scalable. At the time it was wiser and more economical to use a single, fast processor than to use multiple slower ones. The same held true for storage controllers and NAS heads. It was simpler to design software to support a single controller or NAS head with a redundant backup than to design a system that used multiple controllers simultaneously. Even when “active-active” NAS architectures came to market, storage I/O was still hard-set to a particular controller or NAS head. More advanced systems could allow data requests to be sent to one or the other NAS head, but the environment remained single-threaded.
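The effect of hard-setting storage I/O to a particular controller can be illustrated with a minimal Python sketch. It is a toy model, not a description of any specific product: the volume names, the ownership map and the request counts are all hypothetical.

```python
from collections import Counter

# Hypothetical ownership map: each volume is hard-set to one controller.
VOLUME_OWNER = {"vol1": "A", "vol2": "A", "vol3": "B", "vol4": "B"}


def controller_load(requests, volume_owner):
    """Count the requests each controller services under fixed ownership."""
    load = Counter()
    for volume in requests:
        # A request always follows its volume's owner; it cannot be
        # redirected to a less busy controller.
        load[volume_owner[volume]] += 1
    return load


# Made-up, skewed workload: vol1 is far busier than the other volumes.
requests = ["vol1"] * 70 + ["vol2"] * 10 + ["vol3"] * 10 + ["vol4"] * 10
load = controller_load(requests, VOLUME_OWNER)
print(dict(load))
```

In this made-up workload, controller A services 80 of the 100 requests while controller B services 20. The imbalance is fixed by the ownership map rather than by how busy each controller is.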
This dual-controller design was acceptable for the era in which it was created. In the mid-90s, as shared storage slowly became more commonplace, the environment was largely static. Each server ran one application, had its own connection to shared storage and had its own section of that shared storage. Changes to a single system could be made with little concern over ramifications for the other connected systems. While users created some of their own file data, most of the data was structured in a database. File-based data was a small percentage of the overall data assets. The concept of thousands of users accessing data via a web front end was unheard of. As a result, growth was predictable. The term “shared storage” was really a misnomer; everything was segregated from everything else. The goal was to make sure that servers could not see other servers’ data.
Legacy storage systems used in this fixed, monolithic application and storage environment were complex to configure, monitor and change. However, because of the fixed nature of the environment, changes were seldom made once the initial implementation was done. When changes did occur they required a herculean effort to implement, but they were infrequent enough that the tasks could be completed. When legacy environments needed to be upgraded, it was relatively simple to just add another separate storage system. Nothing was really shared anyway. Managing two separate systems was not much more complex than managing a single system.
While incremental changes to the monolithic systems could be reasonably dealt with thanks to their singularity, this all changed when more capacity or more performance was needed and a system upgrade was required. Expanding a legacy storage system required either replacing the current system with a new one or adding another entirely separate system. Both paths were expensive, upfront and at the point of upgrade. Both also added complexity: multiple storage system upgrades needed to be planned for, and multiple systems added management overhead.
The legacy storage architecture of monolithic systems is a poor match for the data center reality of today. Storage systems need to evolve to handle a “distributed everything” world.
The Modern Storage Environment
In the modern, virtualized and collaborative data center everything needs to see everything. Changes happen minute by minute, sometimes automatically rather than driven by humans. As a consequence, the storage environment needs to be flexible enough to adapt to those changes without a major upgrade or outage. This need is driven by scale-out application processing, server virtualization and document collaboration, as well as the overall growth in file systems. The rapid pace of change, compounded by the realities of cost containment, means that maintaining simplicity in the storage system becomes a critical feature.
One of the most efficient means of simplification is to reduce the number of variables being managed to one. In this case that variable is the storage system. Developing a single system that can still scale to meet the capacity and performance needs of the data center requires a shift to an architecture that matches the rest of the data center. A scale-out model is well suited to meet this challenge.
Scale-out models are now dominant in the data center. Application processing has long since mastered the art of leveraging multiple processors and now extends across multiple servers. In the high performance computing space, dozens or more servers will process the same data simultaneously. This data is not always contained in a database; it’s often a series of files that must be processed rapidly to determine a result. Server virtualization, to be truly effective, requires that VMs sometimes be moved between physical hosts to increase availability or balance resource utilization. These examples place storage demands that are the inverse of those legacy storage was designed to address. Instead of creating walls between servers, the goal in scale-out application architectures or the virtual server environment is to have all servers in the same open space, each able to access the others’ virtual machines as needed.
Today’s knowledge workers are being asked to produce more with less, just as IT is being asked to do more with less. This means leveraging multiple people to get a specific job done, and it means work can no longer be done sequentially. As a result, collaboration between users is at an all-time high. Most software will allow simultaneous users and is able to track changes to the data as they happen. Expectations of how documents will look are higher than ever, which means that there is a need to embed more than just text into today’s documents. Images, graphics, sound and video are all now commonplace. This leads to highly active files that are significantly larger in size.
Many organizations and storage vendors are coming to the realization that the preferred storage technique for all this data is the use of file systems instead of block-based storage. In the ‘fixed’ world described earlier, where processing resources were at a premium, block storage was ideal. In this dynamic data center what’s needed is flexibility and file systems that are easier to expand, organize and protect. The extra processing that it takes to manage a file system can often be handled by storage CPU resources that are available. File systems can make not only the storage manager’s job easier but also the application developer’s. If developers can off-load the storage management responsibilities to the file system, they can concentrate on the application itself.
The use case for file systems goes beyond these new examples. Owners of traditional fixed block data sets, like databases and server images, are finding that shared file systems offer excellent performance while providing a vastly simpler environment to manage.
The Challenges To The Modern Storage Architecture
The modern data center is an unpredictable, ‘shared everything’ world where multiple users, applications and servers need simultaneous access to the same data sets, something that legacy storage, with its monolithic, single-server, single-application environment, was never designed for. This dynamic data center, where everything from applications to users and data is distributed, needs a different storage approach. The answer seems to be in the file systems themselves, which already manage a shared multi-access environment. The challenge is that the typical solution, a standard file system, still uses legacy storage architectures and so has the same problem these architectures present: the demands of file-based traffic overwhelm the systems’ performance and/or capacity capabilities.
The lack of planning time and the inability to be truly accurate in that planning, as well as the dynamic nature of the modern data center, mean that changes and upgrades occur almost continuously. Expansion has to happen almost instantly. A modern scale-out system that hosts all this file system data needs to be as flexible and dynamic as the environment it supports. This is the strength of the modern scale-out storage architecture. These systems are essentially single entities that can be expanded in an incremental, cost-effective fashion to meet the capacity and I/O needs of the environment.
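The idea of a single entity that grows incrementally can be sketched in a few lines of Python. This is an idealized model under stated assumptions: the `Node` and `ScaleOutCluster` names are hypothetical, the per-node figures are invented, and real systems do not necessarily scale linearly as nodes are added.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One storage node; the capacity and IOPS figures are illustrative."""
    capacity_tb: float
    iops: int


@dataclass
class ScaleOutCluster:
    """Hypothetical cluster that is managed as a single system."""
    nodes: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        # Expansion is incremental: one more node, same single system.
        self.nodes.append(node)

    def capacity_tb(self) -> float:
        return sum(n.capacity_tb for n in self.nodes)

    def iops(self) -> int:
        # Assumes ideal linear scaling, which is a simplification.
        return sum(n.iops for n in self.nodes)


cluster = ScaleOutCluster()
for _ in range(3):
    cluster.add_node(Node(capacity_tb=10.0, iops=5000))

before = (cluster.capacity_tb(), cluster.iops())
cluster.add_node(Node(capacity_tb=10.0, iops=5000))
after = (cluster.capacity_tb(), cluster.iops())
print(before, after)
```

Each added node raises both capacity and I/O in this model, while the administrator still manages one object. This contrasts with the legacy paths of replacing the array or adding a second, separate system.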
In Storage Switzerland's next article in this series, "What is Scale-Out Storage and What is Not Scale-Out Storage", we will further detail scale-out storage and how to be on the lookout for systems that are trying to act like scale-out storage, but don’t deliver.
George Crump, Senior Analyst
Isilon Systems is a client of Storage Switzerland