The ‘Retrofit’ SSD Architecture

Without appropriate design investment, the performance benefits of solid state storage may be wasted. This is one of the reasons that the performance of legacy storage systems that have simply added solid state devices is so often disappointing to users. While performance does improve, it doesn’t improve to the level that it should, as Storage Switzerland details in the article "SSD in Legacy Storage Systems". The reason for this poor result is that the solid state drives (SSDs) are being installed in systems that were architected around relatively slow, latency-prone mechanical hard drive technology. They’re treated essentially as fast disk drives, not as near-zero-latency, memory-based storage devices. When an SSD is placed in these systems, the bandwidth of the connection from the storage system HBA to the drive shelf, and/or the storage controller’s internal processor, can quickly become a bottleneck that impedes solid state performance. The value of these attempts by legacy storage manufacturers is that they confirm the need for an architecture designed specifically for solid state storage, not one retrofitted for it.
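To make that bottleneck concrete, here is a back-of-envelope sketch. All of the link and per-drive throughput figures are illustrative assumptions, not measurements of any specific product; the point is only that a shelf interconnect sized to feed many hard drives saturates after just a handful of SSDs:

```python
# Back-of-envelope sketch: why a retrofit drive-shelf link caps SSD throughput.
# All figures below are illustrative assumptions, not vendor measurements.

SAS_SHELF_LINK_GBPS = 24   # assumed 4-lane 6 Gb/s SAS link to the drive shelf
SSD_THROUGHPUT_GBPS = 4    # assumed ~500 MB/s per SSD (~4 Gb/s)
HDD_THROUGHPUT_GBPS = 1.2  # assumed ~150 MB/s per 15K RPM hard drive

def drives_before_saturation(link_gbps, drive_gbps):
    """How many drives the shelf link can feed at full speed."""
    return int(link_gbps // drive_gbps)

hdd_limit = drives_before_saturation(SAS_SHELF_LINK_GBPS, HDD_THROUGHPUT_GBPS)
ssd_limit = drives_before_saturation(SAS_SHELF_LINK_GBPS, SSD_THROUGHPUT_GBPS)

print(f"HDDs before the shelf link saturates: {hdd_limit}")  # 20
print(f"SSDs before the shelf link saturates: {ssd_limit}")  # 6
```

Under these assumed numbers, a shelf that comfortably feeds twenty hard drives is saturated by just six SSDs; every additional SSD adds capacity but no performance.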

The PCIe SSD Architecture

The easy approach that seems to be popular with many solid state storage manufacturers is to leverage the PCIe bus in some manner, most commonly by installing a PCIe card with onboard flash memory directly in the server. From a performance perspective this can reduce latency, since the processor has direct access to the storage through PCIe channels. The biggest benefit is that the high performance storage fits entirely within the server. As a result, the PCIe SSD is ideal for addressing ‘point’ storage performance challenges on specific application servers.

For PCIe based systems to fully deliver their impressive performance, though, the application to be accelerated must be isolated to a single server. If the performance problem can be solved by a scale-up design (meaning a single large server, or a finite number of servers, each with internal storage), then PCIe SSD is ideal. If the performance demand needs to be solved by clustering a group of compute servers together, or if the capacity requirements of the application exceed a couple of PCIe SSD cards, then an external approach is needed. Conversely, the application that’s having performance issues may have only a small capacity requirement, one for which even an entry level PCIe card is too large. This can result in unused capacity, and wasted capacity means wasted money, even more so with solid state storage than in the mechanical hard disk world. Ideally, you’d like to carve that capacity up and distribute it across multiple servers, but this can’t be done with a PCIe card locked inside a single server.

However, standalone scale-up server architectures have challenges of their own. First, the large single systems that have enough processing power and I/O capability to meet performance demands have to be paid for essentially upfront, and usually in pairs for high availability. Second, a single server may not provide enough performance, no matter how powerful it is. As stated above, this issue has led many organizations to implement scale-out compute architectures that leverage some sort of clustering mechanism to make multiple servers act as one, or at least perform different operations on the same data set in parallel. As a result, many applications are no longer confined to a single server; instead, they are increasingly spread across dozens of servers, all of which are likely to need access to the same data set at the same time. In other words, the high performance storage area needs to be shared.

While retrofit SSD architectures provide this shared access, they do so with the performance limitations mentioned above. And while PCIe architectures eliminate many of those limitations, they can’t be shared. Storage managers are looking for solid state storage architectures designed for shareability without any degradation in performance.

Gateway Architectures

Some PCIe systems achieve this sharing by connecting their solid state memory to a "gateway" type of appliance, which then manages the provisioning and sharing of the solid state storage. While this off-the-shelf approach has some appeal, it also causes performance issues, since most of these appliances are essentially full-blown servers with storage virtualization software added to them. The latency introduced by a full operating system plus the storage virtualization software can reduce overall solid state performance and make it less attractive to the storage manager who’s trying to solve a performance problem. Some vendors have tried to load up the storage virtualization appliance with PCIe SSD cards, but this design may have internal performance challenges as well as external ones in the connections to the servers. The appliance itself may not have the internal crossbar or switching architecture to move all this data in parallel to the connecting hosts.

Enterprise Architectures

Enterprise infrastructure designs created for efficient resource sharing incorporate standard networking interfaces that already exist in the data center, such as 8 Gb Fibre Channel (FC); PCIe, by contrast, is not a network. The problem is that implementing standard 8 Gb FC in a solid state storage appliance has its own set of challenges.

In theory, installing multiple 8 Gb Fibre Channel cards is relatively straightforward and could add enough bandwidth to support the maximum performance of the solid state devices involved. The challenge is making sure that all the memory inside the appliance can get to all the available I/O ports. This feat requires parallelism, so that all the 8 Gb interfaces have access to all the memory modules. Accomplishing this requires a sophisticated crossbar architecture, like the type that Texas Memory Systems uses in its solid state storage appliances.
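The bandwidth side of that math can be sketched quickly. The appliance throughput and the usable-payload efficiency below are illustrative assumptions (8 Gb FC carries encoding and protocol overhead, so usable payload is below the nominal line rate); the sketch only shows why several fully parallel ports are needed:

```python
import math

# Rough sketch of the port-count math behind a shared FC front end.
# The appliance bandwidth and efficiency figures are illustrative assumptions.

FC_PORT_GBPS = 8      # nominal 8 Gb Fibre Channel line rate
FC_EFFICIENCY = 0.8   # assume ~80% of line rate is usable payload
APPLIANCE_GBPS = 40   # assumed aggregate solid state appliance bandwidth

def ports_needed(appliance_gbps, port_gbps, efficiency):
    """FC ports required so the front end is not the bottleneck."""
    usable = port_gbps * efficiency
    return math.ceil(appliance_gbps / usable)

print(ports_needed(APPLIANCE_GBPS, FC_PORT_GBPS, FC_EFFICIENCY))  # 7
```

Under these assumptions, seven 8 Gb ports are needed just to match the appliance’s internal bandwidth, and that only helps if the crossbar lets every port reach every memory module in parallel.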

Fully utilizing a crossbar architecture usually means that off-the-shelf 8 Gb FC cards cannot be used, since they were not designed for parallel access. In short, the whole system needs to be designed from the start for high performance, massively parallel access. Companies like Texas Memory Systems custom design their 8 Gb interfaces to access and fully exploit this crossbar architecture.


Maximizing performance is critical to realizing a full return on your premium investment in solid state technology. It’s easy for some vendors to claim that performance is good enough and that this extra architectural work is wasted, but that’s simply not accurate. The better the performance, the higher the ROI can be. A higher performing system can either push an application’s performance further than previously thought possible, which can lead to quicker results, or the extra performance headroom can be leveraged across more applications than originally expected. Either way, this flexibility allows maximum value to be extracted from your solid state investment.

Texas Memory Systems is a client of Storage Switzerland.

George Crump, Senior Analyst

Enhancing Server and Desktop Virtualization with SSD Series

  Part I - Cost Justifying SSD
  Part II - Integrating SSD into a Virtual Server or DT Infrastructure

 Related Content
 Will MLC SSD Replace SLC?
 Using SSS with High Bandwidth Applications
 Solid State Storage for Bandwidth Applications
 Texas Memory Announces 8Gb FC
 SSD in Legacy Storage Systems
 Driving Down Storage Complexity with SSD
 SSD is the New Green
 SSD or Automated Tiering?
 Selecting Which SSD to Use Part III - Budget
 Selecting an SSD - Part Two
 Selecting which SSD to Use - Part One
 Pay Attention to Flash Controllers
 SSD Domination on Target
 Integrating SSD & Maintaining DR
 Visualizing SSD Readiness