There are two clustering methods commonly used in storage clusters. The first is what is known as a “managed cluster”, typified by products such as Dell EqualLogic. These solutions have the advantage of being able to deploy initially on fewer nodes, but they tend to run into scaling issues sooner. This is because there is a single control unit that all data must pass through. When a request comes into the controller from a server, that controller has to route the request to where the data resides on the cluster. At some point that controller can become a bottleneck. It also means storage controllers have to be deployed in pairs for availability.


The second form of clustering, which Scale calls “TrueClustering”, makes every component of the cluster directly accessible to the server that is requesting data. There is no single bottleneck in this design. In Scale's world, a cluster is a collection of 1U servers called “nodes”. Each of these nodes contains four hard drives, one processor and two GbE ports.


There are three available node configurations, and refreshingly, Scale reports its configurations in usable capacity. The SN1000 has 1TB usable, the SN2000 has 2TB usable and the SN4000 has 4TB usable. Very importantly, nodes of differing capacities can be mixed in the same cluster, with access to each node’s full capacity. This is a critical capability and should not be overlooked. The per-drive capacity of hard disks will continue to grow, and the storage cluster should be able to scale accordingly. The maximum supported capacity of the Scale TrueCluster today is 2,200TB, or 2.2PB.
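
As a rough illustration of the mixed-capacity point, the short Python sketch below totals the usable capacity of a cluster built from different node models. The per-model figures come from the text above; the function name and the example cluster are made up for illustration and are not part of any Scale tool.

```python
# Usable capacity per node model, in TB, as quoted in the text.
NODE_USABLE_TB = {"SN1000": 1, "SN2000": 2, "SN4000": 4}


def cluster_usable_tb(nodes):
    """Sum the usable capacity (TB) for a list of node model names.

    Assumes, as the text states, that each node's full capacity is
    available even when models are mixed in one cluster.
    """
    return sum(NODE_USABLE_TB[model] for model in nodes)


# Hypothetical cluster: three SN1000 nodes later expanded with larger nodes.
mixed_cluster = ["SN1000", "SN1000", "SN1000", "SN2000", "SN4000"]
print(cluster_usable_tb(mixed_cluster))  # 1 + 1 + 1 + 2 + 4 = 9
```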


Creating a cluster requires a minimum of three nodes, a private switch for node-to-node communication and a connection to a public switch so that servers can access the storage. To scale a storage cluster, simply plug in another node, connect it to the switches and assign it to the cluster -- capacity, performance and bandwidth increase in a linear fashion. There is also no data migration step, as there is when a more traditional storage array, or even additional capacity for such an array, is purchased. In the Scale solution, data is automatically rebalanced across the cluster as each node is added.
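
To make the rebalancing idea concrete, here is a minimal Python sketch of what "rebalanced across the cluster" could mean when a node is added. It assumes data is spread in proportion to each node's usable capacity; that policy, the function name and the node names are assumptions made for illustration, since the briefing does not describe Scale's actual algorithm.

```python
def target_placement(stored_tb, node_capacity_tb):
    """Return the amount of data (TB) each node would hold.

    Assumption: data is spread in proportion to each node's usable
    capacity. This is an illustrative policy, not Scale's documented one.
    """
    total = sum(node_capacity_tb.values())
    return {
        name: stored_tb * capacity / total
        for name, capacity in node_capacity_tb.items()
    }


# Hypothetical three-node cluster of 1TB nodes holding 1.5TB of data.
before = target_placement(1.5, {"node1": 1, "node2": 1, "node3": 1})

# Adding a fourth node lowers every node's share; no manual migration
# step is modelled, only the new target layout.
after = target_placement(1.5, {"node1": 1, "node2": 1, "node3": 1, "node4": 1})

print(before)
print(after)
```

In this toy model each original node gives up part of its share to the new node, which is the behaviour the text describes at a high level.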

George Crump, Senior Analyst

Briefing Report

If one node or a drive in a node fails, the initiator is automatically and seamlessly directed to another node. As data is stored on the cluster it is segmented into blocks and those blocks are then striped across the current node and mirrored to another node. As a result, there is no single point of failure. If a drive or node fails, data is automatically rebalanced across the cluster to ‘re-protect’ itself.
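
The stripe-and-mirror layout described above can be sketched in a few lines of Python. This is only a toy model: the block size, the round-robin choice of drive and of mirror node, and all names are assumptions made for illustration, not details taken from Scale.

```python
def place_blocks(data, block_size, current_node, all_nodes, drives_per_node=4):
    """Segment data into blocks, stripe them across the drives of
    current_node, and assign each block a mirror on a different node.

    Round-robin drive and mirror selection are illustrative assumptions.
    """
    others = [node for node in all_nodes if node != current_node]
    if not others:
        raise ValueError("mirroring needs at least one other node")

    placements = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]
        placements.append({
            "block": index,
            "size": len(block),
            # Stripe: consecutive blocks land on consecutive drives.
            "primary": (current_node, index % drives_per_node),
            # Mirror: a second copy always lives on another node.
            "mirror": others[index % len(others)],
        })
    return placements


layout = place_blocks(b"x" * 10, 4, "n1", ["n1", "n2", "n3"])
for entry in layout:
    print(entry)
```

Because every block has a copy on a second node, losing any one drive or node in this model still leaves one readable copy of each block.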


By using a RAID 10-style data protection scheme, Scale avoids the increasingly cumbersome issues related to RAID 5 or RAID 6 rebuilds. New nodes or drives are repopulated quickly, returning the cluster to full protection sooner than traditional storage arrays can.


The cost of a 3TB system is less than $12,000, a 6TB system is less than $15,000 and a 12TB system is less than $21,000. At these price points, the systems are well suited to server virtualization environments, where shared storage brings significant value to the virtualization investment. They also fit as an archive disk store, where less frequently accessed data is moved from primary storage to a scalable secondary storage repository.
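
For a quick comparison of the three price points, the Python sketch below works out the cost per usable terabyte. Because each price is quoted as "less than", the results are upper bounds rather than actual prices; the variable and function names are illustrative.

```python
# Quoted price ceilings in US dollars, keyed by usable capacity in TB.
PRICE_CEILING_USD = {3: 12_000, 6: 15_000, 12: 21_000}


def max_cost_per_tb(capacity_tb):
    """Upper bound on cost per usable TB for a quoted system size."""
    return PRICE_CEILING_USD[capacity_tb] / capacity_tb


for capacity_tb in PRICE_CEILING_USD:
    print(f"{capacity_tb}TB: under ${max_cost_per_tb(capacity_tb):,.0f} per TB")
# 3TB: under $4,000 per TB
# 6TB: under $2,500 per TB
# 12TB: under $1,750 per TB
```

On these figures, the ceiling on cost per usable terabyte falls by more than half between the smallest and largest systems.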