Automated tiering key to getting value from SSDs
30 October, 2009 11:43
Though flash storage could be the most powerful tool yet for IT administrators who want to speed up access to frequently used data, reaping its benefits may require automation software that has just begun to emerge from the major storage vendors.
Flash storage devices such as SSDs (solid-state disks) can bring up a given bit of data faster than HDDs (hard disk drives) because they can get to it without spinning a disk. Though much faster than HDDs at reading, they tend to offer less advantage in writing data, and all this comes at a much greater per-bit cost. So solid state is not expected to replace spinning disks, but to sit alongside them and handle only certain kinds of data.
There are two main ways to use flash, depending on an organization's needs. Both take up less space and energy than other arrangements for fast storage, such as spreading a relatively small amount of data across several HDDs to shorten access times.
Inserted directly in a server, flash can form a second tier of data cache below DRAM, an arrangement that will automatically hold on to the most accessed bits until use of those bits declines and they go down to disk storage. In the form of SSDs, flash becomes the top layer of the permanent storage system.
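As a rough illustration of the cache arrangement, the toy model below places a small DRAM tier above a larger flash tier, with disk underneath. The least-recently-used eviction policy, the class name `TwoTierCache` and the slot counts are assumptions made for the sketch, not details of any vendor's product.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy read cache: a small DRAM tier above a larger flash tier, disk below both."""

    def __init__(self, dram_slots, flash_slots, disk):
        self.dram = OrderedDict()    # fastest, smallest tier
        self.flash = OrderedDict()   # second tier of cache, below DRAM
        self.dram_slots = dram_slots
        self.flash_slots = flash_slots
        self.disk = disk             # plain dict standing in for HDD storage

    def read(self, key):
        """Return (value, name of the tier that served the read)."""
        if key in self.dram:
            self.dram.move_to_end(key)       # mark as most recently used
            return self.dram[key], "dram"
        if key in self.flash:
            value = self.flash.pop(key)      # promote from flash to DRAM
            served_by = "flash"
        else:
            value = self.disk[key]           # slowest path
            served_by = "disk"
        self._put_dram(key, value)
        return value, served_by

    def _put_dram(self, key, value):
        self.dram[key] = value
        if len(self.dram) > self.dram_slots:
            old_key, old_value = self.dram.popitem(last=False)
            self._put_flash(old_key, old_value)   # demote to the flash tier

    def _put_flash(self, key, value):
        self.flash[key] = value
        if len(self.flash) > self.flash_slots:
            # Read-only model: disk already holds the data, so the
            # least recently used entry is simply dropped from flash.
            self.flash.popitem(last=False)


disk = {"a": 1, "b": 2, "c": 3, "d": 4}
cache = TwoTierCache(dram_slots=1, flash_slots=2, disk=disk)
for key in ["a", "b", "a", "a", "c", "d", "b"]:
    print(key, cache.read(key)[1])
```

In this run the second read of "a" comes from flash and the third from DRAM, while "b" has to be fetched from disk again at the end because its use declined and it fell out of both cache tiers.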
SSDs are worth it if they are dedicated to data that's read frequently, such as information in a database or popular multimedia content. But what's most often used and what's currently popular can change over time, and not all the data in a particular LUN (logical unit number) may qualify. So storage vendors are working on how to find the most active, or "hot," data and move it into the flash tier of the data center.
IBM officials last week disclosed some details of that company's plans in the area. It's developing a system called Automatic Data Relocation, which can identify the more active parts of a LUN and move them to flash storage, while assigning other data to HDDs. Enterprise policies will also come into play in those decisions. The software should become available for IBM's DS8000 storage array in the first half of next year and for its SAN Volume Controller platform in the second half, according to Chris Saul, IBM's marketing manager for storage virtualization.
EMC is developing its own automated tiering system, FAST (Fully Automated Storage Tiering), which will go on sale later this year and gain sub-LUN parsing capability in the middle of next year. Compellent Technologies was a pioneer in automating data movement and already has a system with sub-LUN capability.
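The vendors have not published how their software picks hot data, so the following is only a minimal sketch of the general idea of sub-LUN tiering: divide each LUN into fixed-size extents, count reads per extent, and assign the busiest extents to flash. The extent size, the function names and the sample read log are all hypothetical.

```python
from collections import Counter

EXTENT_SIZE = 1024  # blocks per extent; a hypothetical value


def count_extent_reads(read_log):
    """Count reads per (lun_id, extent_index) from (lun_id, block_address) pairs."""
    heat = Counter()
    for lun_id, block in read_log:
        heat[(lun_id, block // EXTENT_SIZE)] += 1
    return heat


def plan_placement(heat, flash_extents):
    """Assign the `flash_extents` most-read extents to flash and the rest to HDD."""
    # Sort by read count, highest first; ties are broken by extent id.
    ranked = sorted(heat, key=lambda extent: (-heat[extent], extent))
    hot = set(ranked[:flash_extents])
    return {extent: ("flash" if extent in hot else "hdd") for extent in ranked}


# Made-up read log: two LUNs, each with one busy region and one quiet one.
read_log = (
    [("lun0", 10)] * 50
    + [("lun0", 5000)] * 3
    + [("lun1", 2048)] * 20
    + [("lun1", 100)] * 1
)
plan = plan_placement(count_extent_reads(read_log), flash_extents=2)
for extent, tier in plan.items():
    print(extent, tier)
```

Here only one extent from each LUN lands on flash; the quiet extents of the same LUNs stay on HDD. A real system would also have to weigh enterprise policies and re-evaluate the plan as access patterns change, which this sketch leaves out.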
Enterprises have three main options for using flash, according to analyst Andrew Reichman of Forrester Research. Using it as cache is best suited to enterprises that depend on very fast performance, such as stock exchanges, he said. Basic prioritization is automatic, and most storage vendors also offer tools for "pinning" certain types of data to flash so it doesn't automatically get purged from the cache.
Other organizations may install SSDs for their storage arrays and seek out or write better analytical tools to identify the most appropriate files to put there, Reichman said. However, not many IT departments have the spare resources to do it on their own, he said.
"Storage environments are overworked and understaffed as it is, and the likelihood of adding more things to do is slim," he said.
One enterprise that believes it has the smarts to take that path is MySpace. The company is currently using flash as cache, because speed is of the essence on the media-oriented social-networking service, according to Richard Buckingham, vice president of technical operations at MySpace. But it might expand that to SSDs in the future. Buckingham said MySpace has the technical expertise in house to develop its own automated tiering software, while few other organizations do.
With a tool such as FAST or Automatic Data Relocation, most of that effort is saved.
"The front-runner of those choices seems to be automated data movement," Forrester's Reichman said. It's crucial because it will let the average enterprise achieve the efficiency and savings it is after with flash, he said. In fact, he believes the wait for more automation software is holding back demand for flash storage.
"Auto tiering is the way to go," said analyst Henry Baltazar of The 451 Group. "It's too much of a pain to micromanage all those different workloads." He believes applications themselves will eventually do much of the work of assigning data to different tiers, but first there need to be standards across the industry so they can talk to different vendors' infrastructure, he said. The Storage Networking Industry Association is one organization working toward such standards. For now, it's up to individual vendors, he said.
In any automatic system, sub-LUN capability will be crucial, the analysts said. Because SSDs are much more expensive than HDDs, per bit, it makes no sense to store an entire database on flash just because some of it is frequently accessed.
"Putting a whole LUN on an SSD means that you really have the likelihood of wasting it," Reichman said. "It still is a 10X cost differential."
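The arithmetic behind that point can be worked through with a few assumed numbers. The 10X ratio comes from Reichman's quote; the 1,000GB LUN, the 10 percent hot share and the HDD price of one cost unit per gigabyte are made up for illustration.

```python
def storage_cost(lun_gb, hot_percent, hdd_cost_per_gb=1.0, ssd_multiplier=10.0):
    """Compare the cost of three placements for one LUN, in arbitrary cost units."""
    ssd_cost_per_gb = hdd_cost_per_gb * ssd_multiplier
    hot_gb = lun_gb * hot_percent / 100   # the frequently accessed share
    cold_gb = lun_gb - hot_gb
    return {
        "all_hdd": lun_gb * hdd_cost_per_gb,
        "whole_lun_on_ssd": lun_gb * ssd_cost_per_gb,
        "sub_lun_tiering": hot_gb * ssd_cost_per_gb + cold_gb * hdd_cost_per_gb,
    }


costs = storage_cost(lun_gb=1000, hot_percent=10)
for option, cost in costs.items():
    print(f"{option}: {cost:.0f}")
# all_hdd: 1000
# whole_lun_on_ssd: 10000
# sub_lun_tiering: 1900
```

Under these assumptions, putting only the hot tenth of the LUN on flash costs less than twice as much as an all-HDD layout, while putting the whole LUN on SSD costs 10 times as much.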