EMC tackles big data with Greenplum appliance

EMC has wasted no time packaging its Greenplum acquisition into a data warehouse appliance

Taking aim at the growing problem of big data management, EMC has released a data warehouse appliance tuned to ingest large volumes of data quickly.

The Greenplum Data Computing Appliance takes advantage of an MPP (massively parallel processing) technology developed by Greenplum, a firm acquired by EMC in July.

EMC claims the appliance can ingest data twice as quickly as competing products, which EMC identified as Oracle Exadata, IBM Netezza and Teradata's enterprise data warehouse offering. A single rack can ingest 10 terabytes per hour, the company claims.

The appliance will be marketed to organizations trying to derive intelligence from large amounts of incoming data, said Scott Yara, who was the president of Greenplum and is now vice president of products at EMC.

"Machines sitting on the network or on the Web are generating much more data than humans ever could. All the mobile phones, sensor networks and routers are pouring off millions of events each day," he said. In order to make sense of this input, "businesses are forced to create all this data analysis infrastructure that they never had to before."

To ingest data more quickly, Greenplum adopted a parallel processing architecture long used by the high performance computing community.

Most data warehouse appliances have a single master node through which all data must enter, Yara explained. This approach can be a bottleneck when trying to import large amounts of data quickly. In the MPP approach, each server on a rack gets a dedicated Ethernet connection.
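
As a rough illustration of the difference, the Python sketch below splits an incoming batch among several loaders that work concurrently, rather than funnelling every row through one entry point. The names (`segment_for`, `route_chunk`, `parallel_load`), the CRC32-based routing and the thread pool are assumptions made for illustration; the article does not describe how Greenplum actually assigns rows to servers.

```python
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_SEGMENTS = 4  # hypothetical number of servers receiving data


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Pick a target segment with a stable hash (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_segments


def route_chunk(chunk, num_segments=NUM_SEGMENTS):
    """One loader: bucket its own slice of the input by target segment."""
    buckets = [[] for _ in range(num_segments)]
    for key, value in chunk:
        buckets[segment_for(key, num_segments)].append((key, value))
    return buckets


def parallel_load(rows, num_loaders=4, num_segments=NUM_SEGMENTS):
    """Split the input among several loaders instead of one master node."""
    chunks = [rows[i::num_loaders] for i in range(num_loaders)]
    segments = [[] for _ in range(num_segments)]
    with ThreadPoolExecutor(max_workers=num_loaders) as pool:
        # Each loader routes its slice independently; results are merged here.
        for buckets in pool.map(lambda c: route_chunk(c, num_segments), chunks):
            for seg_id, bucket in enumerate(buckets):
                segments[seg_id].extend(bucket)
    return segments


rows = [(f"event-{i}", i) for i in range(1000)]
segments = parallel_load(rows)
for seg_id, seg in enumerate(segments):
    print(f"segment {seg_id}: {len(seg)} rows")
```

Hash routing is used here only because it spreads rows fairly evenly without any central bookkeeping; a real appliance would also have to handle network transfer, failures and rebalancing.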

"Instead of loading the data into one system and trying to distribute it, [the Greenplum architecture] loads the data in parallel to all the servers in a cluster," Yara said. In a peer-to-peer fashion, the servers coordinate amongst themselves to balance the data across all the nodes.

The MPP architecture also allows the data analysis to be executed in parallel across the servers. "You can break a single query up across all the machines," Yara said.
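
The idea of breaking one query up across machines can be sketched as a scatter-gather aggregate: each server computes a partial result over its own rows, and the partial results are then combined. The Python below is a minimal sketch under that assumption; `partial_aggregate` and `parallel_average` are hypothetical names, the sample data is invented for the example, and none of this is taken from the Greenplum Database query planner.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(segment_rows):
    """Per-segment part of the query: a local row count and a local sum."""
    values = [value for _, value in segment_rows]
    return len(values), sum(values)


def parallel_average(segments):
    """Send the same query to every segment, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(segments))) as pool:
        partials = list(pool.map(partial_aggregate, segments))
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    # An average cannot be computed from per-segment averages alone,
    # so each segment returns (count, sum) and the division happens once.
    return total_sum / total_count if total_count else None


# Invented sample data: three segments, one of them empty.
sample_segments = [[("a", 1), ("b", 2)], [("c", 3)], []]
print(parallel_average(sample_segments))  # prints 2.0
```

Returning a count and a sum from each segment, rather than a local average, is what keeps the combined answer correct when segments hold different numbers of rows.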

The Greenplum Data Computing Appliance, available now, offers database software (Greenplum Database 4.0) preloaded on an integrated set of servers, along with storage and networking. A single rack would have 16 servers, each running two Intel E5670 hexacore processors. The appliance can be purchased as a half-rack, a single rack, or in a multiple rack configuration. Each rack could hold up to 36 terabytes of uncompressed storage, or up to 5 petabytes compressed across 24 racks. A 24-rack system could run a total of 4,608 database cores.
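
The core count follows directly from the configuration figures quoted above, and the arithmetic can be checked with a few lines of Python. The compression ratio computed at the end is only what the quoted capacities imply, assuming 1 petabyte equals 1,000 terabytes; it is not a figure stated by EMC.

```python
# Figures quoted in the article.
SERVERS_PER_RACK = 16
PROCESSORS_PER_SERVER = 2
CORES_PER_PROCESSOR = 6        # "hexacore"
MAX_RACKS = 24
UNCOMPRESSED_TB_PER_RACK = 36
COMPRESSED_PB_AT_MAX_RACKS = 5

cores_per_rack = SERVERS_PER_RACK * PROCESSORS_PER_SERVER * CORES_PER_PROCESSOR
total_cores = cores_per_rack * MAX_RACKS
uncompressed_tb_total = UNCOMPRESSED_TB_PER_RACK * MAX_RACKS

# Assumption: decimal units, 1 PB = 1,000 TB.
implied_compression_ratio = COMPRESSED_PB_AT_MAX_RACKS * 1000 / uncompressed_tb_total

print(cores_per_rack)                       # 192
print(total_cores)                          # 4608
print(uncompressed_tb_total)                # 864
print(round(implied_compression_ratio, 1))  # 5.8
```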

The appliance form factor offers a number of advantages, the company claims. It can be coupled with EMC's Data Domain backup and recovery software, which would allow the data warehouse material to be backed up to EMC SANs (storage area networks), as well as allow the appliance to use the SAN for additional storage.

Also, EMC's RecoverPoint software could, if needed, populate a second data warehouse with the data from the SAN.

"That next step will be a huge differentiator in my book," said Steve Hirsch, chief data officer and senior vice president of Global Data Services at New York Stock Exchange Euronext, at a launch event held in New York on Wednesday. He noted that today most organizations have to make a full second working instance of the data warehouse for backup purposes, the maintenance of which can require a lot of personnel and hardware resources.

NYSE Euronext has used the Greenplum software since 2007. The organization's internal operations generate about four terabytes of data each day, and the Greenplum database is used to derive performance metrics from some of this data.

"For us it is very expensive to move data around to analyze. We need to load it once, analyze it there and make that data available," Hirsch said.

With this release, EMC also announced that it has created a new division, called the Data Computing Products Division, which will concentrate on data management software, such as that of Greenplum.

Tags: applications, software, data mining, business intelligence, data warehousing, EMC

Joab Jackson

IDG News Service
