TPC takes the measure of big data systems

The fresh TPCx-HS benchmark could provide an apples-to-apples comparison of commercial Hadoop systems

Comparing commercial Hadoop-based big data analysis systems might get a little easier, thanks to a new benchmark from the Transaction Processing Performance Council (TPC).

The TPCx-HS benchmark, posted Monday, offers a performance assessment of Hadoop-based systems.

"There has been a lot of push from our customers for a standard to objectively measure performance and price performance of big data systems," said Raghunath Nambiar, who is the chairman of the TPCx-HS committee, as well as a distinguished engineer at Cisco.

The worldwide IT market for big data-styled analysis should swell to over US$240 billion by 2016, according to IDC. Companies such as IBM and Hewlett-Packard are offering prepackaged systems running Hadoop, currently the most popular of the big data systems being tested and used within the enterprise.

Today, vendors may offer performance metrics for their Hadoop systems, but each company uses its own benchmark, making it difficult for customers to compare systems.

TPC hopes that Hadoop system vendors will run its benchmark against their own systems, allowing potential customers to directly compare the price performance across different offerings.

TPCx-HS "defines a level playing field. The number you get from vendor X can be fairly compared to the number from vendor Y," Nambiar said.

A benchmark kit, which can be downloaded from the TPC site, tests overall performance of a Hadoop system. It includes the specification and user documentation, as well as scripts to run the benchmark code and a Java program to execute the benchmark load.

The benchmark itself measures how quickly an Apache Hadoop system organizes data with the widely used TeraSort sorting algorithm. Vendors can tune their systems either by optimizing the software through various means, or by running the fastest hardware available.
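As a rough illustration of what a sort benchmark times, the following sketch generates fixed-size records in memory, sorts them by key and reports the elapsed time. It is a single-process toy, not the TPCx-HS kit or Hadoop's own code. The 100-byte record with a 10-byte key is an assumption based on the usual TeraSort record layout, and the function names are hypothetical.

```python
import os
import time

KEY_BYTES = 10       # assumed key size, following the usual TeraSort record layout
RECORD_BYTES = 100   # assumed total record size


def generate_records(count):
    """Return `count` random records of RECORD_BYTES bytes each (hypothetical helper)."""
    return [os.urandom(RECORD_BYTES) for _ in range(count)]


def timed_sort(records):
    """Sort records by their leading key bytes and return (sorted_records, seconds)."""
    start = time.perf_counter()
    ordered = sorted(records, key=lambda record: record[:KEY_BYTES])
    elapsed = time.perf_counter() - start
    return ordered, elapsed


if __name__ == "__main__":
    ordered, seconds = timed_sort(generate_records(100_000))
    print(f"sorted {len(ordered)} records in {seconds:.3f} s")
```

A real run distributes both the data generation and the sort across a cluster, so the elapsed time reflects storage, network and scheduling as well as raw compute.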

Using the benchmark, a tester can choose one of a number of different-sized machine-generated data sets, currently ranging from a single terabyte to 10,000 terabytes.

The benchmark provides a score for overall performance, as well as a price-performance score that specifies how much performance the system delivers relative to its cost. A third optional test measures the energy efficiency of the system.
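To make the relationship between the two scores concrete, here is a minimal sketch. It assumes performance is expressed as terabytes sorted per hour and price performance as system cost divided by that figure. Both definitions, and the function names, are illustrative assumptions; the official formulas are defined in the TPCx-HS specification.

```python
def throughput_per_hour(terabytes_sorted, elapsed_seconds):
    """Assumed performance score: terabytes sorted per hour of elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return terabytes_sorted / (elapsed_seconds / 3600.0)


def price_performance(system_cost, throughput):
    """Assumed price-performance score: cost per unit of throughput (lower is better)."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return system_cost / throughput


if __name__ == "__main__":
    # Made-up numbers: a 10 TB data set sorted in 30 minutes on a $500,000 system.
    score = throughput_per_hour(10, 1800)
    print(score)                              # 20.0
    print(price_performance(500_000, score))  # 25000.0
```

Under these assumptions, a cheaper system with a lower raw score can still come out ahead on price performance, which is why the two numbers are reported separately.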

The test must be conducted twice, according to the TPC rules, and the slower of the two runs is the official benchmark speed. Published TPC results can be challenged by other parties within 60 days.
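The two-run rule can be sketched in a few lines. This assumes each run is summarized by a single elapsed time in seconds; the function name is hypothetical and not part of the benchmark kit.

```python
def official_elapsed_seconds(run_times):
    """Return the reported time for a pair of runs: the slower (larger) one."""
    if len(run_times) != 2:
        raise ValueError("exactly two runs are required")
    return max(run_times)


if __name__ == "__main__":
    # Made-up timings for two consecutive runs of the same configuration.
    print(official_elapsed_seconds([1712.4, 1800.0]))  # 1800.0
```

Reporting the slower run means a vendor cannot publish a single lucky result; the system has to reproduce its performance.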

As with all of its benchmarks, TPC requires that all official testing be done by a third party. With the big data benchmark, this can be done through an independent auditor or, more informally, a peer audit, which probably would not be as costly.

Founded in 1988, the TPC is a nonprofit corporation that provides vendor-neutral benchmarks for testing the performance of transaction processing and database systems.

Although the organization started with the intent of producing benchmarks for transactional database systems, it has, in recent years, been expanding to cover other computational systems as well. In 2012, it published a benchmark for virtualization software.

Companies such as Dell, Cisco, IBM, Hewlett-Packard, Oracle, Unisys, Intel and Microsoft are members of the TPC, as are Hadoop software vendors Cloudera, Pivotal and Red Hat.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
