Bringing some enterprise rigor to the wild world of big data, Hewlett-Packard has issued a package that will allow organizations to harness HP's Vertica analytical database engine to investigate reams of unstructured data residing in Hadoop systems.
"We're marrying the best of both worlds," said Jeff Healey, director of product marketing for HP's big data platform products. "We're bringing a proven SQL engine into the Hadoop infrastructure. You can view and explore Hadoop data without learning a whole bunch of new skills."
The HP Vertica for SQL on Hadoop package can work with the Hortonworks, MapR and Cloudera Hadoop distributions and with basic Apache Hadoop systems that may have been developed in house.
HP found that about half of its Vertica customer base was also interested in using Hadoop, Healey said. In effect, this package will allow customers to use Vertica as a front end for analyzing data stored in Hadoop deployments.
A column-oriented database system, HP Vertica Analytics Platform was designed to quickly execute large-scale analysis jobs. It can field queries written in standard SQL (Structured Query Language), which is familiar to most database administrators and understood by third-party business intelligence tools.
The Vertica engine "is very reliable and can provide the stability that enterprise customers are looking for today," said Ignacio Hwang, HP senior product manager.
Part of the package is HP's Flex Zone, which can be used to explore unstructured data with SQL without first going through the work of applying schemas to the information. Customers can also use the package to import selected data from Hadoop for faster analysis within Vertica itself.
HP will face plenty of competition in providing SQL for Hadoop. Other software that addresses this need includes the open-source Apache Hive, Cloudera's Impala (also open source), IBM's BigSQL and Pivotal's Hawq, among others.
One early user of the technology is the Snagajob employment service, which uses the software to help its 1 million daily Web visitors sort through the 400,000 job listings that the company routinely keeps posted.
The HP Vertica Analytics Platform is available now. The company did not divulge specific prices, saying only that the software would be priced on a per-node basis.