Hadoop gets native R programming for big data analysis

Revolution Analytics has released a plug-in for running R analytics on Hadoop data sets

Sensing a growing interest in big data-style analysis, software provider Revolution Analytics has updated its flagship package of R statistical functions so it can be run with the Hadoop data processing platform.

Revolution R Enterprise 7 (RRE 7), to be made available on Monday, can also run R within Teradata databases.

The R language provides a way to run common statistical tests -- such as linear and nonlinear modelling, time-series analysis, classification, and clustering -- on a set of data, often portraying the results in graphical form.

R is becoming increasingly popular for sophisticated data analysis that goes beyond what can be offered by more standard business intelligence (BI) packages. Revolution Analytics has estimated that over 2 million people use R worldwide.

RRE 7 includes a library of R algorithms that can be run in parallel across multiple nodes, mirroring the way Hadoop spreads large data sets across a cluster. RRE 7 can be added to the Cloudera CDH3 and CDH4 Hadoop distributions as well as Hortonworks Data Platform 1.3.

The new R library includes the most commonly used statistical and predictive analytics algorithms for tasks such as data processing, data sampling, descriptive statistics, statistical tests, data visualization, simulation, machine learning and predictive models.

By analyzing the data within the node in which it resides, rather than moving it somewhere else to be analyzed, R-based data analysis can be done more quickly, according to Revolution Analytics. It also allows an entire set of data to be analyzed, rather than a subset or summary of the data, which is the approach typically taken with enterprise data warehouses (EDWs).
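The idea of analyzing data where it resides can be illustrated with a minimal sketch. It is written in Python rather than R, and it does not use or represent the RRE 7 API. The names `PartialStats`, `summarize_partition` and `combine_partials` are hypothetical. Each node reduces its own partition to a small summary, and only those summaries travel across the network to be combined into statistics for the entire data set.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class PartialStats:
    """Small summary computed on the node that holds a partition (hypothetical type)."""
    count: int
    total: float
    total_sq: float


def summarize_partition(values: Iterable[float]) -> PartialStats:
    """Local step: scan one partition once and keep only three numbers."""
    count, total, total_sq = 0, 0.0, 0.0
    for value in values:
        count += 1
        total += value
        total_sq += value * value
    return PartialStats(count, total, total_sq)


def combine_partials(parts: List[PartialStats]) -> Tuple[float, float]:
    """Global step: merge the per-node summaries into a mean and sample variance."""
    count = sum(p.count for p in parts)
    if count < 2:
        raise ValueError("need at least two observations")
    total = sum(p.total for p in parts)
    total_sq = sum(p.total_sq for p in parts)
    mean = total / count
    # Sum-of-squares form: simple, but not numerically robust for huge values.
    variance = (total_sq - count * mean * mean) / (count - 1)
    return mean, variance


# Illustrative data only: three partitions standing in for three nodes.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
summaries = [summarize_partition(p) for p in partitions]
mean, variance = combine_partials(summaries)
```

Because every row contributes to the summaries, the result covers the full data set rather than a sample, yet only three numbers per node are moved.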

Revolution Analytics hopes the incorporation of R within Hadoop and the Teradata databases will also broaden the use of the language to line-of-business managers. The company has designed a new workflow interface that does not require knowledge of how to implement specific R algorithms. This eliminates the hassle of coding R with Java, or some other language, in order to have it run on the Hadoop platform.

In addition to supporting these new platforms, RRE 7 also features a number of new algorithms and processes. One is a collection of models for setting up Decision Forests, a machine learning technique for predicting future outcomes. A new batch of Stepwise Regression functions can help automate the process of selecting the most important variables to be used in a predictive model. A new Decision Tree visualization provides a graphical way to depict complex relationships and correlations within a set of data.
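For readers unfamiliar with stepwise regression, the following is a rough sketch of one common variant, forward selection. It is written in Python and is not Revolution Analytics' implementation. The function names, the sample data and the 1 percent stopping threshold are all assumptions made for illustration. Starting from an intercept-only model, it repeatedly adds the variable that most reduces the residual sum of squares, and stops when the gain becomes negligible.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting (fails on collinear inputs)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_sum_of_squares(columns: List[Sequence[float]], y: Sequence[float]) -> float:
    """Fit least squares with an intercept via the normal equations; return the RSS."""
    design = [[1.0] * len(y)] + [list(c) for c in columns]
    k = len(design)
    xtx = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(design[i], y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    rss = 0.0
    for row in range(len(y)):
        predicted = sum(beta[i] * design[i][row] for i in range(k))
        rss += (y[row] - predicted) ** 2
    return rss


def forward_stepwise(features: Dict[str, Sequence[float]], y: Sequence[float],
                     min_improvement: float = 0.01) -> List[str]:
    """Greedily add variables until the RSS gain drops below a share of the total."""
    selected: List[str] = []
    remaining = list(features)
    baseline = residual_sum_of_squares([], y)  # intercept-only model
    best_rss = baseline
    while remaining:
        candidates = [
            (residual_sum_of_squares([features[n] for n in selected + [name]], y), name)
            for name in remaining
        ]
        rss, name = min(candidates)
        if best_rss - rss <= min_improvement * baseline:
            break  # assumed stopping rule: gain under 1% of the baseline RSS
        selected.append(name)
        remaining.remove(name)
        best_rss = rss
    return selected


# Made-up data: y follows x1 closely, while x2 only tracks the small noise.
features = {"x1": [1, 2, 3, 4, 5, 6], "x2": [1, 0, 1, 0, 1, 0]}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
chosen = forward_stepwise(features, y)
```

Production tools generally rely on statistical criteria such as AIC or significance tests instead of a fixed threshold, and on more stable numerical methods than the normal equations used here.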

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
