Hadoop gets native R programming for big data analysis

Revolution Analytics has released a plug-in for running R analytics on Hadoop data sets

Sensing a growing interest in big data-style analysis, software provider Revolution Analytics has updated its flagship package of R statistical functions so it can be run with the Hadoop data processing platform.

Revolution R Enterprise 7 (RRE 7), to be made available on Monday, can also run R within Teradata databases.

The R language provides a way to run common statistical tests -- such as linear and nonlinear modelling, time-series analysis, classification, and clustering -- on a set of data, often portraying the results in graphical form.

R is becoming increasingly popular for sophisticated data analysis that goes beyond what can be offered by more standard business intelligence (BI) packages. Revolution Analytics has estimated that over 2 million people use R worldwide.

RRE 7 includes a library of R algorithms that can be run in parallel across multiple nodes, which is how Hadoop manages large data sets. RRE 7 can be added to the Cloudera CDH3 and CDH4 Hadoop distributions as well as Hortonworks Data Platform 1.3.

The new R library includes the most commonly used statistical and predictive analytics algorithms for tasks such as data processing, data sampling, descriptive statistics, statistical tests, data visualization, simulation, machine learning and predictive models.

By analyzing the data within the node in which it resides, rather than moving it somewhere else to be analyzed, R-based data analysis can be done more quickly, according to Revolution Analytics. It also allows an entire set of data to be analyzed, rather than a subset or summary of the data, which is the approach typically taken with enterprise data warehouses (EDWs).
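To illustrate the data-locality idea in general terms, the following Python sketch computes a mean and variance by having each partition (standing in for a Hadoop node) produce a small summary, then combining only those summaries. It is not RRE 7 code and does not reflect its API. The function names `summarize`, `combine` and `mean_and_variance` are hypothetical, and the in-memory lists are an assumption standing in for data blocks stored on separate nodes.

```python
from functools import reduce


def summarize(partition):
    """Map step: runs where the data lives and returns only a tiny summary.

    The summary is (count, sum, sum of squares), so the raw values never
    have to leave the node that holds them.
    """
    n = len(partition)
    s = sum(partition)
    ss = sum(x * x for x in partition)
    return (n, s, ss)


def combine(a, b):
    """Reduce step: merge two summaries by adding them element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(partitions):
    """Mean and sample variance of the full data set, built from summaries.

    Assumes at least two values in total across all partitions.
    """
    n, s, ss = reduce(combine, map(summarize, partitions), (0, 0.0, 0.0))
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)
    return mean, variance


# Hypothetical data: three "nodes", each holding part of one data set.
nodes = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = mean_and_variance(nodes)
```

Every value contributes to the result, yet only three numbers per partition cross the network, which is the trade-off the paragraph above describes. The sum-of-squares formula is used here for brevity; it can lose precision on large or poorly scaled data, so a production system would likely use a more stable update.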

Revolution Analytics hopes the incorporation of R within Hadoop and the Teradata databases will also broaden the use of the language to line-of-business managers. The company has designed a new workflow interface that does not require knowledge of how to implement specific R algorithms. This eliminates the hassle of coding R with Java, or some other language, in order to have it run on the Hadoop platform.

In addition to supporting these new platforms, RRE 7 also features a number of new algorithms and processes. One is a collection of models for setting up Decision Forests, a machine learning technique for predicting future outcomes. A new batch of Stepwise Regression functions can help automate the process of selecting the most important variables to be used in a predictive model. A new Decision Tree visualization provides a graphical way to depict complex relationships and correlations within a set of data.
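For readers unfamiliar with stepwise regression, the sketch below shows one common variant, forward selection, in plain Python. It is a generic illustration under stated assumptions, not the RRE 7 implementation: the names `solve`, `rss` and `forward_stepwise`, the `min_gain` stopping rule and the sample data are all hypothetical, and the candidate variables are assumed not to be perfectly collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
        for row, yi in zip(design, y)
    )


def forward_stepwise(candidates, y, min_gain=0.05):
    """Greedily add the variable that lowers the RSS most.

    candidates maps a variable name to its list of values. Selection stops
    when the best remaining variable cuts the RSS by less than min_gain
    (a fraction of the current RSS).
    """
    selected = []
    current = rss([], y)
    while len(selected) < len(candidates):
        best_name, best_rss = None, current
        for name in candidates:
            if name in selected:
                continue
            trial = rss([candidates[v] for v in selected + [name]], y)
            if trial < best_rss:
                best_name, best_rss = name, trial
        if best_name is None or current - best_rss < min_gain * current:
            break
        selected.append(best_name)
        current = best_rss
    return selected


# Hypothetical data: y depends on x1 only; x2 carries no extra information.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 2.0, 0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
y = [3.0 * a + 1.0 + e for a, e in zip(x1, noise)]
chosen = forward_stepwise({"x1": x1, "x2": x2}, y)
```

With this data the procedure keeps `x1` and stops, because adding `x2` does not reduce the residual error. Real stepwise tools usually rank candidates with a statistical criterion such as AIC or a significance test rather than a fixed RSS threshold.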

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
