Python gets a big data boost from DARPA

Continuum Analytics will extend the widely used NumPy library for distributed systems

DARPA (the U.S. Defense Advanced Research Projects Agency) has awarded $3 million to software provider Continuum Analytics to help fund the development of Python's data processing and visualization capabilities for big data jobs.

The money will go toward developing new techniques for data analysis and for visually portraying large, multi-dimensional data sets. The work aims to extend beyond the capabilities offered by the NumPy and SciPy Python libraries, which are widely used by programmers for mathematical and scientific calculations, respectively.

More mathematically centered languages such as the R Statistical language might seem better suited for big-data number crunching, but Python offers an advantage of being easy to learn.

"Python is a very easy language to learn for non-programmers," said Peter Wang, president of Continuum Analytics. That's important because most big-data analysts will probably not be programmers. If they can learn an easy language, they won't have to rely on an external software development group to complete their analysis, Wang said.

The work is part of DARPA's XData research program, a four-year, $100 million effort to give the Defense Department and other U.S. government agencies tools to work with large amounts of sensor data and other forms of big data.

For the XData project, DARPA awarded funding to about two dozen companies, including the University of Southern California, Stanford University and Lawrence Berkeley National Laboratory. The organizations are encouraged to use each other's technologies to further extend what can be done in big data, Wang said.

DARPA encouraged the funding recipients to release products based on their work and to release their code as open source, so the innovations can be widely used and supported outside of the military. The Defense Department is trying to avoid commissioning software that gets used only by the military, which may then become prohibitively time-consuming and expensive to update.

"With big data systems, you find new things you want to look at every week. You can't wait for that process any more," Wang said.

Headquartered in Austin, Texas, Continuum Analytics offers add-on products and services that help organizations use Python for data analysis. The company will use the DARPA money to continue development of a number of add-on technologies it has been working on, including Blaze, Numba and Bokeh, all of which provide advanced features not offered in Python itself.

At the PyData 2012 conference in New York last November, Continuum engineer Stephen Diehl discussed how Blaze would operate, describing the library as a potential successor to NumPy.

NumPy has limitations that Blaze seeks to correct, Diehl said. Most notably, NumPy only offers the ability to store a series of numbers as one continuous string of data. "It is a single buffer, a continuous block of memory. That may be OK for some uses, but the real world is more heterogenous," he said in a presentation.

Blaze can "endow [data] with structure," Diehl said. It will also allow programmers to establish multidimensional arrays and store these arrays in a distributed architecture, across multiple machines.

Bokeh is a Python library that can visually render large data sets using the HTML 5 Canvas tag, while Numba is a Python compiler that recognizes NumPy calls. Numba is included in Continuum's flagship product, Anaconda, a Python distribution with a number of premium data analysis features.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com

Tags Languages and standardsapplication developmentapplicationsDefense Advanced Research Projects AgencysoftwareContinuum AnalyticsData management

Keep up with the latest tech news, reviews and previews by subscribing to the Good Gear Guide newsletter.

Joab Jackson

IDG News Service

Comments

Comments are now closed.

Latest News Articles

Most Popular Articles

Follow Us

GGG Evaluation Team

Kathy Cassidy

STYLISTIC Q702

First impression on unpacking the Q702 test unit was the solid feel and clean, minimalist styling.

Anthony Grifoni

STYLISTIC Q572

For work use, Microsoft Word and Excel programs pre-installed on the device are adequate for preparing short documents.

Steph Mundell

LIFEBOOK UH574

The Fujitsu LifeBook UH574 allowed for great mobility without being obnoxiously heavy or clunky. Its twelve hours of battery life did not disappoint.

Andrew Mitsi

STYLISTIC Q702

The screen was particularly good. It is bright and visible from most angles, however heat is an issue, particularly around the Windows button on the front, and on the back where the battery housing is located.

Simon Harriott

STYLISTIC Q702

My first impression after unboxing the Q702 is that it is a nice looking unit. Styling is somewhat minimalist but very effective. The tablet part, once detached, has a nice weight, and no buttons or switches are located in awkward or intrusive positions.

Resources

Best Deals on GoodGearGuide

Latest Jobs

Don’t have an account? Sign up here

Don't have an account? Sign up now

Forgot password?