NSA's Accumulo NoSQL store offers role-based data access

Unlike other NoSQL data stores, Accumulo provides role-based access to data

With its much-discussed enthusiasm for collecting large amounts of data, the NSA naturally took a keen interest in highly scalable NoSQL databases.

But the U.S. intelligence agency needed some security of its own, so it developed a NoSQL data store called Accumulo, with built-in policy enforcement mechanisms that strictly limit who can see its data.

At the O'Reilly Strata-Hadoop World conference this week in New York, one of the former National Security Agency developers behind the software, Adam Fuchs, explained how Accumulo works and how it could be used in fields other than intelligence gathering. The agency contributed the software's source code to the Apache Software Foundation in 2011.

"Every single application that we built at the NSA has some concept of multilevel security," said Fuchs, who is now the chief technology officer of Sqrrl, which offers a commercial edition of the software.

The NSA started building Accumulo in 2008. Much like Facebook did with its Cassandra database around the same time, the NSA used Google's Bigtable architecture as a starting point.

In the parlance of NoSQL databases, Accumulo is a simple key/value data store, built on a shared-nothing architecture that allows for easy expansion to thousands of nodes able to hold petabytes' worth of data. It features a flexible schema that allows new columns to be added quickly, and comes with some advanced data analysis features as well.
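To make the key/value model concrete, here is a minimal in-memory sketch in Python. It is not Accumulo's API; the `ToyTable` class and its `put` and `scan` methods are hypothetical names chosen for illustration. It shows the two ideas from the paragraph above: every value is addressed by a key (here a row and a column), and a new column can be written at any time without declaring it first.

```python
class ToyTable:
    """Toy stand-in for a Bigtable-style table (illustrative only, not Accumulo's API)."""

    def __init__(self):
        # Each cell is stored under a (row, column) key.
        self._cells = {}

    def put(self, row, column, value):
        # No schema check: any column name is accepted on write.
        self._cells[(row, column)] = value

    def scan(self, start_row, end_row):
        """Yield (row, column, value) in key order for rows in [start_row, end_row]."""
        for row, column in sorted(self._cells):
            if start_row <= row <= end_row:
                yield row, column, self._cells[(row, column)]


table = ToyTable()
table.put("user001", "name", "Alice")
table.put("user002", "name", "Bob")
# A column nobody declared in advance; the "schema" simply grows.
table.put("user001", "email", "alice@example.com")

rows = list(table.scan("user001", "user001"))
```

A real deployment would spread the sorted key space across many servers; the sketch keeps everything in one dictionary to stay readable.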

Accumulo's killer feature, however, is its "data-centric security," Fuchs said. When data is entered into Accumulo, it must be accompanied by tags specifying who is allowed to see that material. Each row of data has a cell specifying the roles within an organization that can access the data, which can map back to specific organizational security policies.
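The tagging idea can be sketched in a few lines of Python. This is an illustration rather than Accumulo's implementation: the `can_read` and `scan` functions and the sample records are hypothetical. The label syntax (`&` for "and", `|` for "or", plus parentheses) is modeled on Accumulo's visibility expressions, but the sketch simply evaluates operators left to right and does no validation of malformed labels.

```python
import re


def can_read(label, authorizations):
    """Return True if the set `authorizations` satisfies the visibility `label`.

    An empty label means the value is readable by anyone.
    """
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", label)
    if not tokens:
        return True
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        # Combine terms left to right (an assumption of this sketch).
        while pos < len(tokens) and tokens[pos] in "&|":
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    def parse_term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            pos += 1  # skip the closing ")"
            return result
        # A bare word is satisfied only if the reader holds that authorization.
        return tok in authorizations

    return parse_expr()


# Hypothetical records: (row, column, visibility label, value).
cells = [
    ("report-1", "summary", "", "public summary"),
    ("report-1", "source", "secret&(analyst|admin)", "source details"),
]


def scan(cells, authorizations):
    """Return only the values the caller's authorizations allow."""
    return [value for _row, _col, label, value in cells
            if can_read(label, authorizations)]


visible_to_analyst = scan(cells, {"secret", "analyst"})
visible_to_guest = scan(cells, set())
```

The point of the design is that the filter runs where the data lives, so a reader without the right authorizations never receives the hidden values at all.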

It adheres to the RBAC (role-based access control) model. This approach allowed the NSA to categorize data by its multiple levels of classification (confidential, secret, top secret) and to specify who in an organization could access the data, based on their official role within the organization. The database is accompanied by a policy engine that decides who can see what data.

This model could be used anywhere that security is an issue. For instance, if used in a health care organization, Accumulo can specify that only a patient and the patient's doctor can see the patient's data. The patient's specific doctor may change over time, but the role of the doctor, rather than the individual doctor, is specified in the database.
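The health care example comes down to one level of indirection: the record names roles, and a separate mapping says which people currently hold those roles. A rough Python sketch, with all names (`record_roles`, `user_roles`, `reassign_role` and the sample users) invented for illustration:

```python
# Roles allowed to read patient 42's record; the record never names a person.
record_roles = {"patient:42", "doctor-of:42"}

# Who currently holds which roles (hypothetical users).
user_roles = {
    "pat_lee": {"patient:42"},
    "dr_adams": {"doctor-of:42"},
    "dr_baker": set(),
}


def can_read_record(user):
    """A user may read the record if they hold at least one permitted role."""
    return bool(user_roles.get(user, set()) & record_roles)


def reassign_role(role, old_user, new_user):
    """Move a role between users; the stored record is not touched."""
    user_roles[old_user].discard(role)
    user_roles.setdefault(new_user, set()).add(role)


before = (can_read_record("dr_adams"), can_read_record("dr_baker"))
reassign_role("doctor-of:42", "dr_adams", "dr_baker")
after = (can_read_record("dr_adams"), can_read_record("dr_baker"))
```

When the patient changes doctors, only the role assignment changes; the labels stored with the data stay exactly as they were.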

The NSA found that the data-centric approach "greatly simplifies application development," Fuchs said.

Because data today tends to be transformed and reused for different analysis applications, it makes sense for the database itself to keep track of who is allowed to see the data, rather than repeatedly implementing these rules in each application that uses this data.

"Since the applications in this model can push down the security model into the database and companion components, you don't have to solve that in the application," Fuchs said. As a result, "it is a lot cheaper to build that application," he added.

This is not the NSA's first foray into releasing open-source applications built on the role-based access model. In 2000, the agency released SELinux (Security-Enhanced Linux), which allows administrators to create policies that dictate what actions each program on a computer can execute, based on the user's role. SELinux was subsequently rolled into the mainline Linux kernel.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
