Google's latest big-data tool, Mesa, aims for speed

Mesa can hold petabytes of data across multiple data centers while handling millions of row updates per second and billions of queries per day

Google has found a way to stretch a data warehouse across multiple data centers, using an architecture its engineers developed that could pave the way for much larger, more reliable and more responsive cloud-based analysis systems.

Google researchers will discuss the new technology, called Mesa, at the Conference on Very Large Data Bases, happening next month in Hangzhou, China.

A Mesa implementation can hold petabytes of data, process millions of row updates per second and serve billions of queries that fetch trillions of rows per day, Google says. Extending Mesa across multiple data centers allows the data warehouse to keep working even if one of the data centers fails.

Google built Mesa to store and analyze critical measurement data for its Internet advertising business, but the technology could be used for other, similar data warehouse jobs, the researchers said.

"Mesa ingests data generated by upstream services, aggregates and persists the data internally, and serves the data via user queries," the researchers wrote in a paper describing Mesa.
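As a rough illustration of that ingest, aggregate and serve cycle, the Python sketch below sums incoming measurement rows by key and answers lookups from the aggregated totals. The class name, the (date, ad ID) key and the click counts are hypothetical and chosen only for illustration; this is not Mesa's actual schema or API.

```python
from collections import defaultdict


class TinyAggregator:
    """Toy ingest/aggregate/serve loop; names are illustrative, not Mesa's API."""

    def __init__(self):
        # Aggregated state: key -> running total of the value column.
        self._totals = defaultdict(int)

    def ingest(self, rows):
        """Fold a batch of (key, value) rows into the running totals."""
        for key, value in rows:
            self._totals[key] += value

    def query(self, key):
        """Serve the aggregated value for a key (0 if the key was never seen)."""
        return self._totals.get(key, 0)


agg = TinyAggregator()
# Two batches from a hypothetical upstream service: ((date, ad_id), clicks).
agg.ingest([(("2014-08-01", "ad-1"), 10), (("2014-08-01", "ad-2"), 3)])
agg.ingest([(("2014-08-01", "ad-1"), 5)])
print(agg.query(("2014-08-01", "ad-1")))  # prints 15
```

Storing only the running totals, rather than every raw row, is what keeps lookups cheap in this toy version; a production system would also need durable storage and replication, which the sketch leaves out.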

For Google, Mesa solved a number of operational issues that traditional enterprise data warehouses and other data analysis systems could not.

For one, most commercial data warehouses do not continuously update the data sets, but more typically update them once a day or once a week. Google needed its streams of new data to be analyzed as soon as they were created.

Google also needed strong consistency for its queries, meaning a query should produce the same result from the same source each time, no matter which data center fields the query.

Consistency is typically considered a strength of relational database systems, though relational databases can have a hard time ingesting petabytes of data. It's especially hard if the database is replicated across multiple servers in a cluster, which enterprises do to boost responsiveness and uptime. NoSQL databases, such as Cassandra, can easily ingest that much data, but Google needed a greater level of consistency than these technologies can typically offer.

The Google researchers said that no commercial or existing open-source software was able to meet all of the company's requirements, so they created Mesa.

Mesa relies on a number of other technologies developed by the company, including the Colossus distributed file system, the BigTable distributed data storage system and the MapReduce data analysis framework. To help with consistency, Google engineers used Paxos, a distributed synchronization protocol.

In addition to scalability and consistency, Mesa offers another advantage in that it can be run on generic servers, which eliminates the need for specialized, expensive hardware. As a result, Mesa can be run as a cloud service and easily scaled up or down to meet the job requirements.

Mesa is the latest in a series of novel data-processing applications and architectures that Google has developed to serve its business.

Some Google innovations have gone on to provide the foundations for widely used applications. For example, Google's MapReduce and Google File System led to the development of Apache Hadoop.

Other Google technologies developed for internal use have subsequently been offered as cloud services from the company itself. Google's Dremel ad-hoc query system for read-only data went on to become a foundation of the company's BigQuery service.

Future commercial prospects for Mesa may be somewhat limited, however, said Curt Monash, head of database research firm Monash Research.

Not many organizations today would need sub-second response times against a body of material as large and complex as Google's, Monash said in an email. Also, MapReduce is not the most efficient way of handling relational queries. That's what's led to a number of SQL-on-Hadoop technologies, such as Hive, Impala and Shark.

Also, typical enterprises should look for commercial or open-source options to keep their data warehouses consistent across data centers before adopting what Google's developed, Monash said. Most new data stores being developed today have some form of multi-version concurrency control (MVCC), he said.
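Multi-version concurrency control generally means that each committed update creates a new version of the data, and a query reads as of one fixed version, so it sees the same result even while newer updates arrive. The Python sketch below shows that idea in its simplest single-process form. The class and method names are hypothetical, and it does not represent how Mesa or any particular product implements versioning.

```python
class VersionedStore:
    """Toy multi-version store; a sketch of the general idea, not a real product's API."""

    def __init__(self):
        # key -> list of (version, value) pairs, in ascending version order.
        self._history = {}
        self._committed = 0

    def commit(self, updates):
        """Apply a batch of key/value updates as one new version and return it."""
        version = self._committed + 1
        for key, value in updates.items():
            self._history.setdefault(key, []).append((version, value))
        self._committed = version
        return version

    def read(self, key, version):
        """Return the value of key as of the given version, or None if absent."""
        result = None
        for v, value in self._history.get(key, []):
            if v > version:
                break
            result = value
        return result


store = VersionedStore()
v1 = store.commit({"clicks": 10})
v2 = store.commit({"clicks": 25})
print(store.read("clicks", v1))  # prints 10
print(store.read("clicks", v2))  # prints 25
```

A reader that pins its query to version v1 keeps getting 10 no matter how many later versions are committed, which is the property that lets replicas in different locations return the same answer for the same version.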

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
