Nvidia chief scientist: CPUs slowed by legacy design

Bill Dally forecasts a time when GPUs, not CPUs, will do most computer work

When it comes to power-efficient computing, CPUs are weighed down by too many legacy features to outperform GPUs (graphics processing units) in executing common tasks in parallel, said the chief scientist for the GPU vendor Nvidia.

CPUs "burn a lot of power" executing tasks that may be unnecessary in today's computing environment, noted Bill Dally, chief scientist and senior vice president of research for Nvidia, during his keynote Wednesday at the Supercomputing 2010 (SC10) conference in New Orleans.

The GPU "is optimized for throughput," while "the CPU is optimized for low latency, for getting really good thread performance," he said.

Dally pointed to some of the features of most modern CPUs that waste energy in the pursuit of low latency.

"They have branch predictors that predict a branch every cycle whether the program branches or not -- that burns gobs of power. They reorder instructions to hide memory latency. That burns a lot of power. They carry along a [set of] legacy instructions that requires lots of interpretation. That burns a lot of power. They do speculative execution and execute code that they may not need and throw it away. All these things burn a lot of power," he said.

Although the GPU was originally designed for rendering graphics on the screen, vendors such as Nvidia and Advanced Micro Devices are now positioning their GPU cards as general computation engines, at least for workloads that can be broken into multiple parts and run in tandem.

At least some industries are taking note of this idea, notably the world of high performance computing (HPC). Earlier this week, China's newly built Tianhe-1A system topped the latest iteration of the Top 500 List of the world's most powerful supercomputers. That system includes 7,168 Nvidia Tesla M2050 GPUs in addition to its 14,000 CPUs. Nvidia claims that without the GPUs, the system would need almost four times as many CPUs, twice as much floor space and three times as much electricity to operate.

And although Dally focused his remarks on use in HPC, he said that the general idea will permeate the computing world as a whole.

"HPC is, in many ways, an early adopter, because they run into problems sooner because they operate at a larger scale. But this applies completely to consumer applications as well as to server applications," he said, in an interview following the keynote.

Dally said that while not many current applications are written to run in parallel environments, eventually programmers will move to this model. "I think over time, people will convert applications to parallel, and those parallel segments will be well-suited for GPUs," he said. He even predicted that systems will one day be able to boot off the GPU as well as the CPU, though he said he knows of no work in particular to build a GPU-based operating system.

Energy use is one of Dally's central tenets in claiming GPU superiority. He noted that while the next-generation Nvidia GPU architecture, nicknamed Fermi, would consume 200 pJ (picojoules) of energy for each instruction executed, a CPU consumes 2 nJ (nanojoules), or an order of magnitude more.

Tiny as it is for a single instruction, this difference becomes a chasm when multiplied across large systems. Dally pointed to the U.S. Defense Advanced Research Projects Agency's efforts to fund development of an exascale computer, or a computer that can execute 1 quintillion calculations per second. Such a system built from CPUs alone, he argued, would require a "nuclear power plant built next door" just to supply its energy.
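The scale of that gap can be checked with rough arithmetic. The Python sketch below multiplies the per-instruction energy figures Dally cited by an exascale rate of 1 quintillion operations per second. It assumes, for illustration only, that each calculation costs exactly one instruction and that memory, interconnect and cooling consume nothing, so the results are order-of-magnitude estimates rather than figures Dally or Nvidia gave. The function name is likewise hypothetical.

```python
# Back-of-the-envelope power estimate for an exascale machine.
# Assumption (not from the keynote): one instruction per calculation,
# with no overhead for memory, interconnect or cooling.

PICOJOULE = 1e-12          # joules
NANOJOULE = 1e-9           # joules
EXASCALE_OPS_PER_SEC = 1e18  # 1 quintillion calculations per second


def sustained_power_watts(energy_per_op_joules, ops_per_second=EXASCALE_OPS_PER_SEC):
    """Power in watts = joules per operation * operations per second."""
    return energy_per_op_joules * ops_per_second


gpu_watts = sustained_power_watts(200 * PICOJOULE)  # Fermi figure cited by Dally
cpu_watts = sustained_power_watts(2 * NANOJOULE)    # CPU figure cited by Dally

print(f"GPU-based exascale system: {gpu_watts / 1e6:.0f} MW")
print(f"CPU-based exascale system: {cpu_watts / 1e9:.1f} GW")
print(f"CPU-to-GPU power ratio:    {cpu_watts / gpu_watts:.0f}x")
```

Under these assumptions the CPU-only machine lands in the gigawatt range, the scale of a large power plant, while the GPU-based one needs hundreds of megawatts: still a great deal of power, but a tenth as much.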

Not everyone in the HPC community is completely sold on the idea of using GPUs as a substitute for CPUs. One potential problem many point to is that while GPUs may have greater throughput, it is difficult for systems to provide that much data to these processors.

"There is very little amount of memory that is available to each of the GPUs. If you have something really fast, you need to feed it really fast, and if you don't have enough memory to feed that processor, that processor will just sit there and wait," Dave Turek, head of IBM's deep computing division, said last week.

Dally said that this bandwidth problem is not unique to GPUs--CPUs face the same dilemma. "Bandwidth is a big problem for any computing system," he said. He admitted the problem is more acute for GPUs, though. Nvidia's just-released GTX 580 card has a raw bandwidth of 200 gigabytes per second, whereas a "top-of-the-line" CPU has only about 35 gigabytes per second. "Memory systems need to evolve to be more efficient," he said.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com.
