Ryzen Threadripper Review: AMD's monster stomps on other CPUs
- 10 August, 2017 23:00
AMD’s 16-core, 32-thread Ryzen Threadripper 1950X ($999 on Amazon) is an angry Godzilla stomping his way through downtown Tokyo. Those puny 8-core, 6-core and 4-core CPUs? They’re just tanks and army trucks to be punted across the city.
Yes, it’s that good.
But before you buy, there’s a lot you need to know about what is arguably the most powerful consumer CPU ever unleashed upon mankind.
What is Threadripper
While Intel currently builds its CPUs around a monolithic piece of silicon for all of its cores, AMD has designed Ryzen to be modular at the chip level. The basic building block of every Ryzen CPU is a pair of 4-core complexes, or CCXes, joined by AMD’s high-speed Infinity Fabric interconnect. Every Ryzen 7, for example, has an 8-core die such as the one below.
To get to 16 cores in Threadripper, AMD uses the same high-speed Infinity Fabric to join two 8-core dies. The 12-core version also joins two 8-core dies, but each of the 4-core CCXes has one processor core disabled.
But wait: You’ve seen pictures of the inside of a Threadripper and there are four chips—are those two other 8-core dies just waiting to be enabled? Nope. It’s no secret that Threadripper reuses hardware from AMD’s 32-core, server-focused Epyc CPU, but AMD isn’t giving us 32-core consumer CPUs today. Two of those “chips” are actually dummy pieces to add structural support for the cooler that will be clamped onto the CPU.
With great cores, come great resources
AMD actually doubles down twice with Threadripper specs, giving you double the number of CPU cores and double the number of memory channels. It also vastly increases the number of PCIe lanes.
For example, the mainstream Ryzen line supports dual-channel DDR4 memory. Threadripper supports quad-channel DDR4. Unlike Intel, whose strategy is to disable features on its Core-series CPUs to push people to its pricier Xeon chips, AMD leaves in support for ECC RAM to help correct single-bit errors. AMD also says Threadripper should technically be able to support up to 2TB of RAM, although the company hasn’t validated this because there are no DIMMs that support the capacity yet.
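The jump from two memory channels to four can be put in rough numbers. The sketch below computes theoretical peak bandwidth as channels times bus width times transfer rate. The 8-byte (64-bit) channel width is standard for DDR4, the DDR4-3200 speed is simply the one used in this review's test bed, and the function name is invented for illustration. Measured throughput will land below these ceilings.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    channels       -- number of populated memory channels
    mega_transfers -- DDR4 speed grade in MT/s (e.g. 3200 for DDR4-3200)
    bus_bytes      -- bytes moved per transfer per channel (8 for a 64-bit channel)
    """
    return channels * bus_bytes * mega_transfers * 1e6 / 1e9


# Illustrative comparison at the same DIMM speed.
dual_channel = peak_bandwidth_gbs(channels=2, mega_transfers=3200)
quad_channel = peak_bandwidth_gbs(channels=4, mega_transfers=3200)

print(f"Dual-channel DDR4-3200: {dual_channel:.1f} GB/s peak")
print(f"Quad-channel DDR4-3200: {quad_channel:.1f} GB/s peak")
```

At a fixed memory speed, doubling the channel count doubles the theoretical ceiling, which is the whole point of the quad-channel design.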
As for PCIe, while the mainstream Ryzen chips offer a pedestrian 20 lanes for support of graphics cards or SSDs, Threadripper offers a whopping 64 lanes. Of those 64, four are used to connect to the south bridge, leaving 60 available to connect up to seven different simultaneous PCIe devices. That means up to four GPUs along with three NVMe PCIe drives.
Compare AMD’s generous approach to Intel’s careful rationing: The $1,000 10-core Core i9-7900X, for example, has a decent 44 lanes of PCIe, but the $599 8-core Core i7-7820X has only 28. Even AMD’s cheapest Threadripper so far, the 8-core Threadripper 1900X, features a full 64 lanes of PCIe support.
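To make the lane math concrete, here is a small sketch that checks whether a set of devices fits in a CPU's usable lane budget. The lane totals come from the figures above. The per-device widths (x16/x8/x16/x8 for four graphics cards, x4 per NVMe drive) are an assumption about one plausible layout, not something AMD or this review specifies, and the helper names are hypothetical.

```python
def usable_lanes(total: int, reserved_for_chipset: int = 0) -> int:
    """Lanes left for add-in devices after the chipset link is subtracted."""
    return total - reserved_for_chipset


def fits(budget: int, devices: dict) -> bool:
    """True if the summed lane widths of all devices stay within the budget."""
    return sum(devices.values()) <= budget


# Threadripper: 64 lanes, 4 of them tied up by the south bridge link.
threadripper_budget = usable_lanes(total=64, reserved_for_chipset=4)

# Assumed layout: four GPUs plus three NVMe drives (seven devices).
layout = {
    "gpu0": 16, "gpu1": 8, "gpu2": 16, "gpu3": 8,
    "nvme0": 4, "nvme1": 4, "nvme2": 4,
}

print("Threadripper usable lanes:", threadripper_budget)
print("Lanes requested by layout:", sum(layout.values()))
print("Fits on Threadripper:", fits(threadripper_budget, layout))
print("Fits in 44 lanes (Core i9-7900X):", fits(44, layout))
print("Fits in 28 lanes (Core i7-7820X):", fits(28, layout))
```

Under those assumed widths the seven devices consume exactly the 60 usable lanes, while the same layout would need narrower links or fewer devices on either Intel chip.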
Despite many unsubstantiated rumors of a large lineup of Threadripper CPUs, AMD is officially launching only three CPUs today (the 8-core Threadripper 1900X will ship in a few more weeks). The lineup (see below) is sparser than Intel’s currently, but an unintentional leak by motherboard vendors indicates the company has lower-wattage, non-“X” versions coming, too.
Intel’s own lineup looks more impressive, but thus far, the company has shipped only the 10-core Core i9-7900X and its 8-core, 6-core, and 4-core siblings.
Installation: Read the manual. Seriously.
No matter how many systems you’ve built, if you buy Threadripper, do yourself a favor and read the manual. As expected, Threadripper brings a new CPU socket officially called sTR4. While the mainstream Ryzen features the pin grid array familiar to AMD fans, Threadripper moves to an LGA, or land grid array, that will be more familiar to Intel fans.
LGA moves the delicate pins to the motherboard instead of the CPU. Which is better? From a customer point of view, it probably depends. Mash a pin on a $550 motherboard badly, and you trash the motherboard. Mash it on a $999 CPU, and you trash the CPU.
One thing we do know: Installing a Threadripper is unlike anything you’ve done before. That doesn’t mean you need to sweat bullets, but don’t just dive into it without reading the documentation and watching a proper installation video (preferably not ours, which we did dead-tired and blind) first.
The three essential takeaways from your manual-reading and video-watching should be these:
- You must keep the plastic orange carrier on the CPU. The CPU can’t be installed without it.
- You must use the torque wrench that’s packed into the bottom of the Threadripper box (see above).
- Pay attention to the correct sequence for installing and uninstalling the CPU.
To install it, you open the socket by loosening three T20 Torx screws with the AMD-provided wrench. Remove the top-level protective plate and insert the entire CPU with the orange plastic carrier. Slide the CPU until it clicks into place or is clearly at the bottom of the assembly.
Once you’re sure the CPU is in the carrier correctly, remove the protective cover over the socket and gently lower the CPU into place. Finally, you carefully tighten all three Torx screws with the provided AMD torque wrench.
One more time: Don’t try to muddle through this without at least familiarizing yourself with the process.
Meet the new Game Mode
Before we get to the all-important performance section, you should know about Threadripper’s new Game Mode. Most people don’t buy 16-core CPUs to play video games, but the world is a-changing, and many professional gamers and streamers need the ability to play games at high frame rates and also edit the content once it’s done.
When it designed Threadripper, AMD says it realized the high-thread-count CPU didn’t always perform at its best for some games. Remember, it’s made using two separate chips, each with its own dual-channel memory controller. Out of the box, Threadripper supports Uniform Memory Access mode, which spreads the memory access between both memory controllers. The benefit is greater memory bandwidth, but often at the cost of higher latency. Some games, AMD says, just want low latency.
To address this, AMD has introduced a new Game Mode that switches the system to Non-Uniform Memory Access (NUMA), or what AMD calls Local Mode. Local Mode essentially shunts all memory access to one memory controller to lower latency. Memory access that goes to the other memory controller is possible, but it comes with more latency.
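If you want to see how a memory mode is presented to the operating system, the node layout can be inspected directly. The sketch below is Linux-only and is not how this review measured anything (our testing was done in Windows). It assumes the standard sysfs directory `/sys/devices/system/node`, where each `nodeN/cpulist` file lists the logical CPUs attached to that node. The function names are made up for illustration.

```python
import glob
import os


def parse_cpulist(text: str) -> list:
    """Expand a sysfs CPU list such as '0-7,16-23' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty file: a node with no CPUs attached
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> dict:
    """Map NUMA node id -> logical CPUs; empty dict if sysfs is unavailable."""
    nodes = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        node_id = int(os.path.basename(path)[len("node"):])
        try:
            with open(os.path.join(path, "cpulist")) as handle:
                nodes[node_id] = parse_cpulist(handle.read())
        except OSError:
            continue  # skip nodes we cannot read
    return nodes


for node_id, cpus in sorted(numa_nodes().items()):
    print(f"node {node_id}: {len(cpus)} logical CPUs")
```

A single node suggests the OS is seeing memory as uniform; more than one node means the platform is exposing NUMA, and software can then try to keep a thread's memory on the node it runs on.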
There is such a thing as too many cores
Threadripper’s crazy core count has another unintended consequence: AMD says some older games crashed in its tests. This isn’t a problem with Threadripper, AMD notes, but with the games themselves, because they just can’t handle the number of CPU cores.
To address this problem, Game Mode also tells Windows to recognize only 8 of the system’s 16 cores. An updated Ryzen Master Utility lets you switch between Game Mode, for when older games need it, and Creator Mode, for when you want all of your CPU cores and more memory bandwidth.
Does it work? Yes. Although we won’t get into its impact on gaming until later, we did measure the modes’ impact on latency and memory bandwidth. You can see how Game Mode lowers memory latency in the chart above.
As you can see in the next chart, however, Game Mode has the opposite effect on memory bandwidth. Because Game Mode enables NUMA/Local Mode, you give up a significant amount of memory bandwidth.
What’s right? Well, it’s complicated. Gears of War Ultimate, AMD says, likes low memory latency, so Game Mode should be on for that game. Rise of the Tomb Raider likes more CPU cores, so maybe you’ll want it off. Far Cry 4 likes low core-to-core latency, so maybe you’ll want to switch on Game Mode.
If this all sounds way too complicated when you just want to play a game, know that for the most part this is just being nit-picky. Any modern game paired with a modern powerful GPU and a Threadripper CPU will run fine at normal resolutions and visual quality settings. AMD just wants gamers to have more granular control so they can wring more performance out of the new CPU. Some may be put off by this complexity, but if you’re really buying a 16-core, 32-thread CPU just for conventional gaming, you’re probably using it wrong. A regular Ryzen or Kaby Lake CPU is probably better for that purpose.
How we tested
None of this matters without solid performance. Our Threadripper 1950X was tested with an Asus ROG Zenith Extreme X399 motherboard, a ThermalTake Floe Riing (yes, that’s how it’s spelled) 360 cooler, Nvidia GeForce GTX 1080, Samsung 960 Pro SSD, and 32GB of DDR4/3200 RAM.
These last two items actually differ from our standard configuration, which is a HyperX Savage SATA SSD and 32GB of DDR4 at JEDEC 2133 speeds. To minimize the impact of the SATA SSD versus a PCIe SSD, we used a separate Plextor PCIe SSD as the target and source drive for any tests where storage might have an effect, primarily our encoding tests using Handbrake and Adobe Premiere Creative Cloud.
The memory configuration was a little stickier, as we’ve tested previous CPUs with all DIMM slots filled. On Ryzen, that limited the memory clock speeds, as only JEDEC speeds are allowed when fully loaded with RAM. That hurts Ryzen, particularly because Infinity Fabric is directly tied to the speed of the memory controller.
Getting the Core i9-7900X to run at DDR4/3200 was also problematic, as it effectively overclocks the CPU to 4.3GHz on all cores on our BIOS. So, instead, we ran the Core i9 system using the XMP default setting of DDR4/2666.
This isn’t ideal, but as memory increasingly ties itself to a platform’s performance, we’ll have to continue to search for a happy medium.
Pull up a chair, because everyone wants to see how Threadripper does on pretty much everything. We're going to start with single-application performance and some synthetic benchmarks, then move on to multitasking and gaming.
Let’s kick this off where AMD started the Zen hype train almost exactly a year ago: Blender. This is an open-source 3D modelling application that actually gets decent use by indie films for effects scenes. Heck, even NASA uses it for its models these days. Blender loves CPU threads, but we’ve found that it doesn’t always scale as well as commercial products such as Maxon’s Cinema4D. Still, more cores generally means more performance, and the first win goes to Threadripper 1950X for handily rendering Mike Pan’s popular BMW benchmark file 22 percent faster than the 10-core Core i9-7900X.
We know we just said Blender doesn’t always scale perfectly, but when you look at the score from the 8-core, 16-thread Ryzen 7 1800X compared to the 16-core, 32-thread Threadripper 1950X, Threadripper takes just over half the time to render the image.
Our second test is also free: the Persistence of Vision Ray Tracer. This application dates all the way back to the Amiga but is continually updated and supported. It’s no surprise, but ray tracing is a CPU-intensive task, and throwing more CPU cores at it makes it go faster.
Against the Core i9-7900X, the Threadripper 1950X is 35 percent faster running the internal performance benchmark. Against the 8-core Ryzen 7 1800X, you’re looking at an 85-percent performance boost. From a multi-threading point of view, it’s all win here for the Threadripper 1950X.
Before you pop the champagne, let’s also see how the Threadripper 1950X does in POV-Ray when only a single thread is used. Once that happens, this turns into a battle of overall clock speed and IPC or micro-architecture efficiency. When it’s all about single-threaded clock speeds, it’s all about Intel’s 7th Gen Core i7-7700K, which jumps to the front. Skylake-X, with its very high Turbo Boost Max 3.0 cores, comes in second. Threadripper pulls in about 14 percent slower than the Core i9-7900X, which is within striking distance.
CineBench R15 Performance
The CineBench R15 benchmark is based on the same engine Maxon uses in its Cinema4D professional application. Like the previous two applications, it’s all about the thread count, so again Threadripper 1950X runs away with it, posting a score almost 39 percent higher than the Core i9-7900X’s.
As with POV-Ray, we also run CineBench R15 single-threaded to get another dimension on CPU performance. When you’re talking CPU efficiency or IPC and high-clock speeds, the momentum again shifts back to Intel’s 7th-gen Kaby Lake CPUs and Core X, though Skylake-X is still just 12 percent faster than Threadripper. Note, too, that Threadripper 1950X manages to hang tight with Intel’s original consumer 10-core, the Core i7-6950X, which cost $1,723 when released.
Corona Renderer Performance
One final rendering benchmark is the newish Corona Renderer test. It’s a new plug-in renderer for Autodesk 3ds Max and is touted as being “unbiased” and high-performance. As the test is new to us, our sample set is extremely limited. Given that AMD is promoting it, however, the results aren’t surprising: Corona Renderer loves CPU cores and gives the Threadripper 1950X a 21-percent advantage.
Geekbench 4 Performance
Geekbench is one of the most popular free benchmarks around. We don’t typically use it to gauge performance of desktop CPUs. It’s recently been gaining some traction, however, as the latest version does away with many of the controversial aspects of the previous version.
The results put the Threadripper 1950X in front, but not by much, never mind its six-core advantage. Does this mean Threadripper 1950X isn’t as fast as the previous tests show? No. More than anything else, it probably shows that Geekbench 4.1 doesn’t scale with available core count. Or that it just doesn’t like something about the Ryzen design, as the Ryzen 7 doesn’t do well either.
And no, this isn’t the older 4.04 version. The latest 4.1 version did add some updates for AMD’s micro-architecture, but apparently not enough.
Speaking of applications that seem to have no love for Threadripper 1950X, here are the results of WinRAR 5.40’s internal benchmark. There's no mixup: WinRAR just doesn’t perform very well on Threadripper.
WinRAR seems to be fair in that it doesn’t like Skylake-X much either, instead putting the two Broadwell-E chips clearly in front. Why? At the time of our Core i9 review, Intel said its analysis showed the new mesh design on Core i9 is the issue. The mesh design makes it easier and faster for Intel to connect multiple cores, but there is a penalty in WinRAR and some games as well.
Intel’s new mesh is similar to AMD’s Infinity Fabric in some ways, so it’s entirely possible WinRAR is revealing an Achilles heel in both designs.
Fret not, AMD fans: The good news is you can just use 7-Zip, because it’s all roses there. Threadripper 1950X is again large and in charge with a 22-percent lead over its Core i9 nemesis. Although 7-Zip doesn’t scale as well as the 3D tests, Threadripper’s still a healthy 73 percent faster than the 8-core Ryzen 7 1800X.
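One way to read these "percent faster than the 8-core" figures is as scaling efficiency: how much of the ideal 2x gain from doubling the cores actually shows up. The sketch below uses only the percentages quoted in this review for POV-Ray (85 percent) and 7-Zip (73 percent). It deliberately ignores clock-speed differences between the two chips, so treat it as a rough guide, and note that the function name is invented for illustration.

```python
def scaling_efficiency(percent_faster: float, base_cores: int, new_cores: int) -> float:
    """Fraction of the ideal core-count speedup that was actually achieved.

    percent_faster -- measured gain of the bigger chip over the smaller one
    base_cores     -- core count of the smaller chip
    new_cores      -- core count of the bigger chip
    """
    measured_speedup = 1 + percent_faster / 100
    ideal_speedup = new_cores / base_cores
    return measured_speedup / ideal_speedup


# Gains of the 16-core 1950X over the 8-core 1800X, as quoted in the text.
quoted_gains = {"POV-Ray": 85, "7-Zip": 73}

for test_name, gain in quoted_gains.items():
    efficiency = scaling_efficiency(gain, base_cores=8, new_cores=16)
    print(f"{test_name}: {efficiency:.1%} of ideal 2x scaling")
```

By this measure POV-Ray captures a bit over 90 percent of the ideal doubling and 7-Zip somewhat less, which matches the observation that 7-Zip scales well but not quite as well as the 3D tests.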
One other test that AMD has touted is VeraCrypt. Based on TrueCrypt, VeraCrypt picked up where its popular free predecessor fell apart. As it’s new to us, our sample set is tiny, but it shows a whopping 45-percent advantage for Threadripper.
Adobe Premiere Creative Cloud 2017 Performance
Besides 3D rendering, video encoding is one of the top reasons people buy mega-core chips. To test that we take a short project PCWorld’s video team shot on a 4K Sony Alpha camera and export it using the 1080p Blu-ray preset, with the maximum render quality option checked.
Neither the target nor the source of our test actually resides on the PC’s local drive. Instead, we store both on a Plextor M8e PCIe SSD that’s moved from machine to machine for testing. This essentially makes the storage subsystem irrelevant in the performance discussion.
The first result uses CPU encoding rather than the CUDA engine on the GPU. Scoff all you want—many professionals still say CPU encoding gives you the best quality.
The results put the Threadripper 1950X in front with a score about 20 percent faster than the Core i9-7900X's. Against the 8-core Ryzen 7 1800X, it’s roughly 39 percent faster. Our actual encode times are relatively short given the video project’s short duration, but any professional who'd like to shave off 20 percent over a 10-core on a 5-hour encode would likely pay for it.
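What a 20 percent advantage means on a 5-hour job depends on how you read the number. The sketch below works through both readings: 20 percent higher throughput, and 20 percent less wall-clock time. The 5-hour baseline is the hypothetical from the paragraph above, not a measured encode, and the function names are made up for illustration.

```python
def hours_with_throughput_gain(baseline_hours: float, percent_faster: float) -> float:
    """Job time if the faster chip does percent_faster more work per hour."""
    return baseline_hours / (1 + percent_faster / 100)


def hours_with_time_reduction(baseline_hours: float, percent_less_time: float) -> float:
    """Job time if the faster chip needs percent_less_time less wall-clock time."""
    return baseline_hours * (1 - percent_less_time / 100)


baseline = 5.0  # hypothetical 5-hour encode on the 10-core chip

by_throughput = hours_with_throughput_gain(baseline, 20)
by_time = hours_with_time_reduction(baseline, 20)

print(f"20% more throughput: {by_throughput:.2f} h, "
      f"saving {(baseline - by_throughput) * 60:.0f} minutes")
print(f"20% less time:       {by_time:.2f} h, "
      f"saving {(baseline - by_time) * 60:.0f} minutes")
```

Either way the saving is on the order of an hour per encode (50 minutes under the first reading, 60 under the second), which adds up quickly for anyone who renders daily.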
For those who use the GPU, we ran the same project using the GTX 1080 for the heavy lifting. Our encode times are drastically cut down using GPU rendering, but if you think the CPU doesn’t matter, guess again. Over an 8-core CPU, for example, the Threadripper 1950X is still 37 percent faster. What professional wouldn’t want to cut down on the time waiting for an encode to finish?
Our last encode test uses the free and popular Handbrake encoder to convert a 30GB file using the Android Tablet preset. Handbrake tends to love CPU cores and threads, but we’ve found the scaling starts to peter out as you approach crazy amounts of cores. In this test, the 16-core Threadripper 1950X is “only” 15 percent faster than the Core i9 chip. Still, when you notch a win, you notch a win.
Before we move on to gaming performance, we wanted to present how well these CPUs do when tasked with running multiple, multi-threaded workloads. For that we decided to run Blender and Cinebench simultaneously. Multi-tasking tests can be difficult to pin down. AMD recommends manually setting each application’s core affinity to increase the reliability of the results. Most people won’t do that, however, so we decided to see if we could obtain repeatable results just by kicking off one benchmark and then another.
We found that by running Cinebench first, then starting Blender and keeping it in the foreground, we could obtain easily repeatable results. The results here are the average of three runs each, and we could reproduce them days later.
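For readers curious what the manual affinity approach looks like, here is a minimal sketch. It is not what we did for this test, and it is not AMD's procedure: it naively splits the logical CPU numbers into equal groups and shows how a process could be pinned to one group. The pinning call, `os.sched_setaffinity`, exists only on Linux, so the helper simply reports failure elsewhere. Both function names are hypothetical, and the split ignores which logical CPUs are SMT siblings or share a die.

```python
import os


def split_cpus(cpus, parts: int) -> list:
    """Divide logical CPU ids into `parts` contiguous groups of near-equal size."""
    ordered = sorted(cpus)
    size, extra = divmod(len(ordered), parts)
    groups, start = [], 0
    for index in range(parts):
        end = start + size + (1 if index < extra else 0)
        groups.append(set(ordered[start:end]))
        start = end
    return groups


def pin_process(pid: int, cpus: set) -> bool:
    """Restrict a process to the given CPUs; returns False where unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. Windows or macOS
    os.sched_setaffinity(pid, cpus)
    return True


# 32 logical CPUs, as on a 16-core, 32-thread chip, shared by two workloads.
render_cpus, benchmark_cpus = split_cpus(range(32), parts=2)
print("Workload A CPUs:", sorted(render_cpus))
print("Workload B CPUs:", sorted(benchmark_cpus))

# Example (not executed here): pin_process(some_pid, render_cpus)
```

Pinning removes the scheduler as a variable, which helps repeatability, but it also stops one workload from borrowing idle cores from the other.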
Note that in the chart below, the Blender and Cinebench results have opposite scales. For the Blender test (in blue), a shorter bar is a faster and better score. For the Cinebench test (in red), a longer bar is a faster and better score.
Threadripper 1950X, with 32 threads at its disposal, finished the Blender render about 19 percent faster than the Core i9-7900X. In CineBench R15 it was 46 percent faster than the Core i9-7900X.
Do people really buy $1,000 mega-core CPUs exclusively to play games? Probably not, but how well each CPU performs in gaming benchmarks is still an important metric for many (fortunately or unfortunately, depending on your point of view).
3DMark FireStrike Performance
First up is the venerable 3DMark FireStrike test. This is a bit old and mostly a GPU test, but the overall score factors in CPU performance too. The overall winner is the 10-core Core i7-6950X (there’s the reason it cost $1,723). A close second is the Threadripper 1950X in Creator Mode. We did flip the switch for Game Mode and performance dropped. Why? Remember, Game Mode tells Windows it has access to only eight cores, so 3DMark uses only eight of them.
Drilling down into the Physics test results, you can see the direct impact of Game Mode. With all cores used by Windows, Threadripper pulls out in front. With half of them off in the OS, it’s just slightly faster than an 8-core Ryzen 7 1800X.
This isn’t a knock on AMD’s Game Mode, but clearly, for games that really need more CPU cores, set it to Creator Mode instead.
Tomb Raider Performance
Moving on to a real game, Tomb Raider with a GTX 1080 at 1080p and set to Ultimate really doesn’t care about the CPU, as it’s purely a GPU test. Why would we say that? If you look at the results of the elderly FX-8370 CPU, which was a dog in every single test, it’s humming nearly as well as the $1,000 CPUs.
Tom Clancy’s Rainbow Six Siege Performance
Moving on to something newer, we used Tom Clancy’s Rainbow Six Siege at 1920x1080 resolution and the medium-quality setting to make it less about the GPU.
The Threadripper 1950X gives us the familiar pattern we’ve seen in Ryzen-Core face-offs in the past: a difference of 10 percent or more in Intel’s favor. Considering that we’ve seen 15 to 20 percent in some games in the past at low resolution, this isn’t bad. But can Game Mode make a difference?
Yes. With Game Mode activated, the gap between the top-performing Core i7-6950X and Threadripper closes to about five percent. That's also right on the heels of the Core i9-7900X, which likely takes a performance hit from its mesh interconnect.
Rise of the Tomb Raider Performance
Just as in Rainbow Six, the Threadripper 1950X’s performance in Rise of the Tomb Raider is a bit underwhelming in Creator Mode, but flip on Game Mode and it’s a ball game. Well, at least it’s in the ball game and trying.
Ashes of the Singularity: Escalation Performance
Our last gaming test is Ashes of the Singularity: Escalation, which is the poster child for how to use a CPU in gaming. Unfortunately for AMD, it’s all about Intel here: The Core i9-7900X has a commanding lead over Threadripper 1950X. Game Mode helps inch the Threadripper 1950X closer, but Intel still wins by 12 percent.
Why? Some of it is pure clock speed differences, but we expected this to be closer, especially considering that the developer of Ashes was one of the first to optimize for Ryzen earlier this year. However, game optimization has not yet proven to be the cure-all AMD promised. In the grand scheme of things, this isn’t a big deal, but it’ll vex AMD fans.
In this short-attention-span world, you’re probably looking for the quick answer. Unfortunately, the real answer has three parts.
The first is single-threaded or very lightly-threaded use, such as most photo-editing applications. In that category, the more spry quad-cores outpace Threadripper 1950X, though its relatively high clock speeds keep it very much in contention.
The second is gaming, where we see the familiar deficit from previous Ryzen launches. AMD argues that in social gaming, such as streaming and recording while gaming, more cores are better—and we’d tend to agree. But in conventional gaming, Intel leads. The good news is the new Game Mode can help close that gap to the point that it doesn’t even matter.
That brings us to the last category: multi-threaded performance. In every single multi-threaded test we ran (including multi-tasking multi-threaded tests), Threadripper 1950X outpaced all comers by significant margins. It simply destroys any 8-core CPU and makes you question how the 10-core Core i9-7900X can dare to be priced the same as the Threadripper 1950X.
This last point is very much the entire reason for Threadripper 1950X’s existence. Frankly, no one should buy a $1,000, 16-core CPU just for conventional gaming or lightly-threaded applications. It’s the wrong tool for the job.
You buy a 16-core CPU for work. Real work. Real work means modelling, encoding, and doing five things simultaneously, because it’s work.
For that, Threadripper 1950X is an incredible breakthrough in performance and cost. Just four years ago, consumers paid $1,000 to get a 6-core CPU. Today, the same $1,000 gets you 16 cores. That’s something to be applauded loudly by anyone who cares about performance.
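The value argument comes down to simple division. This sketch computes dollars per core using the round $1,000 price points quoted in this review. The 6-core entry stands for the unnamed chip from four years ago mentioned above, and the helper name is invented for illustration.

```python
def dollars_per_core(price: float, cores: int) -> float:
    """Price divided evenly across physical cores."""
    return price / cores


# (label, price in dollars, core count), all taken from figures in the text.
chips = [
    ("6-core CPU, four years ago", 1000, 6),
    ("10-core Core i9-7900X", 1000, 10),
    ("16-core Threadripper 1950X", 1000, 16),
]

for label, price, cores in chips:
    print(f"{label}: ${dollars_per_core(price, cores):.2f} per core")
```

At the same price, the per-core cost falls from roughly $167 to $62.50, and the 10-core Core i9 sits at $100 per core in between.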