Tock, Tock, Tock, Zen

Clean sheet vs evolutionary processor designs

Sep 13, 2024

A legacy code base can constrain a firm’s business. In hardware, legacy designs can also hamper a firm’s ability to innovate and compete. In this post, we’ll discuss clean sheet vs evolutionary designs from AMD and Intel and the opportunities for new ‘fresh start’ designs offered by Arm and more recently RISC-V.

When discussing computer architectures, we naturally focus on key characteristics that matter to the user, particularly speed, power consumption, and cost. Over time, as semiconductor manufacturing has advanced, smaller, more densely packed transistors have permitted additional features that improve performance: more cores, enhanced instruction sets, bigger caches, better branch prediction, and so on. All while keeping power consumption and costs acceptable.

But these extra features come with a price for architecture designers: extra complexity.

Modern CPUs use a lot of transistors. Take AMD’s Zen 4 core, which was released in September 2022.

Zen 4 Compute Die courtesy of Fritzchens Fritz https://www.flickr.com/photos/130561288@N04/53028132976/

In a recent Substack post, Casey Muratori estimated the size of a Zen 4 core to be around 3.8 mm². Based on the smallest publicly available transistor density for TSMC’s N5 process, which is used to build Zen 4, that means more than 520 million transistors per core. For comparison, Intel’s Pentium Pro (1995) had just 5.5 million transistors.
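The arithmetic behind that estimate is simply area multiplied by density. Here is a minimal sketch in Python. The density figure of roughly 138 million transistors per mm² is an assumption on my part (a commonly quoted number for N5, not one given in this post), and the function name is purely illustrative.

```python
# Back-of-envelope estimate: transistors per core = core area x transistor density.
CORE_AREA_MM2 = 3.8            # Zen 4 core area estimate quoted in the post
DENSITY_MTR_PER_MM2 = 138.0    # ASSUMPTION: approximate N5 density, not from the post
PENTIUM_PRO_TRANSISTORS = 5.5e6  # Pentium Pro (1995), from the post


def estimate_transistors(area_mm2, density_mtr_per_mm2):
    """Return an estimated transistor count for a given area and density."""
    return area_mm2 * density_mtr_per_mm2 * 1e6


zen4 = estimate_transistors(CORE_AREA_MM2, DENSITY_MTR_PER_MM2)
ratio = zen4 / PENTIUM_PRO_TRANSISTORS

print(f"Zen 4 core: ~{zen4 / 1e6:.0f} million transistors")
print(f"Roughly {ratio:.0f}x the Pentium Pro")
```

With these inputs the estimate lands a little above 520 million, or somewhere in the region of a hundred Pentium Pros per Zen 4 core. A different density assumption moves the result proportionally.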

This raises our first question: what are all those extra transistors used for?

The first part of the answer is that some things are just bigger. The Zen 4 has 32 KB instruction and data caches at Level 1 compared to 8 KB in the Pentium Pro. The Level 2 cache is also bigger, and the Zen 4 has 4 MB of Level 3 cache per core, while the Pentium Pro has none.

However, a large portion of the increase represents additional complexity. The Pentium Pro’s instruction set was already substantial in 1995 and the x86 ISA has continued to grow. We can list a few of the additions that have been made to the x86 ISA since then (with the date and the name of the design where each was first used):

64-Bit Instructions: x86-64 (AMD’s Opteron 2003)
Advanced Vector Extensions: AVX (Intel’s Sandy Bridge 2011)
Advanced Vector Extensions: AVX2 (Intel’s Haswell 2013)
Bit Manipulation: BMI1 and BMI2 (Intel’s Haswell 2013)
Transactional Synchronization Extensions: TSX (Intel’s Haswell 2013)
Advanced Vector Extensions: AVX-512 (Intel’s Knights Landing 2016)
… and there is a lot more!
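
If you want to see which of the extensions listed above your own machine reports, here is a small Python sketch. It assumes a Linux system on x86, where the kernel lists feature flags in /proc/cpuinfo. The mapping from extension to flag name uses the kernel's names (`lm` for 64-bit long mode, `rtm` for the restricted transactional memory part of TSX, `avx512f` for the AVX-512 foundation subset), and the helper function names are my own.

```python
from pathlib import Path

# Linux /proc/cpuinfo flag names for the extensions listed above.
EXTENSION_FLAGS = {
    "x86-64 (long mode)": "lm",
    "AVX": "avx",
    "AVX2": "avx2",
    "BMI1": "bmi1",
    "BMI2": "bmi2",
    "TSX (RTM)": "rtm",
    "AVX-512 Foundation": "avx512f",
}


def parse_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported_extensions(cpuinfo_text):
    """Map each extension name to True/False depending on the reported flags."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in EXTENSION_FLAGS.items()}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, present in supported_extensions(path.read_text()).items():
            print(f"{name:20} {'yes' if present else 'no'}")
    else:
        print("No /proc/cpuinfo here; this sketch only works on Linux.")
```

This only covers a handful of flags. A real flags line typically contains well over a hundred entries, which is itself a fair illustration of how much the ISA has grown.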

Zen 4 includes all this and retains backward compatibility with almost all of the features and quirks of earlier x86 designs, going back to the original 8086 of 1978.

As a side note, with two firms constantly adding new instructions to the x86 ISA, it must be hard for the designers at each firm to keep in step with what is happening at the other. This contrasts with Arm, where a single firm controls the ISA, and with RISC-V, where the instruction set is developed in the open.

So how does AMD keep the cost of designing a CPU architecture like the Zen 4 manageable?

The Chip Letter
