Apple's Mac Transitions : 68k to PowerPC to Intel to Apple Silicon
Why and how Apple has changed Mac processor architectures three times
Whether a platform can successfully transition from one processor architecture to another can determine whether it survives. Apple has succeeded in making this CPU transition for the Mac three times now. First, from Motorola 68k to PowerPC. Then from PowerPC to Intel. And most recently from Intel to Apple Silicon. These transitions have enabled the Mac not only to survive, but to thrive.
This post looks at why and how Apple has changed architectures, and considers what has made these transitions successful.
We’ll also get an insight into how Apple has changed over the years, from the slightly chaotic multiple projects that were launched to try to move away from Motorola 68k during the 1980s, to the smooth shift to Apple Silicon.
This is a live topic as Qualcomm and Microsoft again try to make Windows on Arm a success, and as Google and others start work on supporting the RISC-V architecture on Android. In the world of AI, many firms are trying to adapt key tools so that they run on multiple architectures. The fate of many firms, and billions of dollars, rests on the success or failure of these projects.
So let’s look at the story of Apple’s Macintosh processor transitions and see what we can learn. We’ll start with the earliest days of the Macintosh.
Motorola 68k Era
The development of the Macintosh (as it was known then) started in 1979 and the first Macintosh was launched in January 1984. The original model used a 16/32-bit Motorola 68000 CPU.
Over the following years, new models were developed as more advanced chips in the 68000 family became available. The Macintosh II was launched in March 1987 with a Motorola 68020, paired with a 68881 floating-point coprocessor. This was followed by the Macintosh IIx in September 1988, with a Motorola 68030.
As early as 1986, Apple's management could see the performance advantages of emerging RISC processors. What followed was a succession of projects as Apple’s management tried to choose an architecture to replace the Motorola 68k series.
We’ve seen in The First Apple Silicon – The Aquarius Processor Project (supplemental post here) how Apple first launched a project to develop a RISC processor in-house. The project’s Scorpius specification was highly ambitious, with features such as multicore support and ‘Single Instruction Multiple Data’ instructions. But, despite using a Cray supercomputer to help design it, the Aquarius team couldn’t deliver a working design and the project ended in 1989.
So Apple continued to use Motorola 68k series CPUs. The Macintosh Quadra, introduced in October 1991, used a Motorola 68040 CPU. The upgrade path with Motorola wasn’t entirely straightforward though. The Quadra had problems running some software because the 68040 had separate data and instruction caches. This could cause problems for the self-modifying code used by some older Macintosh programs: a program could rewrite one of its own instructions through the data side while the processor carried on executing a stale copy held in the instruction cache.
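To see why separate caches can trip up self-modifying code, here is a deliberately simplified toy model in Python. The class and method names are invented for this sketch, and it ignores everything about the real 68040 caches except the one property that matters here: a write made through the data side is not automatically seen by the instruction side.

```python
class SplitCacheCPU:
    """Toy model: instruction fetches go through their own cache."""

    def __init__(self, memory):
        self.memory = list(memory)   # main memory, one "instruction" per slot
        self.icache = {}             # instruction cache: address -> cached copy

    def fetch(self, address):
        # Instruction side: fill the cache on a miss, then use the cached copy.
        if address not in self.icache:
            self.icache[address] = self.memory[address]
        return self.icache[address]

    def store(self, address, value):
        # Data side: updates memory but does not touch the instruction cache.
        self.memory[address] = value

    def flush_icache(self):
        # What self-modifying code has to request explicitly on such a CPU.
        self.icache.clear()


cpu = SplitCacheCPU(["old instruction"])
first = cpu.fetch(0)                  # caches the original instruction
cpu.store(0, "patched instruction")   # the program rewrites its own code
stale = cpu.fetch(0)                  # still the old copy: this is the bug
cpu.flush_icache()
fresh = cpu.fetch(0)                  # the patched instruction is now seen
```

Older programs written for CPUs without this split never made the equivalent of the `flush_icache()` call, which is roughly why they could misbehave.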
Jaguar and Cognac
Meanwhile, the search for a RISC replacement for the 68k line of processors continued. As we’ve seen in The RISC Wars Part 1 (supplementary post here), almost every semiconductor, mainframe or minicomputer manufacturer had their own RISC designs by the late 1980s. The AMD Am29000, MIPS, SPARC, Intel’s i860 and ARM processors were all in the running as Apple’s 68k replacement.
SPARC, MIPS and i860 were soon rejected. The AMD Am29000 was already being used in an Apple graphics card and some initial work was done on building a system that had both a Motorola 68030 and the AMD Am29000.
By this time Apple was already co-owner of Advanced RISC Machines Limited, in Cambridge, UK, and was planning to incorporate ARM processors into the Newton handhelds.
In the end though, political pressure to maintain good relations with Motorola meant that Motorola’s 88000 RISC series (also known as 88k) became the favoured choice for the 68k replacement.
So project ‘Jaguar’ was started to work on moving the Mac to Motorola 88k.
One crucial issue was software compatibility. A new Mac with an 88k processor would not, by default, be able to run existing third-party 68k Mac software.
The Jaguar project had decided that backwards compatibility wasn't necessary. So another project, codenamed ‘Cognac’, was started by John McHenry (the name was a nod to RISC pioneer John Hennessy, whose surname is also a popular brand of cognac). Cognac was to develop a RISC-based Mac using the 88k, but with backwards compatibility with existing software.
Apple engineer Gary Davidian had already been developing 68k emulators for the various RISC architectures that Apple had been considering. With the choice of the Motorola 88k he focused on building a full emulator that would work on 88k-based systems.
While Davidian was building his emulator, the Cognac hardware team developed an 88k-based version of the Macintosh LC, which would be known as the Macintosh RLC for ‘RISC LC’.
Deal Of The Century - PowerPC
The story would take an unexpected turn in October 1991 when Apple and IBM announced that they would start collaborating across a wide range of projects.
Intel processors were becoming increasingly powerful, Intel-powered PC compatibles were dominating the desktop PC market, and Microsoft’s Windows 3.1 was about to be launched. All these factors posed a threat to both IBM's and Apple's market positions.
So the two rivals teamed up to work on operating systems and other projects. The deal became known as ‘DOTC’ for ‘Deal Of The Century’. As part of this collaboration, Apple would adopt IBM’s forthcoming PowerPC processor architecture in the Mac.
PowerPC would be based on the RISC architecture used in IBM’s RS/6000 workstations. Motorola would also join Apple and IBM, in what became known as the ‘AIM’ alliance.
An Intel Interlude – The Star Trek Project
By 1992, Apple and IBM weren’t the only companies who felt threatened by the increasing dominance of IBM PC compatibles, and of Microsoft and Intel.
Novell’s market-leading Netware PC networking software was set to be challenged by the planned launch of Windows NT. So when Novell approached Apple about porting Apple’s Macintosh System 7.1 operating system to Intel processors, the two firms embarked on another collaboration.
The project, codenamed ‘Star Trek’, started on St. Valentine's Day 1992. The project’s engineers were soon able to demonstrate a standard Intel 486-based PC booting Mac OS and running the Mac’s ‘Finder’ file browser.
But the Intel Star Trek project would be shut down in 1993 and the majority of the engineering staff moved to work on the PowerPC project.
Motorola 68k to PowerPC
With the new alliance with IBM, the Cognac team had to switch from 88k to PowerPC. The project was renamed ‘PDM’ for ‘Piltdown Man’ as it would form the ‘missing link’ between 68k and PowerPC Macs. ‘Piltdown Man’ had been the (later proven to be fraudulent) early 20th century attempt to present fossil evidence for the link between apes and man.
The Jaguar project also switched from 88k to PowerPC and was renamed Tesseract. PDM and Tesseract were now both building competing PowerPC systems.
With the switch from Motorola 88k to PowerPC, Gary Davidian had turned his attention to converting his 68k emulator for the 88k to one that would instead run on the new PowerPC architecture.
The first version of the emulator worked by examining each 68k instruction in turn. It would then execute a series of PowerPC instructions that would have the same effect, effectively an ‘interpreter’ for 68k instructions. This emulator had some limitations: it wouldn’t translate Motorola Floating Point hardware instructions or certain memory management functions of the Motorola CPUs.
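The ‘one instruction at a time’ approach can be illustrated with a very small sketch. This is not Davidian's emulator: the four-instruction guest language below is invented for illustration and bears no relation to real 68k encodings, and Python stands in for the PowerPC host code.

```python
def interpret(program, regs):
    """Run a toy guest program by examining each instruction in turn."""
    pc = 0
    while pc < len(program):
        op, a, b = program[pc]              # fetch and decode one guest instruction
        if op == "loadi":                   # a = constant b
            regs[a] = b
        elif op == "add":                   # a = a + b (both registers)
            regs[a] = regs[a] + regs[b]
        elif op == "subi":                  # a = a - constant b
            regs[a] = regs[a] - b
        elif op == "jnz":                   # jump to index b if register a is non-zero
            if regs[a] != 0:
                pc = b
                continue
        else:
            raise ValueError(f"unimplemented guest instruction: {op}")
        pc += 1
    return regs


# Sum the numbers 5, 4, 3, 2, 1 into d0.
program = [
    ("loadi", "d0", 0),
    ("loadi", "d1", 5),
    ("add", "d0", "d1"),    # index 2: start of the loop
    ("subi", "d1", 1),
    ("jnz", "d1", 2),
]
result = interpret(program, {})
```

Note that the decode step (the chain of `if` tests) is repeated every time the loop body runs, even though the instructions never change. That repeated work is the main cost of a pure interpreter.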
For more on the story of Gary Davidian and his 68k emulator there is an extensive Oral History from the Computer History Museum. Part 2 of the interview goes through the work to implement a replacement architecture in lots of detail, including both the hardware and software.
The CHM also has a short article with pictures of some of the hardware prototypes used including the ‘Smurf’ prototype PowerPC cards that were used to test Davidian’s 68k emulator.
The first PowerPC 601 chip based machines were shipped to the Piltdown Man team in late 1992, and just a few weeks later a PowerPC version of Mac OS was booted. Tesseract was cancelled in March 1993 and PDM became the sole PowerPC project. In March 1994 the first PowerPC based Macs, the 6100, 7100 and 8100 series, were announced in New York.
The Motorola 68k to PowerPC emulator was shipped with all versions of Mac OS Classic for PowerPC. Support for emulation only ended with the transition from Mac OS Classic to OS X.
How successful was the transition? The emulator seems to have worked well enough to support a smooth transition, but the approach used wasn't the fastest. Later versions of the software shipped with later Macs were sped up by using dynamic recompilation. In this approach blocks of code were translated from 68k to PowerPC instructions before being run, a faster approach than the original ‘instruction by instruction’ technique.
Indeed one commercial provider, Connectix, stepped in to offer a faster alternative to Apple’s original emulator. The Macworld review of Connectix Speed Doubler summarises how it worked:
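A rough sketch of the dynamic recompilation idea follows, under heavy assumptions. The guest instruction format is invented for illustration, and Python closures stand in for generated PowerPC code. What it does show is the key step: each block is translated once, kept in a cache, and reused every time execution returns to it.

```python
def make_step(op, a, b):
    """Return a function standing in for the host code of one guest instruction."""
    if op == "loadi":
        def step(regs): regs[a] = b
    elif op == "add":
        def step(regs): regs[a] = regs[a] + regs[b]
    elif op == "subi":
        def step(regs): regs[a] = regs[a] - b
    else:
        raise ValueError(f"unimplemented guest instruction: {op}")
    return step


def translate_block(program, start):
    """Translate guest code from `start` up to and including the next branch."""
    steps = []
    pc = start
    while pc < len(program):
        op, a, b = program[pc]
        pc += 1
        if op == "jnz":                      # a branch ends the block
            target, fallthrough = b, pc
            def block(regs):
                for step in steps:
                    step(regs)
                return target if regs[a] != 0 else fallthrough
            return block
        steps.append(make_step(op, a, b))
    end = pc                                 # ran off the end of the program
    def block(regs):
        for step in steps:
            step(regs)
        return end
    return block


def run(program, regs):
    cache = {}                               # guest address -> translated block
    translations = 0
    pc = 0
    while pc < len(program):
        if pc not in cache:                  # translate only on first use
            cache[pc] = translate_block(program, pc)
            translations += 1
        pc = cache[pc](regs)                 # run the block, get the next address
    return regs, translations


program = [
    ("loadi", "d0", 0),
    ("loadi", "d1", 5),
    ("add", "d0", "d1"),    # index 2: start of the loop
    ("subi", "d1", 1),
    ("jnz", "d1", 2),
]
regs, translations = run(program, {})
```

Here the loop body executes several times but is translated only once, so `translations` ends up much smaller than the number of blocks executed.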
Connectix Speed Doubler is a collection of three system extensions designed to make your Mac run faster. The centerpiece of the collection, Speed Emulator, works only on Power Macs. This component replaces Apple's 680X0 emulator … with one that runs more efficiently. The Apple emulator in the original Power Macs performed at IIci to Quadra level. Speed Emulator improves that performance with an emulation scheme similar to the one used on the newest Power Macs (such as the 7500, 8500, and 9500), called dynamic recompilation. This emulation technique keeps translated 680X0 code in RAM, so the Mac can use the code again and again without translating it each time, thus improving overall performance. The original Apple emulator translated the code each time it was used, and did not keep it in RAM.
PowerPC to Intel - Too Much Power
The switch from 68k to PowerPC was widely considered a success. For most of the 1990s, the performance of successive generations of desktop PowerPC CPUs competed successfully with contemporary Intel designs.
But the early and mid-1990s were a turbulent period for Apple, featuring crisis after crisis, ultimately leading to the acquisition of Steve Jobs’s NeXT. With Jobs’s return came investment from Microsoft, the launch of the iMac, the transition of the Mac’s operating system to OS X, the success of the iPod and a return to financial stability.
By the early 2000s, though, Jobs and Apple’s management faced another problem with the PowerPC range. Consistent with their naming, the chips were powerful, but they also consumed a lot of power. Laptops were becoming an increasingly vital part of the Mac range, and the poor power consumption of the PowerPC range was holding them back. IBM, with its narrower customer base, couldn’t afford to match Intel’s level of investment, and so couldn’t deliver the more power-efficient PowerPC processors that Apple needed.
So after years of publicly disparaging Intel CPUs, Jobs announced at Apple’s Worldwide Developers Conference in 2005 that Apple would be switching from PowerPC to Intel. The announcement is a Jobs presentation masterclass and features a cameo appearance from Intel’s CEO, Paul Otellini.
Jobs joked, whilst displaying a picture of the Pope, that Apple had consulted the highest authorities in their quest to squeeze the PowerPC G5 CPU into a laptop. Now, with Intel, they could put the more powerful dual-core Intel Core Duo CPU into the new MacBook Pro laptop.
Although the relationship between Apple and Intel was outwardly successful, there were some difficulties from the earliest days. The first Intel Macs were 32-bit ‘Core’ systems apparently due to Intel’s delays in delivering the 64-bit ‘Core 2’ processors. This meant that Apple had to ship 32-bit and 64-bit versions of the OS X operating system in quick succession.
In one sense, Apple did three architecture transitions at about the same time: first from PowerPC 32-bit to PowerPC 64-bit (only applicable to G5-based Macs such as the Power Mac G5). Then to Intel 32-bit. And finally, to Intel 64-bit.
During the transition period, developers needed to ship applications containing code for both PowerPC and Intel (potentially four versions of the code: PowerPC 32-bit, PowerPC 64-bit, Intel 32-bit and Intel 64-bit). The file that bundled these different versions together was known as a ‘Universal Binary’.
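To make the idea of one file holding several versions concrete, here is a minimal Python sketch that lists the architecture ‘slices’ in a universal file. It assumes the classic 32-bit ‘fat’ header layout described in Apple's `mach-o/fat.h` (a big-endian magic number and slice count, followed by one 20-byte record per slice). The function name `list_slices` is my own, and the sketch does no validation beyond checking the magic number.

```python
import struct

FAT_MAGIC = 0xCAFEBABE            # marks a 32-bit fat header
CPU_ARCH_ABI64 = 0x01000000       # flag bit added to a CPU type for its 64-bit variant
CPU_NAMES = {
    7: "i386",
    7 | CPU_ARCH_ABI64: "x86_64",
    18: "ppc",
    18 | CPU_ARCH_ABI64: "ppc64",
    12 | CPU_ARCH_ABI64: "arm64",
}


def list_slices(data):
    """Return (architecture, offset, size) for each slice in a fat header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat header")
    slices = []
    for i in range(count):
        # Each record: cputype, cpusubtype, offset, size, align (all big-endian).
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i
        )
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices


# A hand-built header describing a PowerPC slice and a 32-bit Intel slice.
example = (
    struct.pack(">II", FAT_MAGIC, 2)
    + struct.pack(">iiIII", 18, 0, 4096, 100, 12)
    + struct.pack(">iiIII", 7, 3, 8192, 200, 12)
)
slices = list_slices(example)
```

At launch the system picks the slice that matches the machine and ignores the rest, which is why a single application file could serve both old and new Macs.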
Rosetta
Once again, backwards compatibility for existing PowerPC based software would be vital. This time, Apple had a catchy name for the software that would enable PowerPC based programs to run on the new Intel Macs. It would be known as Rosetta, after the ‘Rosetta Stone’, the stone tablet with writing in Egyptian hieroglyphic and Demotic scripts and Ancient Greek that had helped scholars to decipher hieroglyphics.
Much of the software used by Mac users was actually produced by Apple and bundled with the OS X operating system. All this software was ported to Intel and shipped with the first Intel Macs. This included the Safari web browser, Apple’s Mail email client and multimedia applications such as QuickTime, iMovie, GarageBand, Logic and Final Cut.
Third-party packages would have to wait for their developers to convert from PowerPC to Intel. Among the most important would be the Microsoft Office suite, and Photoshop and other professional tools from Adobe. These were key tools for many users, and so the compatibility and performance of Rosetta would be crucial.
Rosetta was actually Apple’s name for their adaptation of ‘QuickTransit’ technology licensed from startup Transitive. Although Transitive was based in Los Gatos in California, the engineering team was based in Manchester in the UK. They used technology originally developed at the University of Manchester and which was originally known as ‘Dynamite’.
Dynamite, and hence QuickTransit and Rosetta, were, like the later 68k to PowerPC emulators, dynamic recompilation systems. That is, they translated code at run time: a short sequence of code in the original architecture was translated only when the running program was about to execute it. These systems were also relatively complex, with translation to an intermediate language and then finally translation to the target architecture.
The main benefit of this approach was fast start-up times, because little had to be translated before the program began to run. When a block of code was executed more often than a given threshold, it could be retranslated with more optimisations to give better performance.
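The threshold idea can be sketched in a few lines of Python. Everything here is hypothetical: the class name, the two translator callbacks and the threshold value are invented for illustration and are not taken from QuickTransit or Rosetta.

```python
class TieredTranslator:
    """Translate quickly at first, then re-translate hot blocks with more care."""

    def __init__(self, quick, optimise, threshold=3):
        self.quick = quick            # fast translator producing unoptimised code
        self.optimise = optimise      # slower translator producing better code
        self.threshold = threshold    # executions before a block counts as hot
        self.code = {}                # block address -> (tier, translated function)
        self.counts = {}              # block address -> times executed

    def execute(self, address, state):
        count = self.counts.get(address, 0) + 1
        self.counts[address] = count
        entry = self.code.get(address)
        if entry is None:
            # First visit: translate cheaply so the program starts quickly.
            entry = ("quick", self.quick(address))
        if entry[0] == "quick" and count > self.threshold:
            # Hot block: pay for the optimising translation once.
            entry = ("optimised", self.optimise(address))
        self.code[address] = entry
        return entry[1](state)


# Stand-in translators: both produce code with the same behaviour.
translator = TieredTranslator(
    quick=lambda address: (lambda state: state + 1),
    optimise=lambda address: (lambda state: state + 1),
)
tiers = []
for _ in range(5):
    translator.execute(0x1000, 0)
    tiers.append(translator.code[0x1000][0])
```

The design trade-off is that time spent optimising only pays off for code that runs often, so rarely executed blocks stay on the cheap translation.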
Transitive’s other products allowed translation from MIPS to Itanium (used by Silicon Graphics) and SPARC to x86 and Itanium. Transitive would be acquired by IBM in 2008. IBM would use the technology to translate in the opposite direction to Apple, from x86 to PowerPC.
How successful was Rosetta? I think it’s reasonable to give it a B+. It was incompatible with some G5 PowerPC instructions. Moreover, some categories of software could not be translated, including some applications built for Mac OS 9, screen savers and kernel extensions.
Even though early Intel Macs were faster than their PowerPC predecessors, running programs under Rosetta led to poorer performance for some programs when compared to running on a PowerPC Mac. This was particularly the case for compute intensive applications such as games or Photoshop.
However, Rosetta was certainly ‘good enough’. Apple completed the transition to Intel successfully by 2008 and Rosetta was dropped from Mac OS X Lion in 2011.
Intel to Apple Silicon
The most recent Mac transition was also about power, in two senses.
In 2020, Tim Cook announced, at Apple’s Worldwide Developers Conference, that the Mac line would transition to ‘Apple Silicon’. Apple Silicon would be a series of ‘Systems-on-Chip’ (SoCs) designed by Apple’s own in-house silicon design team. The CPUs would use the Arm Instruction Set Architecture but be designed entirely in-house by the Apple team under an Arm ‘architecture’ licence.
The design team had already proven their capabilities by designing several generations of SoCs for both the iPhone and the iPad.
The first Apple Silicon Macs, the MacBook Air, Mac Mini and a MacBook Pro, all based on the M1 SoC, would appear in November 2020. The M1 would be followed by the higher performance M1 Pro, M1 Max and M1 Ultra, and then by the M2 and M2 Pro. By June 2023 the transition was complete, with the announcement of an M2 Ultra SoC in the Mac Pro.
Why was the change again about ‘power’? First, it was about power consumption. The first M1-based Mac laptops had markedly better power consumption and battery life than their Intel predecessors. In part, this was due to the use of a more advanced TSMC manufacturing process. By the late 2010s, Intel was stumbling as it attempted to move to a 10 nm manufacturing process. The M1 in the first Apple Silicon Macs used a TSMC 5 nm manufacturing process that led to markedly better power efficiency than contemporary Intel designs.
There were also aspects of the architecture itself that led to better power efficiency. The M1 and successors have adopted Arm’s ‘big.LITTLE’ approach of combining bigger, more powerful cores and smaller, more efficient cores. This provided both performance, when needed, and power efficiency.
Second, it was about Apple’s ‘power’ over the Mac platform. By controlling the development of the processor, Apple extended their vertical integration, allowing them to include features in the SoC that closely fitted what they required, rather than what was available from Intel.
For a transition period, Apple would again require developers to supply applications with code compatible with both x86-64 and ARM64 instruction sets in a format known as ‘Universal 2’.
Rosetta 2
Once again, Apple ensured that Apple Silicon-based Macs could run existing Mac Intel software, using a system that was branded as ‘Rosetta 2’.
Although Rosetta 2 adopted similar naming to the software used in the PowerPC to Intel transition, the approach used was very different.
This time, the system normally uses ‘Ahead-Of-Time’ translation from x86-64 to Arm. The entire x86-64 program is translated into an ARM binary when the program first launches, and this translated program is then run. Each x86-64 instruction is translated to one or more ARM64 instructions.
There are a few areas where the translated code gets some help from Apple’s hardware.
Floating Point
x86 and Arm both implement the IEEE-754 floating point standard. However, there are some differences in how the two architectures deal with NaN (‘Not a Number’) values and some aspects of rounding. To replicate this behaviour precisely in software would potentially slow down floating point performance.
Later versions of ARMv8 have an extension that allows the architecture to mimic the x86-64 approach, but the M1 predates this. So Apple incorporated a non-standard floating-point implementation in the M1, to help replicate x86.
Parity and Adjust Flags
The other area where Apple Silicon departs from the Arm standard is in the handling of flags. Flags are bits that are set depending on the result of an instruction. x86 has two flags that don’t map to standard Arm flags: PF for ‘parity’ and AF for ‘adjust’, each of which is affected by common x86 instructions such as integer addition and register comparisons. Using software to set these flags based on the results of these common instructions would be very time consuming.
The M1 has a non-standard instruction set extension that helps deal with this. It computes the values that x86 would store in PF and AF when executing ARM64 instructions like addition, subtraction, and compare. It then stores these values in bits 26 and 27 of one of the M1’s ‘flags registers’. The translated code then accesses these bits, if the corresponding x86 code needed to read the PF or AF flags.
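To give a feel for the extra work involved if these flags had to be computed in software, here is a small Python sketch of what x86 defines PF and AF to be after an addition. PF is set when the low byte of the result contains an even number of 1 bits, and AF is set when there is a carry out of bit 3. The function name is my own, and this illustrates the flag definitions only, not Rosetta 2's actual code.

```python
def x86_add_flags(a, b, bits=32):
    """Return (result, PF, AF) as x86 would define them for a + b."""
    mask = (1 << bits) - 1
    result = (a + b) & mask
    # PF: 1 if the low 8 bits of the result have an even number of 1 bits.
    pf = 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0
    # AF: carry out of bit 3 into bit 4, recovered from the operands and result.
    af = ((a ^ b ^ result) >> 4) & 1
    return result, pf, af


carry_case = x86_add_flags(0x0F, 0x01)   # 0x0F + 1 carries out of the low nibble
plain_case = x86_add_flags(1, 2)         # 3 has two 1 bits, no nibble carry
```

Doing the equivalent of these few extra operations after every translated add, subtract and compare would add up quickly, which is why having the hardware produce the two bits as a side effect is such a help.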
If ‘Ahead-Of-Time’ translation doesn’t work, for example for code that itself uses ‘Just-in-time’ compilation, then Rosetta 2 has the fall-back of using dynamic translation, just like the original Rosetta software.
For more on Rosetta 2, Dougall Johnson has written a terrific post on how Rosetta 2 works.
How successful has Rosetta 2 been? Quoting Dougall Johnson:
Rosetta 2 is remarkably fast when compared to other x86-on-ARM emulators.
Why Has Apple Been So Successful?
We’ve seen very different approaches to emulation in each of the three major Mac architecture transitions. One instruction at a time for the original Motorola 68k to PowerPC, ‘Dynamic’ translation for PowerPC to Intel, and ‘Ahead-Of-Time’ translation for Intel to Apple Silicon. Apple has gone from using a small in-house team to licensing software to an in-house team again.
There are some common themes that have helped Apple across all the transitions:
Performance : The new architecture has, in each case, offered performance that is markedly better than was the case for the architecture being replaced. This has enabled any performance penalty from the translation to be substantially or completely mitigated.
Vertical Integration : Some key performance critical applications – the Safari web browser, for example – are developed by Apple, so that users will never need to use an emulated or translated version.
Backwards Compatibility (or lack of) : Apple has not maintained a policy of backwards compatibility. The company has been content to remove support for older applications. This has made the age and number of legacy applications that need to be supported smaller than would be the case for Windows, for example.
Developer Expectations : Consistent with this, Apple expects developers to ship new versions of their software with executables that work on both the old and new architectures.
And some points on the Intel to Apple Silicon transition:
Architecture Matters : The nature of the ARM64 architecture, when compared to x86, has made translation easier. For example, ARM64 has more integer registers than x86-64, allowing it to keep the translated state of x86-64 registers completely in ARM64 registers.
Architectural Flexibility : Apple has been able to add instructions to the standard Arm architecture to enable more efficient translation of x86 instructions. Almost complete control of the hardware stack has given Apple an advantage over firms reliant on, for example, Arm’s own designs.
It’s hard not to be impressed by the way in which Apple has been able to switch the Mac between processor architectures. These switches have been undertaken with limited disruption for users. In turn, these transitions have enabled the Mac to survive and prosper over almost four decades.
References and Further Reading
68k to PowerPC
Computer History Museum on Gary Davidian and the 68k emulator
Gary Davidian Oral History
Part 1 - Text : https://www.computerhistory.org/collections/catalog/102740489
Part 2 - Text : https://www.computerhistory.org/collections/catalog/102781079
Part 1 - Video : https://www.computerhistory.org/collections/catalog/102740488
Part 2 - Video : https://www.computerhistory.org/collections/catalog/102781078
PowerPC to Intel
Transitive
https://www.cnet.com/tech/services-and-software/the-brains-behind-apples-rosetta-transitive/
Intel to Apple Silicon
Rosetta 2
https://dougallj.wordpress.com/2022/11/09/why-is-rosetta-2-fast/
Very good article! It’s impressive how Apple has managed these transitions. One minor correction: “The first Apple Silicon Macs...based on the M1 SoC would appear in November 1990.” should read 2020!
I like how this article is about power consumption, and that Moore's Law hasn't been able to progress without considering power management. https://semiengineering.com/a-power-first-approach/ makes some interesting predictions. I think power-focused design will not only change the expectations of hardware, but allow new types of hardware that isolate compute from the user interface- not just in the cloud, but locally too.