SPUR - RISC IV: The LISP Multiprocessor Workstation
The hugely ambitious 'Symbolic Processing Using RISC' project from UC Berkeley
Ten superminicomputer processors for your desk will come packaged as a VLSI workstation called SPUR
Today’s post completes our series on the (numbered) UC Berkeley RISC processors with the most ambitious Berkeley RISC project. Previously in this series, we’ve looked at the first VLSI RISC microprocessor, known (retrospectively) as RISC I, and its successor RISC II.
These designs took ideas from John Cocke and the team at IBM who had created the IBM 801 minicomputer and added some further innovations, as well as shrinking the designs onto a single chip.
We then saw how the next UC Berkeley project in the series, known at the time as SOAR for “Smalltalk On A RISC”, used the RISC concept to develop a microprocessor specialized for running the Smalltalk programming language.
The next project in the sequence - designated later as RISC-IV - was known as SPUR for “Symbolic Processing Using RISC”. The SPUR project combined a processor design specialized for a high-level language, Common LISP, with a workstation built around multiple processors (a “multiprocessing” system). If this wasn’t ambitious enough, SPUR added hardware support for the relatively new IEEE-754 standard for floating-point arithmetic. SPUR was a multi-year project that ran in the second half of the 1980s.
Before we consider the SPUR design itself, it’s worth reminding ourselves that all of these Berkeley RISC designs were research projects undertaken by a mix of students and staff at Berkeley. These teams built designs that were not only innovative but also often outperformed comparable commercial designs. How was this possible? I’d suggest that there were two ‘magic ingredients’ in this.
The first ingredient was the immense talent of the teams working on and directing these projects, including David Patterson, who would make his mark in many other areas besides the RISC projects.
The second ingredient was the RISC idea itself. If it was John Cocke at IBM who came up with the basic RISC idea, then the team at UC Berkeley recognized that the RISC concept allowed even small teams to rapidly create powerful and innovative processor designs.
Having seen what the UC Berkeley team could achieve, many other small teams would go on to build their own RISC designs, including the team at Acorn Computers in Cambridge in the UK, who designed and built the first ARM processor.
Two decades after SPUR the same RISC idea would lead to the creation at Berkeley of the latest, and possibly the most important, of the numbered RISC series projects - RISC-V.
SPUR System
This post will focus on the design of the processor at the center of the SPUR system, but first a little about the wider system that this processor was designed to fit into. From the SPUR papers linked at the end of this post:
SPUR (Symbolic Processing Using RISCs) is a multiprocessor workstation being developed at the University of California at Berkeley to conduct parallel processing research. Its development is part of a multi-year effort to study hardware and software issues in multiprocessing, in general, and parallel processing in Lisp, in particular.
First, although parallel processing hardware has existed for many years, these systems have been difficult to program … Consequently, we are designing SPUR to simplify parallel processing software by providing a single global memory that can be shared, with uniform access times. Implementing a high-performance shared memory system increases the system's hardware complexity, but we believe the shared memory software model facilitates the rapid development of parallel processing software …
Second, hardware is more difficult to design, construct, debug, and modify than most software. Consequently, most SPUR hardware features are simple, frequently-used primitives. We migrate features from software into hardware only if doing so achieves a significant performance gain in return for reasonable design and implementation costs. The complex hardware features included in SPUR either facilitate parallel processing (for example, hardware-based cache consistency) or make large contributions to performance …
The SPUR system would have between 6 and 12 processors connected via a common ‘SPUR’ bus to a shared memory and to input and output devices.
The ambition of the SPUR design meant that each processor wouldn't fit onto a single VLSI die, so each processor would consist of three custom VLSI designs and around two hundred other chips.
The custom VLSI designs were the CPU (processor), the FPU (floating point unit), and CC (snooping cache controller):
The processor is 170 mm², contains 115,214 transistors, was fabbed in a 1.6 micron CMOS, and operates at 10 MHz. It contains a RISC processor designed to run Lisp well. The snooping cache controller is 130 mm² in the same technology and contains 68,385 transistors. The floating point unit is 130 mm² in the same technology and contained about 110,000 transistors.
Source where there are also images of these three designs.
SPUR Operating System and LISP
The novelty of SPUR meant that it needed a new operating system: the Unix-like ‘Sprite’:
Sprite is Unix-like but provides simple mechanisms for shared address spaces so that tightly-cooperating processes can make the most of the SPUR hardware. In addition, Sprite provides two important features related to networks: a shared filesystem and process migration.
The expectation was that SPUR would be programmed in LISP, with C as an option:
The SPUR Lisp system is a Common Lisp system that we are extending with primitives for parallel computation. It is built on the SPICE Common Lisp compiler written at CMU. …For the purpose of supporting "conventional" programming, we are also creating a C compiler for SPUR …
… a LISP that was specially adapted for multiprocessing:
We have designed a set of primitive extensions to Common Lisp for parallel processing and we are in the process of implementing them. These provide processes that share memory, mailboxes for synchronization and well-controlled inter-process communication, and Multilisp-like futures. As multiprocessing also raises the issue of garbage collection, we are developing and measuring algorithms for parallel stop-and-copy garbage collection in which collection and mutation do not overlap, and the collection algorithm itself is parallel.
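To give a feel for the futures and mailboxes mentioned in that quote, here is a minimal sketch in Python rather than SPUR Lisp. Threads stand in for SPUR processes, and the names `future`, `touch`, `sum_squares`, `parallel_sum_squares` and `mailbox` are all hypothetical, chosen for illustration only; this is not the SPUR Lisp API.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from queue import Queue

# Hypothetical stand-in for a set of processes sharing one memory.
_pool = ThreadPoolExecutor(max_workers=4)

def future(fn, *args):
    """Start fn(*args) in parallel and return a placeholder at once."""
    return _pool.submit(fn, *args)

def touch(value):
    """Wait for a future's value; pass ordinary values straight through."""
    return value.result() if isinstance(value, Future) else value

def sum_squares(numbers):
    return sum(n * n for n in numbers)

def parallel_sum_squares(numbers):
    mid = len(numbers) // 2
    left = future(sum_squares, numbers[:mid])  # computed by another thread
    right = sum_squares(numbers[mid:])         # computed here in the meantime
    return touch(left) + right                 # block only when the value is needed

# Mailbox: one process posts a message, another waits to receive it.
mailbox = Queue()
future(lambda: mailbox.put(parallel_sum_squares([1, 2, 3, 4])))
received = mailbox.get()  # blocks until the message arrives
```

The point of a Multilisp-style future is that the caller keeps working and only waits at the moment it actually needs the value, which is what `touch` models here.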
SPUR Processor Design
Registers
SPUR carried forward many of the ideas of RISC-I and RISC-II. It had thirty-two addressable 40-bit general-purpose registers, with eight ‘windows’ giving 138 physical registers. The 40 bits consisted of 32 bits of data plus an 8-bit tag. More on these tags later.
New to SPUR were seven 32-bit special registers including the user and kernel processor status words, register window pointers, and several program counters. Also new were fifteen 87-bit floating point registers and a 32-bit floating point ‘status word’.
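How do thirty-two visible registers and eight windows add up to 138 physical registers? The sketch below assumes a RISC-II-style layout: 10 global registers shared by every window, plus 22 windowed registers of which 6 overlap with the next window. That split, and the function name `physical_index`, are assumptions for illustration rather than details taken from the SPUR papers.

```python
# Assumed RISC-II-style split of the 32 visible registers (not from the SPUR papers).
NUM_WINDOWS = 8
VISIBLE = 32
GLOBALS = 10   # r0-r9: the same physical registers in every window
OVERLAP = 6    # registers shared between neighbouring windows
LOCALS = 10    # registers private to one window

UNIQUE_PER_WINDOW = LOCALS + OVERLAP            # 16 new registers per window
RING_SIZE = NUM_WINDOWS * UNIQUE_PER_WINDOW     # 128 windowed registers
PHYSICAL = GLOBALS + RING_SIZE                  # 10 + 8 * 16 = 138

def physical_index(window, reg):
    """Map (window number, visible register number) to a physical register."""
    if not 0 <= window < NUM_WINDOWS:
        raise ValueError("window out of range")
    if not 0 <= reg < VISIBLE:
        raise ValueError("register out of range")
    if reg < GLOBALS:
        return reg  # globals ignore the window pointer
    offset = reg - GLOBALS  # 0..21 within the current window
    # Windows are laid out on a ring, each starting 16 registers after the last,
    # so the top 6 registers of one window are the bottom 6 of the next.
    return GLOBALS + (window * UNIQUE_PER_WINDOW + offset) % RING_SIZE
```

Under these assumptions, registers 26-31 of one window are the same physical registers as 10-15 of the next, so a procedure call can pass arguments just by moving the window pointer, with no copying.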
Memory
SPUR introduced ‘virtual memory’ with separate memory segments for code, the stack, the heap, and the system. The total virtual memory space was 256 Gigabytes (2^38 bytes).
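One way a 32-bit processor can reach a 256 Gigabyte space is sketched below. It assumes that the top two bits of a 32-bit process address select one of four segment registers, and that each register supplies an 8-bit segment number that is joined to the remaining 30-bit offset to form a 38-bit global address. Treat that mapping, the order of names in `SEGMENT_NAMES`, and the function `to_global` as assumptions made for illustration.

```python
OFFSET_BITS = 30          # assumed: low 30 bits are the offset within a segment
SEGMENT_NUMBER_BITS = 8   # assumed: each segment register holds an 8-bit number
GLOBAL_ADDRESS_BITS = OFFSET_BITS + SEGMENT_NUMBER_BITS  # 38 bits = 256 Gigabytes

# Hypothetical ordering of the four segments named in the text.
SEGMENT_NAMES = ("system", "code", "heap", "stack")

def to_global(process_address, segment_registers):
    """Translate a 32-bit process address into a 38-bit global virtual address."""
    if not 0 <= process_address < 2**32:
        raise ValueError("process address must fit in 32 bits")
    selector = process_address >> OFFSET_BITS               # top 2 bits: which segment
    offset = process_address & ((1 << OFFSET_BITS) - 1)     # low 30 bits
    segment_number = segment_registers[selector]
    if not 0 <= segment_number < 2**SEGMENT_NUMBER_BITS:
        raise ValueError("segment number must fit in 8 bits")
    return (segment_number << OFFSET_BITS) | offset

# Example: a process whose four segment registers hold segments 0, 5, 6 and 7.
example = to_global(0x4000_0010, [0, 5, 6, 7])
```

With this scheme two processes can share memory simply by loading the same segment number into one of their segment registers.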
Basic Instruction Set
SPUR’s basic instruction set had much in common with RISC-I and RISC-II. It had a small number of simple instructions that closely followed RISC principles, and these fell into a few distinct groups:
Arithmetic and Logical instructions operating on the contents of registers, with results stored in registers;
Load and Store instructions with a small number of addressing modes to access memory;
Branch & Jump instructions;
Call & Return instructions;
Instructions to read and write special registers.
With the addition of hardware floating point, SPUR added some instructions to perform IEEE-754 floating point on the new floating point registers.
LISP Adaptations
SPUR used the tags in registers that we saw earlier to speed up LISP operations, with an approach that had a lot in common with SOAR:
Tagged architecture. Lisp uses polymorphic functions with operands whose type is not known until run-time.
SPUR handles polymorphic operations by manipulating the 6-bit data-type tags of operands in parallel with operating on the 32-bit data values … Type checking in SPUR, like several other machines, assumes that most arithmetic operands are integers. … If both operands are integers, the instruction finishes by writing the sum into the result register. Otherwise, the register write is suppressed and the instruction traps to software that determines the types of the operands and performs the appropriate form of addition.
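The behaviour described in that quote can be modelled in a few lines. This is only a sketch: the tag values, the `Word` type and the function names are hypothetical, a Python exception stands in for the hardware trap, and overflow handling is left out entirely.

```python
from dataclasses import dataclass

# Hypothetical tag encodings; the real SPUR values are not given in the text.
TAG_FIXNUM = 0
TAG_FLOAT = 1
TAG_CONS = 2

@dataclass(frozen=True)
class Word:
    tag: int       # data-type tag, checked alongside the arithmetic
    data: object   # the 32-bit value (a plain Python number in this model)

class TypeTrap(Exception):
    """Stands in for the hardware trap to a software handler."""

def hardware_add(a, b):
    # Fast path: assume integers. If the assumption fails, no result is written.
    if a.tag == TAG_FIXNUM and b.tag == TAG_FIXNUM:
        return Word(TAG_FIXNUM, a.data + b.data)
    raise TypeTrap("add with a non-integer operand")

def software_add(a, b):
    # Slow path: work out the operand types and pick the right kind of addition.
    numeric = (TAG_FIXNUM, TAG_FLOAT)
    if a.tag in numeric and b.tag in numeric:
        return Word(TAG_FLOAT, float(a.data) + float(b.data))
    raise TypeError("+ applied to something that is not a number")

def generic_add(a, b):
    try:
        return hardware_add(a, b)
    except TypeTrap:
        return software_add(a, b)
```

The design bet is that the integer case is by far the most common, so it costs nothing extra, while the rarer cases pay the price of a trap.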
SPUR also used these tags together with a small number of special LISP instructions to speed up the fundamental LISP list operations like car and cdr (car returns the first item of a list and cdr returns the rest of the list; neither allocates anything, since each just loads one field of a cons cell):
SPUR provides a car/cdr instruction that checks the data-type tag in parallel with the load. A trap is generated if the type of the operand is inappropriate.
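A small model makes the tag-checked load concrete. It assumes a cons cell is two adjacent words in memory, with car in the first and cdr in the second; the tag values, the `memory` list and all function names are hypothetical, and an exception again stands in for the trap.

```python
from dataclasses import dataclass

# Hypothetical tag encodings, for illustration only.
TAG_FIXNUM, TAG_CONS, TAG_NIL = 0, 1, 2

@dataclass(frozen=True)
class Word:
    tag: int
    data: int  # an integer value, or a memory address when tag is TAG_CONS

class TypeTrap(Exception):
    """Stands in for the hardware trap on an inappropriate operand."""

NIL = Word(TAG_NIL, 0)
memory = []  # word-addressed heap

def cons(first, rest):
    """Allocate a two-word cell: this is the only operation here that allocates."""
    address = len(memory)
    memory.extend([first, rest])
    return Word(TAG_CONS, address)

def car(cell):
    if cell.tag != TAG_CONS:          # in hardware, checked in parallel with the load
        raise TypeTrap("car of something that is not a cons cell")
    return memory[cell.data]          # first word of the cell

def cdr(cell):
    if cell.tag != TAG_CONS:
        raise TypeTrap("cdr of something that is not a cons cell")
    return memory[cell.data + 1]      # second word of the cell

# The list (1 2), built from two cons cells.
lst = cons(Word(TAG_FIXNUM, 1), cons(Word(TAG_FIXNUM, 2), NIL))
```

In this model car and cdr differ only in which word of the cell they load. Cases that Common Lisp allows but the fast path rejects, such as taking the car of the empty list, would be left to the trap handler.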
Conclusions
Ten superminicomputer processors for your desk will come packaged as a VLSI workstation called SPUR, once the team at UC Berkeley finds a partner to transfer their upcoming prototype to industry.
Design Decisions in SPUR (1986)
If SPUR looks as good in silicon as it looks in paper, we will hope to find industrial partners to make SPURs available to other interested parties. If you know of any likely candidates, please help us make their acquaintance.
A Progress Report on SPUR (1987)
SPUR was ahead of its time in building a multiprocessor system in the mid-1980s. IBM’s POWER4 processor from 2001 is generally regarded as the first commercially available multicore microprocessor, with Intel and AMD each following around four years later.
Was it too ambitious? The SPUR papers make it clear that the team had the ambition to turn SPUR into a commercial product. However, the mix of multiprocessing, RISC, a new operating system, IEEE-754 floating point, and LISP in a workstation probably made it too niche (and almost certainly too expensive) to be a successful product in the 1980s.
From the perspective of 2024 though, I think that the most appropriate reaction is to marvel at the ambition of the UC Berkeley team, commercially successful or not, and to be equally impressed by how relevant (with the possible exception of LISP) the ideas in SPUR would become decades later.
Further Reading
The SPUR papers are highly readable and strongly recommended for readers who want to know more about the design.
SPUR A VLSI Processor Workstation (1985)
A Progress Report on SPUR (1987)
Design Decisions in SPUR (1986). Unfortunately, paywalled: https://dl.acm.org/doi/abs/10.1109/MC.1986.1663096
You say: "SPUR also used these tags together with a small number of special LISP instructions to speed up the fundamental LISP list operations like car and cdr (car takes the first item of a list and cdr creates a new list with all except the first item):...". Note that cdr doesn't actually create (allocate) anything; it just returns the contents of the "cdr" ("next") field in the cons cell that is its argument. So at the implementation level, car and cdr are almost identical.