Nvidia's Embarrassingly Parallel Success
It's not just AI. The company has a moat around 'embarrassingly parallel computing'.
Disclaimer: For once I have to add that this isn’t investment advice. I don’t hold or intend to hold Nvidia stock!
This is the first of a short series of posts that try to put Nvidia’s recent stock price surge into some sort of historical context. Today, why Nvidia isn’t just about AI and graphics.
It’s a commonplace that Nvidia’s recent ‘overnight’ success has been fifteen years in the making. Nvidia’s stock price rose by 24% in one day after their Q1 2023 earnings, and soon afterwards the company briefly topped one trillion dollars in market capitalisation.
This was because they announced a jump in demand for GPUs largely to be used to train Large Language Models (LLMs) such as ChatGPT.
Doug has a great summary of the background and of Nvidia’s immensely strong position in this market. He describes Nvidia’s ‘three-headed hydra’ that any competitor has to beat. The ‘heads’ are GPU hardware, networking and CUDA. Wikipedia defines CUDA as:
CUDA (or Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for general purpose processing, an approach called general-purpose computing on GPUs (GPGPU). CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements, for the execution of compute kernels.
Note the use of ‘general’ in ‘general-purpose computing’ and in GPGPU - which stands for ‘General Purpose computing on GPUs’.
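To make ‘compute kernels’ and ‘parallel computational elements’ a little more concrete, here is a minimal, purely illustrative CUDA sketch (the function and variable names are mine, not taken from any real codebase). A kernel is just a C-like function marked `__global__`, and launching it asks the GPU to run thousands of copies of it at once, one per data element.

```cuda
// Illustrative CUDA example: add two vectors element-wise.
// Each GPU thread handles exactly one element, so the threads
// never need to communicate with each other.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void vector_add(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Allocate and fill host arrays.
    float *h_a = (float *)malloc(bytes);
    float *h_b = (float *)malloc(bytes);
    float *h_c = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device memory and copy the inputs over.
    float *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, bytes); cudaMalloc(&d_b, bytes); cudaMalloc(&d_c, bytes);
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

    // Launch one thread per element: <<<blocks, threads-per-block>>>.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);  // expect 3.0

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(h_a); free(h_b); free(h_c);
    return 0;
}
```

Nothing about this is specific to graphics: it is exactly the ‘general purpose’ style of programming the quote above describes.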
The first version of CUDA was released in 2007. The origins of CUDA lie in work at Stanford University. Here is Ian Buck, who is now Nvidia’s VP / General Manager, Hyperscale and HPC, on the history of CUDA back in 2008:
Now, the timing of the release of CUDA is interesting.
After a number of cycles of intense interest followed by scepticism about AI (the so-called AI winters) between the 1950s and the 1990s, the 2000s saw some of the key breakthroughs that would lay the foundations for the current excitement.
Crucially, this period also saw pioneering work on using GPUs to model neural networks. For example, in 2004 there was the paper ‘GPU implementation of neural networks’ by Kyoung-Su Oh and Keechul Jung. From the abstract:
Graphics processing unit (GPU) is used for a faster artificial neural network. It is used to implement the matrix multiplication of a neural network to enhance the time performance of a text detection system. Preliminary results produced a 20-fold performance enhancement using an ATI RADEON 9700 PRO board. The parallelism of a GPU is fully utilized by accumulating a lot of input feature vectors and weight vectors, then converting the many inner-product operations into one matrix operation.
Note the use of an ATI RADEON card here (ATI was later acquired by AMD), not an Nvidia one!
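The trick described in that abstract is worth spelling out, because it is still the core of how neural networks run on GPUs today. The 2004 paper predates CUDA, but expressed in modern CUDA the idea looks roughly like the sketch below (the names and memory layout are my assumptions, not the paper’s code): stack many input feature vectors as rows of one matrix and the weight vectors as columns of another, turning a large number of independent inner products into a single matrix multiplication.

```cuda
// Sketch (not the paper's code): compute Y = X * W on the GPU, where the
// rows of X are input feature vectors and the columns of W are weight
// vectors. Each output element Y[i][j] is the inner product of input i
// with weight vector j, and each one is computed by its own thread.
#include <cuda_runtime.h>

__global__ void batched_inner_products(const float *X, const float *W,
                                       float *Y, int num_inputs,
                                       int dim, int num_weights)
{
    int i = blockIdx.y * blockDim.y + threadIdx.y;  // which input vector
    int j = blockIdx.x * blockDim.x + threadIdx.x;  // which weight vector
    if (i >= num_inputs || j >= num_weights) return;

    float sum = 0.0f;
    for (int k = 0; k < dim; ++k)
        sum += X[i * dim + k] * W[k * num_weights + j];
    Y[i * num_weights + j] = sum;
}
```

In practice you would call a tuned library routine such as cuBLAS rather than write this naive kernel, but it shows why the workload maps so naturally onto a GPU: every output element can be computed independently.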
But the major leaps forward came with the application of Nvidia GPUs.
Advances in hardware have driven renewed interest in deep learning. In 2009, Nvidia was involved in what was called the “big bang” of deep learning, “as deep-learning neural networks were trained with Nvidia graphics processing units (GPUs).” That year, Andrew Ng determined that GPUs could increase the speed of deep-learning systems by about 100 times. In particular, GPUs are well-suited for the matrix/vector computations involved in machine learning.
2012 saw AlexNet, demonstrating the power of Convolutional Neural Networks (CNNs) in image recognition. AlexNet was a major step forward but wasn’t the first to use GPUs for image recognition:
AlexNet was not the first fast GPU-implementation of a CNN to win an image recognition contest. A CNN on GPU by K. Chellapilla et al. (2006) was 4 times faster than an equivalent implementation on CPU.
But by 2012 CUDA was available, and so AlexNet used CUDA. From then on GPUs became an essential tool for machine learning, and CUDA became by far the most popular way of using GPUs for machine learning.
Let’s turn now to mining Bitcoin and other cryptocurrencies, which has been another of the high-profile (and controversial) uses of CUDA in recent years.
Wikipedia says this about the timing of early work on Bitcoin by ‘Satoshi Nakamoto’:
Nakamoto stated that work on the writing of the code for Bitcoin began in 2007. On 18 August 2008, he or a colleague registered the domain name bitcoin.org, and created a web site at that address. On 31 October, Nakamoto published a white paper on the cryptography mailing list at metzdowd.com describing a digital cryptocurrency, titled "Bitcoin: A Peer-to-Peer Electronic Cash System".
So Nvidia’s first release of CUDA pre-dates the famous Bitcoin paper.
This all tells us that Jensen Huang and Nvidia could have had no idea that either of these two major use cases for CUDA would be so important, let alone that they would be as valuable for Nvidia as now seems likely.
But then, CUDA was always intended to enable General Purpose computing on GPUs.
There is a whole class of computing problems that are embarrassingly parallel (meaning that it’s obvious how they can be split into sub-tasks, given to a number of processors and run in parallel). Modelling neural networks and mining Bitcoin are just the two most important examples so far.
Many other important classes of problems fall into this category, and we are discovering more all the time. Computational biology, fluid dynamics, financial modelling and cryptography are all important fields where the computational work is often embarrassingly parallel.
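To see what that means in code, here is a hedged sketch of a classic embarrassingly parallel workload, a Monte Carlo estimate of pi, written in CUDA (the kernel and its parameters are illustrative, not taken from any particular library). Each thread runs its own independent simulations; the only coordination is a single sum at the end. Monte Carlo methods of exactly this shape turn up throughout financial modelling and computational science.

```cuda
// Each thread runs its own independent Monte Carlo trials to estimate pi.
// No thread ever needs another thread's data, which is what makes the
// problem "embarrassingly parallel".
#include <cstdio>
#include <curand_kernel.h>

__global__ void monte_carlo_pi(unsigned long long seed,
                               int samples_per_thread,
                               unsigned long long *hits)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    curandState state;
    curand_init(seed, tid, 0, &state);   // independent random stream per thread

    unsigned long long local_hits = 0;
    for (int i = 0; i < samples_per_thread; ++i) {
        float x = curand_uniform(&state);
        float y = curand_uniform(&state);
        if (x * x + y * y <= 1.0f) ++local_hits;
    }
    atomicAdd(hits, local_hits);         // the only interaction: one final sum
}

int main()
{
    const int blocks = 256, threads = 256, samples = 4096;
    unsigned long long *d_hits;
    cudaMalloc(&d_hits, sizeof(unsigned long long));
    cudaMemset(d_hits, 0, sizeof(unsigned long long));

    monte_carlo_pi<<<blocks, threads>>>(1234ULL, samples, d_hits);

    unsigned long long hits = 0;
    cudaMemcpy(&hits, d_hits, sizeof(hits), cudaMemcpyDeviceToHost);
    double total = (double)blocks * threads * samples;
    printf("pi ~= %f\n", 4.0 * hits / total);
    cudaFree(d_hits);
    return 0;
}
```

Because the threads never need to talk to each other, adding more hardware makes the job almost linearly faster, which is precisely the property Nvidia’s hardware and CUDA are built to exploit.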
So we shouldn’t say that Nvidia has built a ‘moat’ around AI. Rather, it has built a moat around ‘embarrassingly parallel computing’.
I still think AI will be the most valuable of these tasks, but even if it disappoints there is still huge potential in a wide range of other markets and problem domains.
This was always a sensible bet for Nvidia to take. Why? Because the world is very, very often embarrassingly parallel.
What’s more, there is a virtuous circle where investment in AI leads to more powerful hardware and software that can then be applied to other parallel problems. For example, CUDA isn’t that easy to use, so we have startups, such as Modular, developing innovative tools for AI which can be used for other problems too. This greater power helps make other tasks more accessible and widens the range of problems where CUDA and Nvidia’s hardware can be used.
Finally, a reminder that Nvidia wasn’t the hugely valuable company that it is today when it started to invest in CUDA. The financial crisis started in 2008, and Nvidia’s market cap touched a low of around $4 billion at the end of 2008. Still, Nvidia continued to invest in CUDA and GPGPU computing.
At this time Intel’s market cap was around $80 billion. If the iPhone and mobile were a huge miss for Intel in 2007, then perhaps ‘embarrassingly parallel computing’ was an equally large miss for a company that dominated datacenters.
For more on the history of Nvidia as a GPGPU company (although I would perhaps quibble a little with describing it as The Machine Learning Company), the Acquired Podcast is an informative (and long) listen.