Nvidia at the Center of the Generative AI Ecosystem—For Now

Assessing the ascent of Nvidia.

Posted Jan 8 2024

Generative AI has attracted worldwide attention as a foundational technology with almost unlimited applications (see my previous column, “Generative AI as a New Innovation Platform,” Communications, October 2023). Goldman Sachs estimates applications of this technology could raise global GDP by $7 trillion (7%) over the next decade.⁹ At the center of the new ecosystem is Nvidia, whose high-end graphics processing units (GPUs) account for approximately 80% of the market for GPUs that power generative AI software.⁵,²⁰

Nvidia was established in 1993 by founders led by Jen-Hsun (Jensen) Huang, who initially saw a need for powerful specialized chips that could take over graphics processing from PC or workstation central processing units (CPUs), a market dominated by Intel. The company went public in 1999 and exceeded $1 billion in revenue in 2002. In its most recent quarter, Nvidia reported sales of $13.5 billion (double the prior year) and net profits of $6.2 billion. It is now the world’s most valuable semiconductor company, with a market cap surpassing $1 trillion, compared to $159 billion for AMD and $154 billion for Intel, its top competitors. This column explores several questions behind Nvidia’s extraordinary history.

Why Has Nvidia Dominated the GPU Market?

First, Nvidia early on introduced architectural innovations that made its GPUs the hardware of choice, initially for gaming and then for many other applications. The key product introduction dates to 2006, with the G80 Tesla series GPU. Nvidia switched from arrays of a few specialized compute cores (sub-processors) that could perform complex tasks independently, as in a CPU, to an array of many more simple cores running twice as fast or faster. Each core could handle a few pixels on a graphics display or perform many specific tasks in parallel. This new design was 100% faster than Nvidia’s previous generation. Its next Fermi microarchitecture, released in 2010, was eight times faster, with many more compute cores.³¹

Second, also in 2006, Nvidia introduced a new programming model and language for its GPUs with a free software development kit (SDK) called CUDA, for Compute Unified Device Architecture. CUDA started as an extension of C/C++ to support fast parallel processing by directly accessing instruction sets in the GPU hardware.⁸ An abstraction layer isolated the software from other underlying hardware, enabling CUDA to run on different PCs, workstations, and servers—as long as they incorporated Nvidia GPUs as graphics cards or part of the server stack. Although not a direct comparison of device speeds, according to Nvidia data from 2006-2008, programs written using CUDA with its next GeForce 8 series GPUs were 100 to 400 times faster than programs running on the general-purpose Intel Xeon CPUs.³¹
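To make the programming model concrete, the sketch below emulates in plain Python how a CUDA-style launch assigns one array element to each thread. It is an illustration under stated assumptions, not Nvidia's code: the names vector_add_kernel and launch are hypothetical, and the loops run sequentially, whereas a GPU would execute the per-thread bodies concurrently. The index formula (block index times block size, plus thread index) mirrors the usual one-dimensional CUDA convention.

```python
# Plain-Python emulation of CUDA-style indexing (illustrative only; no GPU involved).
# vector_add_kernel and launch are hypothetical names, not part of the CUDA SDK.

def vector_add_kernel(block_idx, thread_idx, block_dim, a, b, out):
    """Body that a single 'thread' would run: compute one output element."""
    i = block_idx * block_dim + thread_idx  # global index in a 1-D grid
    if i < len(out):                        # guard: the last block may overshoot
        out[i] = a[i] + b[i]


def launch(kernel, n, block_dim, *args):
    """Emulate a 1-D launch by visiting every (block, thread) pair in turn.

    On a GPU these iterations would run concurrently; here they run one by one.
    Returns the number of blocks that were 'launched'.
    """
    grid_dim = (n + block_dim - 1) // block_dim  # ceil(n / block_dim)
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)
    return grid_dim


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)
blocks = launch(vector_add_kernel, len(a), 4, a, b, out)
```

With five elements and a block size of four, the emulated launch needs two blocks, and the bounds check keeps the three surplus threads in the second block from writing past the end of the array.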

Third, we keep finding new ways to deploy GPUs as accelerators, and Nvidia has facilitated this expansion of use cases with industry-specific versions of CUDA.¹⁶ We now see Nvidia GPUs not only in gaming, artificial intelligence and machine learning (AI/ML), and generative AI software, but also in cryptocurrency mining, virtual reality applications, self-driving vehicles, robotics, and datacenter cloud services. In 2023, gaming still accounted for 18% of the company’s revenue, though datacenters accounted for half of revenues and were on pace to reach 85% by 2024.¹³

Why Are GPUs so Useful for AI/ML and Generative AI Applications?

CPUs typically have dozens or at most a few hundred compute cores that can perform complex tasks; GPUs have many thousands of simpler compute cores that operate in parallel. The GPU architecture is perfectly suited to the huge number of matrix multiplication tasks and logic layers that lie at the heart of neural networks.
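A minimal sketch of why this workload parallelizes so well: in a matrix product, each output cell is an independent dot product, so the cells can be computed in any order, or all at once on separate cores. The function name matmul_any_order is hypothetical, and the shuffled loop merely stands in for parallel execution.

```python
import random


def matmul_any_order(a, b, seed=0):
    """Multiply a (n x k) by b (k x m), visiting output cells in a random order."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    cells = [(i, j) for i in range(n) for j in range(m)]
    random.Random(seed).shuffle(cells)  # order is irrelevant: no cell reads another
    for i, j in cells:
        # One dot product per cell; on a GPU each could go to its own core.
        c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = matmul_any_order(a, b)
```

Changing the seed changes the visiting order but not the result, which is the property that lets thousands of simple cores share the work without coordinating.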

Back to 2006: Researchers in France used Nvidia graphics cards to train their neural networks.⁴ More famous work later occurred at the University of Toronto during 2011–2012 (“AlexNet”).³¹ Nvidia closely followed these developments and invested heavily in software tools and libraries for building deep-learning applications, such as cuDNN (CUDA Deep Neural Network), released in 2014.²²

In 2016, Nvidia introduced its Pascal microarchitecture, targeting the high-performance computing market and datacenters hosting AI/ML and other compute-intensive applications. Nvidia could now sell rack servers costing tens of thousands of dollars, not just PC graphics cards. Nvidia priced its top-end DGX-1 server at $129,000 and even marketed this as an “AI supercomputer in a box.” To stimulate the applications ecosystem, Nvidia donated several servers to universities as well as to OpenAI, then organized as a non-profit research laboratory.³¹ OpenAI would go on to partner with Microsoft in 2019 and introduce ChatGPT in November 2022. Overall, since 2017, when Google’s work on language transformers grabbed the attention of the AI/ML community, Nvidia has invested aggressively in optimizing its GPUs and CUDA software for large language models (LLMs) and inference engines.³

In 2022, Nvidia released its latest Hopper microarchitecture (named for programming pioneer Grace Hopper). The new GH200 systems include more CPU-like capabilities as well as thousands of compute cores and staggering amounts of memory, all meant to “supercharge” generative AI applications.¹¹

What Has Kept Nvidia’s Sales and Profits so High?

Demand for Nvidia GPUs over the past several years has exceeded supply, leading to high GPU prices and profits, even though recent shipments have slowed.¹⁴ U.S. government restrictions on exports of advanced technology also may reduce future revenues, especially since China accounts for 20% to 25% of Nvidia’s datacenter sales.²³ Nonetheless, as of late 2023, Nvidia claimed an installed base of more than 500 million GPUs, with thousands of CUDA-based applications.¹⁵ The company’s H100 processors, introduced in 2022, cost approximately $40,000 each and are essential purchases for datacenters, which represent a trillion-dollar market.⁵,²⁰

Network effects between Nvidia’s GPU platform and third-party applications also create a kind of flywheel, fueling demand. The growing installed base of Nvidia hardware, particularly in datacenters such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud, enables more developers to build more CUDA-based applications. CUDA software only runs on Nvidia GPUs (unless there is a virtual machine emulation layer, which degrades performance). Rising application usage, such as for training LLMs and running inference engines built with CUDA software, requires more Nvidia hardware. The positive feedback loops resemble what Intel and Microsoft achieved with Intel x86 microprocessors and free Microsoft SDKs paired with PCs running DOS and then Windows.³⁴ In this case, Nvidia dominates both the hardware and software sides of the platform.

What Competition Does Nvidia Face in Hardware?

AMD has targeted Nvidia’s H100 with its MI300X line, specifically designed for generative AI. AMD GPUs may already be slightly ahead of Nvidia in terms of price-performance, but only for gaming.²⁹ Intel in 2019 acquired Israel’s Habana Labs for $2 billion and then in 2022 introduced the Gaudi2 chip, which targets Nvidia’s H100 as well. Intel’s product line does particularly well in inference processing.⁶,¹² Several startups, led by SambaNova and Cerebras, also have raised billions of dollars to design new generations of GPU platforms.²

Cloud service providers have been building their own systems to reduce their GPU purchases. Google introduced its famous Tensor Processing Units (TPUs) for in-house use in 2016 and then third-party use in 2018. These have relatively limited software compared to CUDA and require Google Cloud.²⁸ However, Google TPUs and its JAX AI library, introduced in 2018, reportedly outperform Nvidia systems in some applications.³,¹⁰ AWS introduced its Trainium machine-learning accelerator in 2020, optimized for deep-learning training, with some software support.¹⁹ Microsoft intended to release a custom AI chip for its datacenters in late 2023.³⁰ Meta/Facebook also has an in-house GPU and supercomputer effort under way.³²

What Competition Does Nvidia Face in Software?

Software is the “moat” that keeps users from switching away from Nvidia hardware, with some 250 CUDA libraries widely used by GPU programmers.⁷ Still, Nvidia has vulnerable spots. Some programmers complain that CUDA is proprietary rather than open source (users cannot access or modify the source code) and that it is difficult to use for anyone not familiar with C/C++. Nvidia has recently introduced support for more popular languages, including Python (PyCUDA).¹⁷,²¹ Of course, programmers can use other languages and avoid CUDA entirely, though they would have to recreate all the CUDA drivers, libraries, and other tools, and they would lose direct access to the Nvidia GPU instruction sets.

A major weakness with AMD and Intel has been their limited GPU software support.³³ As a competitive move, both companies have made drivers and libraries open source. Cooperation with the open-source community should help AMD and Intel evolve their software assets faster, but this will still take years.²⁶

Other open-source frameworks exist for GPU programming, such as OpenCL, introduced in 2009 and based on C.²⁴ New languages include OpenAI’s Triton, introduced in 2021 and based on Python.²⁷ Triton seems to work especially well with PyTorch 2.0, an open-source machine-learning library used to train deep-learning models, originally developed at Meta/Facebook.²⁵ Triton still requires a CUDA compiler, but it avoids CUDA’s proprietary libraries in favor of open-source alternatives. Future versions should run on Intel, AMD, and other GPU hardware.¹⁸

Conclusion

Nvidia is at the center of the generative AI ecosystem and is likely to remain there for several years. However, competitors (for example, AMD and Intel) and users (for example, datacenters and the open-source community) are actively developing or exploring alternatives. If Nvidia GPUs remain scarce and expensive, users will find substitutes or ways around Nvidia’s proprietary software. Datacenters also may turn to cheaper hardware, including CPUs, to host less-demanding generative AI software, such as smaller, focused LLMs and inference engines dedicated to specific tasks.³⁵

Communications of the ACM, January 2024 Issue, Vol. 67 No. 1, Pages: 33-35
DOI: 10.1145/3631537
