AI Chip Deficit – Alternatives to Nvidia GPUs


In January 2024, leading private equity firm Blackstone announced it was building a $25 billion AI data empire.

A few months later, OpenAI and Microsoft reportedly followed suit with a proposal to build Stargate, a $100 billion AI supercomputer intended to put the two companies at the forefront of the AI revolution.

Of course, this is not a surprise. With the rapid acceleration the AI sector has witnessed over the past few years, industry giants all over the world are racing to get front-row seats.

Experts already predict the global AI market will hit a massive $827 billion in volume by 2030, with an annual growth rate of 29%.

The only problem? GPUs.

The von Neumann architecture, the design model that most general-purpose computers follow, is composed of the CPU, memory, I/O devices and a system bus. It offers simplicity and cross-system compatibility but is inherently limited.

The single ‘system bus’ of this architecture restricts the speed at which data can be transferred between memory and the CPU, making CPUs less than optimal for AI and machine learning purposes.
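As a rough illustration of that bottleneck, the sketch below estimates whether a workload is limited by arithmetic or by moving data over the bus. The model (perfect overlap of compute and transfer) and every hardware figure in it are assumptions chosen for illustration, not measurements of any real chip.

```python
def estimate_time(flops, bytes_moved, peak_flops_per_s, bus_bytes_per_s):
    """Return (seconds, limiting_factor) under a simplified model in which
    compute and data transfer overlap perfectly, so the slower one dominates."""
    compute_s = flops / peak_flops_per_s
    transfer_s = bytes_moved / bus_bytes_per_s
    if transfer_s > compute_s:
        return transfer_s, "memory"
    return compute_s, "compute"

# Hypothetical workload: a matrix-vector product with a 4096 x 4096 float32 matrix.
n = 4096
flops = 2 * n * n          # one multiply and one add per matrix element
bytes_moved = 4 * n * n    # each 4-byte weight is read from memory once

# Hypothetical hardware figures, not taken from any datasheet.
seconds, limit = estimate_time(
    flops, bytes_moved, peak_flops_per_s=1e12, bus_bytes_per_s=5e10
)
print(limit, seconds)
```

With these assumed numbers the transfer time dominates, which is the situation the paragraph above describes: the processor waits on the bus rather than on its own arithmetic.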

This is where GPUs (graphics processing units) come in.

By spreading work across thousands of cores that operate in parallel, GPUs deliver far higher throughput than CPUs on the repetitive, data-heavy computations that AI models depend on.
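The idea behind this kind of parallelism can be sketched in a few lines: split one large computation into independent chunks, hand each chunk to a separate worker, then combine the partial results. The example uses Python threads purely to show the structure; it is not GPU code, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_dot(chunk_pair):
    """Dot product of one pair of chunks; independent of every other chunk."""
    a_chunk, b_chunk = chunk_pair
    return sum(x * y for x, y in zip(a_chunk, b_chunk))

def parallel_dot(a, b, workers=4):
    """Split two equal-length vectors into chunks and sum the partial results."""
    size = max(1, (len(a) + workers - 1) // workers)
    chunks = [(a[i:i + size], b[i:i + size]) for i in range(0, len(a), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_dot, chunks))

a = list(range(1000))
b = list(range(1000, 2000))
result = parallel_dot(a, b)
print(result)
```

Because no chunk needs the result of another, the work scales with the number of workers. A GPU applies the same pattern at a much finer grain and across far more cores.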

However, with the boom in AI technology, the demand for GPUs has skyrocketed, straining supply chains and posing a severe bottleneck to the efforts of many researchers and startups.

This is especially true since the bulk of the world’s supply of AI-grade GPUs comes from one dominant producer, Nvidia.

Hyperscalers like AWS, Google Cloud Platform and others may be able to access A100s and H100s from Nvidia easily. But what viable alternatives can help other firms, researchers and startups board the AI train instead of being stuck indefinitely on the Nvidia waitlist?

Field programmable gate arrays

FPGAs (field programmable gate arrays) are reprogrammable integrated circuits that can be configured to serve specific tasks and application needs.

They offer flexibility, can be adapted to meet varying requirements and are cost-effective.

Since FPGAs are efficient at parallel processing, they are well-suited to AI and machine learning uses and offer distinctly low latency in real-time applications.

The Tesla D1 chip for Dojo, which the company unveiled in 2021 to train computer vision models for self-driving cars, is sometimes mentioned in this context. Strictly speaking, though, it is a custom ASIC rather than an FPGA, and it reflects the same push toward purpose-built AI silicon.

FPGAs have drawbacks, however. Architecting the hardware demands a high level of engineering expertise, which can translate into expensive up-front costs.

AMD GPUs

In 2023, companies like Meta, Oracle and Microsoft signaled their interest in AMD GPUs as a more cost-effective solution and a way to avoid potential vendor lock-in with the dominant supplier, Nvidia.

AMD’s Instinct MI300 series, for example, is considered a viable alternative for scientific computing and AI uses.

Its CDNA 3 architecture, which uses a modular chiplet design, along with AMD’s support for open standards and a more affordable price point, makes it a promising alternative to Nvidia GPUs.

Tensor processing units

TPUs (tensor processing units) are ASICs (application-specific integrated circuits) designed to perform machine-learning tasks.

A brainchild of Google, TPUs rely on a domain-specific architecture built around the tensor operations, such as matrix multiplications, that neural networks run on.

They also have the advantage of energy efficiency and optimized performance, making them an affordable option for scaling while keeping costs under control.

It should be noted, however, that the TPU ecosystem is still emerging, and the current availability is limited to the Google Cloud Platform.

Decentralized marketplaces

Decentralized marketplaces are also trying to ease the constricted GPU supply chain in their own way.

By capitalizing on idle GPU resources from legacy data centers, academic institutions and even individuals, these marketplaces provide researchers, startups and other institutions with enough GPU resources to run their projects.

Many of these marketplaces offer consumer-grade GPUs that can sufficiently handle the needs of small to medium AI/ML companies, thus reducing the pressure on high-end professional GPUs.

Some marketplaces also provide options for clients who want industrial-grade GPUs.
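To make the matching idea concrete, here is a minimal sketch of how a marketplace might pair a job with the cheapest idle GPU that has enough memory. The `Offer` fields, provider names and prices are all hypothetical; real marketplaces weigh many more factors, such as reliability, bandwidth and location.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str          # who is renting out the idle GPU (hypothetical names)
    gpu: str               # free-form label for the card class
    vram_gb: int           # memory available on the card
    price_per_hour: float  # asking price in an arbitrary currency

def cheapest_fit(offers, min_vram_gb):
    """Return the lowest-priced offer with enough memory, or None if none fits."""
    candidates = [o for o in offers if o.vram_gb >= min_vram_gb]
    return min(candidates, key=lambda o: o.price_per_hour, default=None)

offers = [
    Offer("university-lab", "consumer-24gb", 24, 0.40),
    Offer("home-rig", "consumer-12gb", 12, 0.15),
    Offer("legacy-datacenter", "datacenter-80gb", 80, 1.90),
]

small_job = cheapest_fit(offers, min_vram_gb=10)   # fits on a consumer card
large_job = cheapest_fit(offers, min_vram_gb=48)   # needs an industrial-grade card
print(small_job.provider, large_job.provider)
```

In this toy setup the small job lands on a consumer-grade card, leaving the high-end card free for the workload that actually needs it.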

CPUs

CPUs (central processing units) are often considered the underdogs for AI purposes due to their limited throughput and the Von Neumann bottleneck.

However, there are ongoing efforts to figure out how to run AI algorithms more efficiently on CPUs.

One approach is to allocate specific workloads to the CPU, such as simple NLP models and algorithms that perform complex statistical computations.

While this may not be a one-size-fits-all solution, it works well for workloads that are hard to run in parallel, such as recurrent neural networks or recommender systems, in both training and inference.
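A toy recurrent network shows why such workloads resist parallelism. In the sketch below, each hidden state depends on the one before it, so the time steps must run in order. The scalar weights and inputs are made-up values for illustration only.

```python
import math

def run_rnn(inputs, w_x=0.5, w_h=0.9, h0=0.0):
    """Run a one-unit recurrent network over a sequence of numbers.

    Each step computes h_t = tanh(w_x * x_t + w_h * h_{t-1}), so step t
    cannot start until step t-1 has finished.
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)  # depends on the previous h
        states.append(h)
    return states

inputs = [1.0, 0.0, -1.0, 0.5]
states = run_rnn(inputs)
print(states)
```

Since the loop over time steps is inherently serial, extra cores do little for it, and a CPU’s faster individual cores can be a reasonable fit.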

Rounding up

The scarcity of GPUs for AI purposes may not be going away anytime soon, but there is a bit of good news.

The ongoing innovations in AI chip technology point to an exciting future in which the GPU problem fades into the background.

A lot of potential remains to be harnessed in the AI sector, and we might just be standing on the precipice of the most significant technology revolution known to humanity.

Daniel Keller is the CEO of InFlux Technologies and has more than 25 years of IT experience in technology, healthcare and nonprofit/charity works. He successfully manages infrastructure, bridges operational gaps and effectively deploys technological projects. An entrepreneur, investor and disruptive technology advocate, Daniel has an ethos that resonates with many on the Flux Web 3.0 team – “for the people, by the people” – and is deeply involved with projects that are uplifting to humanity.
