AMD unveils MI300x AI chip as 'generative AI accelerator'

AMD’s Instinct MI300X GPU features multiple GPU “chiplets” plus 192 gigabytes of HBM3 DRAM memory, and 5.2 terabytes per second of memory bandwidth. The company said it is the only chip that can handle large language models of up to 80 billion parameters in memory.

Advanced Micro Devices CEO Lisa Su on Tuesday in San Francisco unveiled a chip that is a centerpiece of the company’s strategy for artificial intelligence computing, boasting its enormous memory and data throughput for so-called generative AI tasks such as large language models.

The Instinct MI300X, as the part is known, is a follow-on to the previously announced MI300A. The chip is really a combination of multiple “chiplets,” individual chips that are joined together in a single package by shared memory and networking links.

Su, onstage for an invite-only audience at the Fairmont Hotel in downtown San Francisco, referred to the part as a “generative AI accelerator,” and said the GPU chiplets contained in it, a family known as CDNA 3, are “designed specifically for AI and HPC [high-performance computing] workloads.”

The MI300X is a “GPU-only” version of the part. The MI300A is a combination of three Zen 4 CPU chiplets with multiple GPU chiplets. But in the MI300X, the CPUs are swapped out for two additional CDNA 3 chiplets.

The MI300X increases the transistor count from 146 billion transistors to 153 billion, and the shared DRAM memory is boosted from 128 gigabytes in the MI300A to 192 gigabytes.

The memory bandwidth is boosted from 800 gigabytes per second to 5.2 terabytes per second.

“Our use of chiplets in this product is very, very strategic,” said Su, because of the ability to mix and match different kinds of compute, swapping out CPU or GPU.

Su said the MI300X will offer 2.4 times the memory density of Nvidia’s H100 “Hopper” GPU, and 1.6 times the memory bandwidth.
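
The two multiples can be roughly sanity-checked with simple division. The sketch below is illustrative only: the MI300X figures come from the article, while the H100 figures (80 gigabytes of memory and about 3.35 terabytes per second of bandwidth, as commonly listed for the SXM version) are assumptions that were not stated in the presentation.

```python
# Rough check of the 2.4x and 1.6x comparison figures.
# MI300X numbers are from the article; H100 numbers are ASSUMED
# (commonly listed for the H100 SXM part, not quoted by AMD here).

MI300X_MEMORY_GB = 192
MI300X_BANDWIDTH_TBPS = 5.2

H100_MEMORY_GB = 80          # assumption
H100_BANDWIDTH_TBPS = 3.35   # assumption


def ratio(a: float, b: float) -> float:
    """Return a / b rounded to one decimal place."""
    return round(a / b, 1)


memory_ratio = ratio(MI300X_MEMORY_GB, H100_MEMORY_GB)
bandwidth_ratio = ratio(MI300X_BANDWIDTH_TBPS, H100_BANDWIDTH_TBPS)

print(f"memory: {memory_ratio}x, bandwidth: {bandwidth_ratio}x")
```

Under those assumed H100 figures, the results line up with the multiples Su cited.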

“The generative AI, large language models have changed the landscape,” said Su. “The need for more compute is growing exponentially, whether you’re talking about training or about inference.”

To demonstrate the need for powerful computing, Su showed the part working on what she said is the most popular large language model at the moment, the open source Falcon-40B. Language models require more compute as they are built with greater and greater numbers of what are called neural network “parameters.” The Falcon-40B consists of 40 billion parameters.

The MI300X, she said, is the first chip that is powerful enough to run a neural network of that size entirely in memory, rather than having to move data back and forth to and from external memory.

Su demonstrated the MI300X creating a poem about San Francisco using Falcon-40B.

“A single MI300X can run models up to approximately 80 billion parameters” in memory, she said.
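
A back-of-envelope calculation suggests why 192 gigabytes maps to roughly 80 billion parameters. The sketch below assumes the weights are stored at 2 bytes per parameter (16-bit precision) and ignores activations and other runtime memory. Neither assumption was stated in the presentation, and the function names are hypothetical.

```python
# Back-of-envelope sketch: do a model's weights fit in one accelerator's memory?
# ASSUMPTIONS (not from the article): 2 bytes per parameter (16-bit weights),
# 1 GB = 1e9 bytes, and no allowance for activations or caches.

HBM_CAPACITY_GB = 192  # MI300X memory capacity cited in the article


def weight_footprint_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate storage needed for the weights alone, in gigabytes."""
    return num_params * bytes_per_param / 1e9


def fits_in_memory(num_params: float, bytes_per_param: int = 2,
                   capacity_gb: float = HBM_CAPACITY_GB) -> bool:
    """True if the weights alone fit within the given capacity."""
    return weight_footprint_gb(num_params, bytes_per_param) <= capacity_gb


for label, params in [("Falcon-40B", 40e9), ("80B-parameter model", 80e9)]:
    gb = weight_footprint_gb(params)
    print(f"{label}: ~{gb:.0f} GB of weights, fits: {fits_in_memory(params)}")
```

Under these assumptions, an 80-billion-parameter model needs about 160 gigabytes for its weights, which leaves some headroom within 192 gigabytes, while a 100-billion-parameter model would not fit.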

“When you compare MI300X to the competition, MI300X offers 2.4 times more memory, and 1.6 times more memory bandwidth, and with all of that additional memory capacity, we actually have an advantage for large language models because we can run larger models directly in memory.”

To be able to run the entire model in memory, said Su, means that, “for the largest models, that actually reduces the number of GPUs you need, significantly speeding up the performance, especially for inference, as well as reducing the total cost of ownership.”

“I love this chip, by the way,” enthused Su. “We love this chip.”

“With MI300X, you can reduce the number of GPUs, and as model sizes continue growing, this will become even more important.”

“With more memory, more memory bandwidth, and fewer GPUs needed, we can run more inference jobs per GPU than you could before,” said Su. That will reduce the total cost of ownership for large language models, she said, making the technology more accessible.

To compete with Nvidia’s DGX systems, Su unveiled a family of AI computers, the “AMD Instinct Platform.” The first instance of that will combine eight of the MI300X with 1.5 terabytes of HBM3 memory. The server conforms to the industry-standard Open Compute Project spec.
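
The 1.5-terabyte figure follows from the per-chip capacity. As a minimal sketch, assuming the total is simply eight times 192 gigabytes and that the conversion uses 1,024 gigabytes per terabyte (the article does not say which convention AMD used):

```python
# Aggregate memory of the eight-GPU platform described in the article.
# ASSUMPTION: 1 TB = 1024 GB; with 1 TB = 1000 GB the total reads as 1.536 TB.

GPUS_PER_PLATFORM = 8   # from the article
HBM_PER_GPU_GB = 192    # from the article


def platform_memory_gb(gpus: int = GPUS_PER_PLATFORM,
                       per_gpu_gb: int = HBM_PER_GPU_GB) -> int:
    """Total memory across all GPUs in the platform, in gigabytes."""
    return gpus * per_gpu_gb


total_gb = platform_memory_gb()
total_tb = total_gb / 1024

print(f"{GPUS_PER_PLATFORM} x {HBM_PER_GPU_GB} GB = {total_gb} GB (~{total_tb} TB)")
```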

“For customers, they can use all this AI compute capability in memory in an industry-standard platform that drops right into their existing infrastructure,” said Su.

Unlike MI300X, which is only a GPU, the existing MI300A goes up against Nvidia’s Grace Hopper combo chip, which uses Nvidia’s Grace CPU and its Hopper GPU, and which the company announced last month is in full production.

MI300A is being built into the El Capitan supercomputer under construction at the Department of Energy’s Lawrence Livermore National Laboratory, noted Su.

The MI300A is currently sampling to AMD customers, and the MI300X will begin sampling to customers in the third quarter of this year, said Su. Both will be in volume production in the fourth quarter, she said.

You can watch a replay of the presentation on the website set up by AMD for the news.
