A little disappointed that the stock was down on the news. Some of this, I gather, is because traders are rotating out of AMD and Nvidia into non-tech stocks that haven't been rallying, but is some of it due to disappointment over the announcement itself, or over the timing of when these parts will ship in volume? FWIW, I gather Meta is deploying a bunch of them.
AMD announced the new MI300X GPU, which is targeted at AI and scientific computing. The GPU is being positioned as an alternative to Nvidia’s H100, which is in short supply because of the massive demand for AI computing.
AMD CEO Lisa Su shows off MI300X silicon
“I love this chip, by the way,” said Lisa Su, AMD’s CEO, during a livestreamed presentation to announce new datacenter products, including new 4th Gen Epyc CPUs.
AMD is on track to sample MI300X in the next quarter. The GPU’s production will ramp up in the fourth quarter.
The MI300X is a GPU-only version of the previously announced MI300A supercomputing chip, which combines a CPU and GPU. The MI300A will power El Capitan, a supercomputer coming next year to Lawrence Livermore National Laboratory. El Capitan is expected to surpass 2 exaflops of performance.
But to be sure, Nvidia’s H100 has a sound footing in datacenters that AMD may find hard to overcome. Google Cloud last month announced the A3 supercomputer, which has more than 26,000 Nvidia H100 GPUs, and Microsoft is using Nvidia GPUs to run BingGPT, the AI service it has built into its search engine. Oracle and Amazon also offer H100 GPUs through their cloud services. Nvidia’s market cap as of Tuesday stood at around $1 trillion, largely boosted by demand for its AI GPUs.
The MI300X has 153 billion transistors, and is a combination of 12 chiplets made using the 5-nanometer and 6-nanometer processes. AMD replaced three Zen4 CPU chiplets in the MI300A with two additional CDNA-3 GPU chiplets to create the GPU-only MI300X.
…
Su cherry-picked the performance results and did not address the benchmarks that matter most: overall floating-point and integer throughput. AMD representatives did not provide performance benchmarks.
But Su said the MI300A, which pairs the CPU and GPU, is eight times faster and five times more efficient than the MI250X accelerator, a GPU-only chip used in the Frontier supercomputer.
AMD packed in more memory and bandwidth so AI models can run in memory, which speeds up performance by reducing the stress on I/O. AI models typically pay a performance penalty when data has to move back and forth between storage and memory as the working set changes.
“We actually have an advantage for large language models because we can run large models directly in memory. What that does is for the largest models, it actually reduces the number of GPUs you need, significantly speeding up the performance, especially for inference, as well as reducing total cost of ownership,” Su said.
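Su's point about reducing GPU count follows from simple capacity arithmetic: if one accelerator holds more of a model's weights in HBM, fewer accelerators are needed before you can serve the model at all. A rough sketch of that lower bound, using hypothetical model sizes and illustrative per-GPU memory figures (none of these numbers come from AMD):

```python
import math

def gpus_needed(params_billion: float, bytes_per_param: int, hbm_gb_per_gpu: float) -> int:
    """Estimate the minimum GPUs required just to hold model weights in HBM.

    Ignores activations, KV cache, and framework overhead, so real
    deployments need headroom beyond this lower bound.
    """
    weights_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9 bytes-per-GB
    return math.ceil(weights_gb / hbm_gb_per_gpu)

# Hypothetical 70B-parameter model in FP16 (2 bytes/param) = 140 GB of weights.
print(gpus_needed(70, 2, 80))    # on an 80 GB-class GPU: 2
print(gpus_needed(70, 2, 192))   # on a 192 GB-class GPU: 1
```

The larger the per-GPU memory, the later the model spills across devices, which is the inference cost advantage Su describes.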
The GPUs can be deployed across standard racks with off-the-shelf hardware and minimal changes to the infrastructure. Nvidia’s H100 can also be deployed in standard servers, but the company is pushing specialized AI boxes that combine its AI GPUs with other Nvidia technology such as its Grace Hopper superchips, BlueField data processing units, and the NVLink interconnect.
AMD’s showcase for the new GPU is an AI training and inference server called the AMD Infinity Architecture Platform, which packs eight MI300X accelerators with a combined 1.5TB of HBM3 memory. The server is based on Open Compute Project specifications, which makes it easy to deploy in standard server infrastructure.
The AMD Infinity Architecture Platform sounds similar to Nvidia’s DGX H100, which has eight H100 GPUs with 640GB of GPU memory and 2TB of total system memory. The Nvidia system provides 32 petaflops of FP8 performance.
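The system totals quoted for the two eight-GPU boxes imply very different per-accelerator memory, which can be checked directly:

```python
# Per-accelerator HBM implied by each eight-GPU system's stated total GPU memory.
mi300x_total_gb = 1536   # 1.5 TB of HBM3 across the AMD Infinity Architecture Platform
h100_total_gb = 640      # GPU memory across a DGX H100

print(mi300x_total_gb / 8)  # 192.0 GB per MI300X
print(h100_total_gb / 8)    # 80.0 GB per H100
```

That 192GB-versus-80GB gap per accelerator is the basis of AMD's claim that large models fit on fewer GPUs.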
The datacenter AI market is a vast opportunity for AMD, Su said. The market opportunity is about $30 billion this year, and will grow to $150 billion in 2027, she said.
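Su's figures imply an aggressive growth rate; treating the jump from $30 billion this year to $150 billion in 2027 as four years of compounding gives roughly 50% annual growth:

```python
# Implied compound annual growth rate (CAGR) for a market growing
# from $30B to $150B over four years, per Su's projection.
start, end, years = 30.0, 150.0, 4
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # about 49.5% per year
```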
“It’s going to be a lot because there’s just tremendous, tremendous demand,” Su said.