Azure Instances Pair AMD's MI300X With Intel's Sapphire Rapids

Microsoft’s new AI-focused Azure servers are powered by AMD’s MI300X datacenter GPUs, but they are paired with Intel’s Xeon Sapphire Rapids CPUs. AMD’s flagship fourth-generation EPYC Genoa CPUs are powerful, but Sapphire Rapids appears to have a couple of key advantages when it comes to driving AI compute GPUs. Microsoft isn’t the only one choosing Sapphire Rapids either, as Nvidia also seems to prefer it over AMD’s current-generation EPYC chips.

Several factors likely convinced Microsoft to go with Intel’s Sapphire Rapids instead of AMD’s Genoa, and Intel’s support for its Advanced Matrix Extensions (AMX) instructions could be among the most important. According to Intel, these instructions are tailored to AI and machine learning tasks and can accelerate them by up to seven times.

While Sapphire Rapids isn’t particularly efficient and has worse multi-threaded performance than Genoa, its single-threaded performance is quite good for some workloads. This isn’t something that only helps AI workloads specifically; it’s just an overall advantage in some types of compute.

It’s also worth noting that servers using Nvidia’s datacenter-class GPUs go with Sapphire Rapids as well, including Nvidia’s own DGX H100 systems. Nvidia’s CEO Jensen Huang said the “excellent single-threaded performance” of Sapphire Rapids was a specific reason why he wanted Intel’s CPUs for the DGX H100 rather than AMD’s.

I suspect the biggest reason for SR (Sapphire Rapids) over Genoa is cost. Intel was having a hard time filling the factories, and SR is a great fab filler.

Microsoft's comments on the latest ROCm software for the MI300 are very encouraging for AMD.