On Xilinx and the AMD software stack


After the Financial Analyst Day presentations last month, we have been mulling the one by Victor Peng, formerly chief executive officer at Xilinx and now president of the Adaptive and Embedded Computing Group at AMD.

This group mixes together embedded CPUs and GPUs from AMD with the Xilinx FPGAs and has over 6,000 customers. It brought in a combined $3.2 billion in 2021 and is on track to grow by around 22 percent this year to reach roughly $3.9 billion. Importantly, Xilinx on its own had a total addressable market of about $33 billion for 2025, but with the combination of AMD and Xilinx, the TAM has expanded to $105 billion for AECG. Of that, $13 billion is from the datacenter market that Xilinx has been trying to cater to, $33 billion is from embedded systems of various kinds (factories, weapons, and such), $27 billion is from the automotive sector (lidar, radar, cameras, automated parking, the list goes on and on), and $32 billion is from the communications sector (with 5G base stations being the important workload). This is roughly a third of the $304 billion TAM for 2025 of the new and improved AMD, by the way. (You can see how this TAM has exploded in the past five years here. It’s remarkable, and hence we remarked upon it in great detail.)

But a TAM is not a revenue stream, just a giant glacier off in the distance that can be melted with brilliance to make one.

Central to the strategy is AMD’s pursuit of what Peng called “pervasive AI,” and that means using a mix of CPUs, GPUs, and FPGAs to address this exploding market. What it also means is leveraging the work that AMD has done designing exascale systems in conjunction with Hewlett Packard Enterprise and some of the major HPC centers of the world to continue to flesh out an HPC stack. AMD will need both if it hopes to compete with Nvidia and to keep Intel at bay. CUDA is a formidable platform, and oneAPI could be if Intel keeps at it.

“When I was with Xilinx, I never said that adaptive computing was the end all, be all of computing,” Peng explained in his keynote address. “A CPU is going to always be driving a lot of the workloads, as will GPUs. But I’ve always said that in a world of change, adaptability is really an incredibly valuable attribute. Change is happening everywhere you hear about it, the architecture of a datacenter is changing. The platform of cars is totally changing. Industrial is changing. There is change everywhere. And if hardware is adaptable, then that means not only can you change it after it’s been manufactured, but you can change it even when it’s deployed in the field.”

Well, the same can be said of software, which follows hardware of course, even though Peng didn’t say that. People were messing around with Smalltalk in the late 1980s and early 1990s, after it had been maturing for two decades, because of the object-oriented nature of its programming model, but the market chose what we would argue was an inferior Java only a few years later because of its absolute portability thanks to the Java Virtual Machine. Companies not only want the option of lots of different hardware, tuned specifically for situations and workloads, but they want the ability to have code be portable across those scenarios.

This is why Nvidia needs a CPU that can run CUDA (we know how weird that sounds), and why Intel is creating oneAPI and anointing Data Parallel C++ with SYCL as its Esperanto across CPUs, GPUs, FPGAs, NNPs, and whatever else it comes up with.

This is also why AMD needed Xilinx. AMD has plenty of engineers – well, north of 16,000 of them now – and many of them are writing software. But as Jensen Huang, co-founder and chief executive officer of Nvidia, explained to us last November, three quarters of Nvidia’s 22,500 employees are writing software. And it shows in the breadth and depth of the development tools, algorithms, frameworks, and middleware available for CUDA – and how that variant of GPU acceleration has become the de facto standard for thousands of applications. If AMD is going to have the algorithmic and industry expertise to port applications to a combined ROCm and Vitis stack, and do it in less time than Nvidia took, it needed to buy that industry expertise.

That is why Xilinx cost AMD $49 billion. And it is also why AMD is going to have to invest much more heavily in software developers than it has in the past, and why the Heterogeneous-Compute Interface for Portability, or HIP, is such a key component of ROCm. HIP is a CUDA-like API that allows runtimes to target a variety of CPUs as well as Nvidia and AMD GPUs, and it gets AMD going a lot faster on taking on CUDA applications for its GPU hardware.
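To make the HIP point concrete, here is a minimal vector-add sketch of what HIP code looks like. The API names mirror CUDA almost one for one (hipMalloc for cudaMalloc, hipMemcpy for cudaMemcpy, the same kernel-launch syntax and grid model), which is what lets AMD’s hipify tools mechanically translate CUDA sources. This is an illustrative sketch, not production code: it assumes the ROCm toolchain (hipcc) and omits error checking.

```cpp
// Minimal HIP vector-add sketch; build with hipcc from the ROCm toolchain.
// On Nvidia hardware, the same source compiles against the CUDA backend.
#include <hip/hip_runtime.h>
#include <cstdio>

__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // same grid model as CUDA
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float ha[n], hb[n], hc[n];
    for (int i = 0; i < n; i++) { ha[i] = float(i); hb[i] = 2.0f * i; }

    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));   // cudaMalloc in CUDA
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, ha, n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb, n * sizeof(float), hipMemcpyHostToDevice);

    vecAdd<<<(n + 255) / 256, 256>>>(da, db, dc, n);  // CUDA-style launch

    hipMemcpy(hc, dc, n * sizeof(float), hipMemcpyDeviceToHost);
    printf("hc[10] = %f\n", hc[10]);

    hipFree(da); hipFree(db); hipFree(dc);  // cudaFree in CUDA
    return 0;
}
```

In practice, porting an existing CUDA codebase often starts with the hipify scripts doing the bulk of the renaming, with hand-tuning reserved for the performance-critical kernels.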

But in the long run, AMD needs to have a complete stack of its own covering all of the AI use cases across its many devices…

That stack has been evolving, and Peng will be steering it from here on out with the help of some of those HPC centers that have tapped AMD CPUs and GPUs as their compute engines in pre-exascale and exascale class supercomputers.


This is very consistent with what Intel has been saying for several years now. Bob Swan (the former Intel CEO) kept yammering on about his “software first” strategy. Of course, it cost Intel dearly in hardware, but perhaps it was the correct long-term plan. We will see…