We’re getting into some heavy-duty tech weeds here - let me try to uplevel the discussion.
First, the “standards organization” you’re talking about is the Ultra Ethernet Consortium, which isn’t a formal standards body along the lines of ISO or ANSI. As the article puts it, it’s actually:
a buddy movie collection of Ethernet ASIC suppliers and switch makers who do not really want to cooperate with each other but who are being compelled by the Internet titans and their new AI upstart competition to figure out a way to not only make Ethernet as good as InfiniBand for AI and HPC networking, but make it stretch to the scale they need to operate.
What unites these companies – Broadcom, Cisco Systems, and Hewlett Packard Enterprise for switch ASICs (and soon Marvell we think), Microsoft and Meta Platforms among the titans, and Cisco, HPE, and Arista Networks among the switch makers – is a common enemy: [Nvidia’s] InfiniBand.
The enemy of my enemy is my ally.
And apparently Meta spent $1B on Arista networking gear in 2022:
I couldn’t get through the whole first article I linked above - it’s well written but extremely tech-dense. Here’s how it concludes:
But if the Ultra Ethernet Consortium has it Meta Platforms’ way, Ethernet will be a lot more like InfiniBand and will have multiple suppliers, thus giving all hyperscalers and cloud builders – and ultimately you – more options and more competitive pressure to reduce prices on networking. Don’t expect it to get much below 10 percent of the cost of a cluster, though – not as long as GPUs stay costly. And ironically, as the cost of GPUs falls, the share of the cluster cost that comes from networking will rise, putting even more pressure on InfiniBand.
It is a very good thing for Nvidia right now that it has such high performance GPUs and higher performance InfiniBand networking. Make hay while that AI sun is shining.
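To make the arithmetic behind that "ironically" point concrete, here is a quick sketch. The numbers are entirely made up by me for illustration (an 80/20 split between GPU and networking spend, and hypothetical GPU price cuts); none of them come from the article. The function name `networking_share` is likewise just my own helper.

```python
def networking_share(gpu_cost, network_cost, other_cost=0.0):
    """Return networking's fraction of total cluster cost."""
    total = gpu_cost + network_cost + other_cost
    return network_cost / total


# Hypothetical cluster: 80 units of GPU spend, 20 units of networking spend.
# Networking spend is held flat while the GPU spend falls.
network_cost = 20.0
for gpu_cost in (80.0, 60.0, 40.0):
    share = networking_share(gpu_cost, network_cost)
    print(f"GPU cost {gpu_cost:.0f}: networking is {share:.0%} of the cluster")
```

With networking spend unchanged, its share of the bill climbs as the GPU spend drops, which is why cheaper GPUs would put more price pressure on the network, not less.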
Another view, with commentary and less tech, is here:
https://andoverintel.com/2023/07/27/is-the-ultra-ethernet-initiative-about-infiniband-or-about-the-network-of-the-future/
But remember, Nvidia has its own high-speed Ethernet products: the Spectrum line. Nvidia’s pitch is that full-on AI clusters (think ChatGPT) should use InfiniBand because it’s the fastest; if your data center runs a mix of AI and other workloads, then Spectrum is Nvidia’s Ethernet offering.
I haven’t read anything to suggest that Ultra Ethernet will surpass InfiniBand in speed or latency, only in flexibility and compatibility. So my own view is that while vendors want customers to buy Ethernet-compatible products, people building large AI clusters (the ones buying as many Nvidia H100s as they can get their hands on) almost certainly don’t want networking to be the bottleneck, and Nvidia makes InfiniBand work natively with its chips. If you’re building a multi-tenant cloud, sure, you probably want Ethernet, but hey, Nvidia offers that as well.
BTW, I do think Arista has great engineering. I owned shares back in the day. I haven’t looked at the business side of things in years, though.