Astera Labs recently had discussions and presentations at three separate conferences in early December with UBS, Barclays, and Raymond James.
There was quite a lot of new information in these short presentations, including two of the bigger announcements:
- Astera replaced Broadcom in the UALink (Ultra Accelerator Link) consortium, which is developing a standard that competes with Nvidia’s NVLink. This group includes Microsoft, Google, Amazon, and AMD, among others. Astera noted that the hyperscalers almost never join these types of open standards groups, which shows how critical these emerging standards are
- Astera plays a key role in Amazon’s new Trainium-2 chip, and mentioned that each chip has 128 of Astera’s PCIe-based AECs. They’ve said repeatedly that hyperscalers are “continuing to double up on their own ASICs and accelerators that they are building”. Astera says they “have close to 400 design wins that are for the internal accelerator based platforms”
On the UALink group they said this,
Here is an intro of less than a minute to the AWS Trainium-2 chip,
In the middle of the video they say they are deploying the chip with a “Petabit scale network fabric”, which I believe is Astera’s Scorpio. Some deployments of these chips already exceed 100,000 connected chips and are being installed in their data centers.
From the conferences,
Some other notes from the conferences include,
- hyperscalers starting to become more vertically integrated and doing their own chips, lots of announcements from them
- pure play AI company, more than 80% of business comes through AI deployments through Nvidia, AMD, and internally developed ASICs that hyperscalers are developing
- GPUs today are only 50% utilized, customers are writing big checks to Nvidia but half the time the chips are collecting dust, Astera is solving this problem and has lots of opportunity to grow the business
- developing products like our fabric devices that are becoming more central to the AI deployment
- higher dollar content opportunities per GPU, dollar content is growing significantly generation over generation
- on Nvidia systems Astera only plays in the front-end network (GPU → CPU, or storage and networking), while Astera doesn’t play in the backend, where NVLink works
- non-Nvidia systems they play on both sides of front end and back end, such as in the Trainium-2 chip
- focus on four protocols: UALink GPU to GPU, Ethernet, CXL for memory, and PCI Express for interconnecting storage and networking
- Scorpio is the industry’s first fabric device that is developed for AI interconnects
- greenfield use case in the non-NVIDIA ecosystem where Scorpio X-series
- comparing to competitors Astera offers a module form factor, meaning it’s not just the chip
- Taurus line which competes with Credo does the work as a chip while Credo built a complete cable, “Credo obviously did a good job, recognizes market, and their approach was to be a cable - a complete cable”
- ASP tends to be different (comparing to Credo), “so you’ll see that reflected in what Credo announces and what we announced. But generally our business is more profitable”, approach is much more scalable and portable
- Leo CXL product line is being deployed on the general compute side for large database applications, have all four major hyperscalers in the US developing CXL based platforms right now
- “We are also seeing a lot of inference use cases, where CXL benefits”
- our chips are software-defined, meaning 60-70% of the chip is actually implemented in software; the benefit of a software-defined architecture is that it is customizable
- systems have become very complicated, need to monitor them in terms of telemetry, diagnostics, fleet management, and predictive failure
- hyperscalers use COSMOS API to monitor their infrastructure, detect failures before they happen
- we don’t discuss unreleased products, not revealing anything right now… “it’s amazing how much energy there is, how much traction and engagement we have as we grow our product lines”
- think of the company as “heterogeneous compute”, connectivity fabric, or connectivity subsystem, nervous system for AI is different
- hyperscalers starting to get more vertically integrated, meaning they are doing their own chips
- have close to 400 design wins that are for internal accelerator based platforms
- “I’ve always said I think investors have underestimated how the hyperscalers have reacted in terms of investing in their own accelerator programs… you saw that in our numbers”
- non-NVIDIA systems all use variants of PCI Express or other standards, “very, very fertile ground”
- our content is getting much richer
- our vision is to own connectivity at the rack level
- as you move to more complex protocols and faster protocols we’ll see an ASP increase
- $12B TAM for products (in some product categories they are the far-and-away leader, so they capture more of the TAM in those categories)
- just by using Astera systems, hyperscalers can go from 52% GPU utilization up to 55-56% for free by optimizing, CEO says customers point this out to him
- power envelope and physical size envelope to contend with, have to provide more power to compete
- AI servers that land in the data center only work 69% of the time on day one, 31% need some tweaking, that’s how complex the systems are
- “One single chip will have like 8 different temperature sensors. We can detect if a fan stops working in one corner of a chip, a cable is inserted but it’s not fully inserted. Something else is heating up around our chip” (they have the ability to troubleshoot thermal issues)
- GPUs in a cluster act like one GPU; if one goes down, the job goes back to the previous checkpoint, which takes 45 minutes today, meaning each time something goes wrong, 45 minutes of compute is lost
- three-point formula for the company: 1) listen to customers, 2) innovate, 3) execute
- Blackwell created a lot of confusion for folks that are not familiar with Astera or sockets like retimers, Hopper generation was more simple to analyze
- Astera is in customized versions of NVL racks, and that is where content from retimers and fabric devices comes from; the customized GB200 servers Amazon was showcasing are an Astera design win
- categorically noted our retimer shipments will be bigger in 2025 than in 2024
- Scorpio to be at least 10% of overall revenue next year, CXL to get to production next half of year, customized NVL racks to show up in the 2nd half of the year
- CEO, “I think the market will need to learn more about how the systems are configured and how the retimer business is working” (implying here investors/analysts still don’t understand their growth)
- for Gen5 PCIe retimers, Astera has 90%+ market share
- retimer content in Blackwell does go down, but more than offset when adding Scorpio content, was telling market overall content would go up before Scorpio was released, looking at overall retimer market including ASICs the overall opportunity goes up
- customers actually coming to us and saying we want you to build the fix
- Scorpio, hoped they would be faster than Broadcom which worked out, first in market and worked very hard to build a better product, do expect Broadcom announcements later
- Taurus still niche, have a lead customer
- COSMOS software can update the firmware on the cable without bringing down the server
- Leo CXL working with Granite Rapids CPU from Intel, Turin from AMD, equivalent ARM CPUs
- ROI from Leo is very clear, lots of excitement from customers
- Scorpio, market is moving so fast that we identified a sweet spot, and are addressing multiple opportunities
- will maintain margin model of 70%+, analyst suggests Scorpio is a higher margin product
- at some point, as data rates go up, they will intersect with optics; there is a huge existing market for optics, and Astera is mostly in exploration in this category
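To put the utilization and checkpoint numbers from the notes above in perspective, here is a rough back-of-the-envelope sketch. The 52% → 55-56% utilization figures and the 45-minute rollback come from the presentations; the cluster size and failure count are my own hypothetical inputs, and the function names are just for illustration. It also assumes a rollback idles every GPU in the cluster, which is my reading of "GPUs in a cluster act like one GPU".

```python
def relative_gain(before: float, after: float) -> float:
    """Relative increase in useful GPU output when utilization rises."""
    return (after - before) / before


def lost_gpu_hours(cluster_gpus: int, rollback_minutes: float, failures: int) -> float:
    """GPU-hours discarded if every failure rolls the whole cluster back."""
    return cluster_gpus * (rollback_minutes / 60.0) * failures


if __name__ == "__main__":
    # Utilization figures quoted in the notes: 52% rising to 55-56%.
    low = relative_gain(0.52, 0.55)
    high = relative_gain(0.52, 0.56)
    print(f"Relative gain in useful GPU output: {low:.1%} to {high:.1%}")

    # Hypothetical cluster of 1,000 GPUs and a single failure (assumed inputs).
    hours = lost_gpu_hours(cluster_gpus=1000, rollback_minutes=45, failures=1)
    print(f"GPU-hours lost per failure: {hours}")
```

A three to four point move in utilization sounds small, but relative to a 52% baseline it works out to roughly 6-8% more useful compute from the same hardware, which is likely why customers bring it up.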
One of the most impressive aspects of Astera Labs is that each time they have an earnings call or any sort of presentation, there is new information revealing even more upside to the company. It is simply incredible that Astera replaced Broadcom in the UALink consortium and is now the clear leader in defining the standards that hyperscalers will use for building out their next generation of chips, such as Amazon’s Trainium-2. I do believe that Astera can out-innovate Broadcom and has the possibility to reach or surpass the scale of Broadcom, which would present huge upside to investors.