DeepSeek model impact on AI hardware companies

Wall Street appears to be getting a couple of things wrong. That said, the full impact is yet to be determined, and it does look like DeepSeek will have a real one.

The startup spent just $5.5 million on training DeepSeek V3—a figure that starkly contrasts with the billions typically invested by its competitors.

This conflates training costs with infrastructure purchase/set-up costs. It’s like saying “I drove this Chinese sports car that’s faster and redder than a $250k Ferrari cross-country, but for only $10,000.” The training cost is essentially the cost of renting time on the servers involved. Yes, that Ferrari costs $250k, but you can rent one for $1,800/day, so the right comparison for that 10-day cross-country trip is $18,000. Certainly $10,000 is better than $18,000, but it’s not the orders-of-magnitude difference analysts and article authors are writing about. (The numbers I’m quoting aren’t scaled to actual GPU rental rates, btw; this is just illustrative.)
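The rental-vs-purchase point can be made concrete with a quick back-of-the-envelope calculation. This is a sketch only: the $5.5M headline cost and the 2,048-GPU cluster size come from the reporting above, but the $2/GPU-hour rental rate is an assumed illustrative number, not a quoted one.

```python
# Back out the implied GPU-hours behind a headline training cost,
# treating that cost as pure server-time rental (cost = hours * rate).

def implied_gpu_hours(total_cost_usd: float, rate_per_gpu_hour: float) -> float:
    """Implied GPU-hours if the headline cost were all rental."""
    return total_cost_usd / rate_per_gpu_hour

headline_cost = 5_500_000   # reported DeepSeek V3 training cost (USD)
assumed_rate = 2.00         # ASSUMED $/GPU-hour rental rate, for illustration
cluster_size = 2_048        # claimed GPU count (unverified)

gpu_hours = implied_gpu_hours(headline_cost, assumed_rate)
wall_clock_days = gpu_hours / cluster_size / 24

print(f"Implied GPU-hours: {gpu_hours:,.0f}")          # 2,750,000
print(f"Days on {cluster_size} GPUs: {wall_clock_days:.0f}")  # ~56
```

The takeaway is the same as the Ferrari analogy: $5.5M of rented compute time says nothing about the capital cost of the cluster it ran on.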

What do cheaper AI deployments mean? It could mean that the push for 100k-server data centers goes away. That could hurt the data center power companies (POWL, VRT, etc.) and the advanced data center networking companies (Broadcom, ANET, ALAB). But the networking companies have other businesses as well.

For Nvidia, it could mean that instead of a dozen companies buying 100k GPUs there are thousands of companies buying 10k GPUs (or something along those lines). I agree with @WillO2028's take on Jevons' Paradox - cheaper AI means more use of AI by more players.

However, what we don’t know is how the DeepSeek team’s use of lesser GPUs translates to a world where not only are faster GPUs available, but those faster GPUs also have better price/compute and power/compute ratios. Despite Blackwell’s higher price over Hopper, both the cost per unit of GPU compute and the power needed per unit of GPU compute are better (lower) than Hopper’s. And DeepSeek by no means trained on fewer than thousands of GPUs (they claim 2,048 GPUs less capable than the H100, but that’s not been verified), so there would still be an advantage to using Nvidia’s latest and greatest to simplify the data center build-out.

OTOH, being able to use lesser GPUs opens up the potential market for companies like AMD, and makes the ASICs being developed by Amazon and Google, etc. potentially more viable. Amazon in particular had been going down a low-cost data center route previously, and now if the DeepSeek software advancements can be applied there, Amazon (and Anthropic) might be in a good place.

Additionally, AI software companies (like PLTR) should be better positioned now, as they can improve their models to make their services less costly for their customers to run.

So, the across-the-board AI blood-bath seems clearly overdone to me, but there are still questions to be answered before we know who all the winners and losers are.