Thought it would be useful to have a thread on the DeepSeek R1 model, which is having a big downward impact on the share prices of AI hardware companies such as Nvidia, Astera Labs, and Credo.
The DeepSeek model is Chinese-made, open source, and costs less than OpenAI’s models or Meta’s Llama models. They have a phone app and a web app. One of the most fascinating things about the app is that it shows the chain-of-thought reasoning for the query as it goes. Here’s a query to DeepSeek asking about Raspberry Pi. Notice how it uses phrases such as “Wait”, “I remember”, or “Maybe?”
The DeepSeek R1 model performs strongly against o1, OpenAI’s latest public model, and in most cases is on par with it.
Interestingly, the R1 model was not released this weekend, but it seems the market woke up today to the threat this model may pose to companies like Meta, which produces Llama. This article talks about how Meta is now scrambling to unravel and understand how the R1 model is outperforming them at a lower cost:
Here’s what a supposed insider at Meta said about it:
Overall, this does seem like a threat to the model makers, and it makes me wonder if companies like Meta may begin to slow down hardware purchases. Alternatively, they may be able to copy the insights from DeepSeek and then apply even more compute to them. It will be interesting to see how this plays out.
The market has been pretty swift in marking these companies down. That could present some opportunities as well if the prices of these AI hardware makers really crash. Would be interested to hear other board members’ takes on the situation here.