The hole in NVIDIA's armor?

Long and complicated post here from Dylan Patel on his SemiAnalysis blog, and well worth reading if you’re an NVIDIA investor.

Boiling it down, here are the essential points:

  • CUDA is NVIDIA’s software platform for writing programs that use its parallel GPU architecture
  • CUDA, apparently, is a bit of a challenge for software developers
  • That has prompted developers to build open-source alternatives that do the work of CUDA – but, crucially, make it both easier for programmers to write effective GPU code AND easier to run that software on GPUs other than NVIDIA’s
  • In particular, an open source project from Meta (PyTorch) and another from OpenAI (Triton) have apparently cracked open the NVIDIA/CUDA armor.

NVDA already trades at a modest premium to the market (forward PE of 25) for the incredible growth it has seen, so this news may already be priced in.

At any rate, I continue to expect NVIDIA to do quite well, and to be able to sell everything they make at a premium for some time still. However, the opportunity is now there for the “fast followers” to grab as well. It’s a huge market opportunity, and there can be a number of winners. I’m confident that NVDA will continue to be a great stock to hold in 2024 – but I’m also glad I bought into AMD in 2023.

BTW, there’s also an excellent discussion in Dylan Patel’s post of the role that memory plays in AI systems (DRAM is now half the cost of an AI server), which is relevant for PSTG fans.


I hesitate to get into the weeds here. I’m trying to focus on the structure of the argument and not the details.

The article proclaims PyTorch the winner, “disrupting” CUDA, without giving any stats or justification – even though this is the premise for the entire article.

What’s in parentheses below is mine.

Ease of use is king (I agree.)

“The only way to break the vicious cycle is for the software that runs models on Nvidia GPUs to transfer seamlessly to other hardware with as little effort as possible. As model architectures stabilize and abstractions from PyTorch 2.0, OpenAI Triton, and MLOps firms such as MosaicML become the default, the architecture and economics of the chip solution starts to become the biggest driver of the purchase rather than the ease of use afforded to it by Nvidia’s (less?) superior software.”

“The performance improvements from PyTorch 2.0 (compared to prior iterations?) will be larger for currently unoptimized hardware. Meta and other firms’ heavy contribution to PyTorch stems from the fact that they want to make it easier to achieve higher FLOPS utilization with less effort on their multi-billion-dollar training clusters made of GPUs. They are also motivated to make their software stacks more portable to other hardware to introduce competition to the machine learning space.”

Furthermore, this article ends with…

The rest of this report will point out the specific hardware accelerator that has a huge win at Microsoft, as well as multiple companies’ hardware that is quickly being integrated into the PyTorch 2.0/OpenAI Triton software stack. Furthermore, it will share the opposing view as a defense of Nvidia’s moat/strength in the AI training market.

Sounds like more reasons to question the initial presumption. I really didn’t see a reason to pay to read more.

  • In particular, an open source project from Meta (PyTorch) and another from OpenAI (Triton) have apparently cracked open the NVIDIA/CUDA armor.

In Nvidia’s official conferences they talk about the benefits of Triton and have multiple presentations on how to set up Triton to work with Nvidia chips. I’m still new to Nvidia, but I had just assumed Triton was an Nvidia-owned product because of how glowingly they talk about it.

Either I’m confused about what’s going on here, or the author is mistakenly thinking Triton is purely competitive and not something that offers some benefit to Nvidia too?

Excerpt from the December 14 Nvidia special call:

So we’ll look at a couple of inference – the key inference software from the NVIDIA inference platform before we dive into the actual use cases in the financial services. The first one is a Triton Inference Server. Triton Inference Server is an inference serving software for fast, scalable and simplified inference serving. The way it achieves all of that is by doing all these things that you see here in this chart, starting from support for any framework.

So regardless of whether it’s the machine learning or deep learning model, it supports all the popular frameworks, like TensorFlow, PyTorch, XGBoost; and then intermediate formats like ONNX; and inference frameworks like TensorRT, even basic Python and more. By doing this, it allows our data scientists to choose whatever framework they need to develop and train the models and then helps in production by streamlining the model execution across these frameworks. It also supports multi-GPU, multi-node execution of inference, of large language models.

The second benefit of Triton is it can handle different types of processing of the model, whether it is real-time processing or off-line batch or accept – it accepts video or audio as an input and has a streaming input. And also it supports pipeline. Because today, if you look at any AI application, any actual AI pipeline, it’s not a single model that works. And we have preprocessing, we have postprocessing, and there are many models that actually work in sequence or some in parallel for specific inference. And so it supports that pipeline.

The third benefit is Triton can be used to run models on any platform. It supports CPUs, GPUs. It runs on various operating systems and of course on the cloud, on-prem, edge and embedded. So essentially, it provides us a standardized way to deploy, run and scale AI models.

And it works with many DevOps and MLOps tools, like Kubernetes, KServe MLOps, platforms on the cloud and on-prem. And this is how it’s able to scale the models based on demand.

It’s able to offer all of these benefits without leaving any performance on the table. It provides the best performance on GPUs and CPUs; and it has unique capabilities like dynamic batching, concurrent execution. And thereby, it not only provides very high throughput and – with low latency, it also increases the utilization of the GPU. So essentially maximizing the investment, maximizing the ROI from those compute resources.
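The framework support and dynamic batching described in the excerpt are declared per model in a small configuration file that Triton Inference Server reads. Here's a minimal sketch, assuming an ONNX model (the model name, tensor names, and shapes are placeholders for illustration):

```protobuf
# config.pbtxt for a hypothetical ONNX model served by Triton Inference Server
name: "my_onnx_model"          # placeholder model name
platform: "onnxruntime_onnx"   # tells Triton to use the ONNX Runtime backend
max_batch_size: 8

input [
  {
    name: "INPUT0"             # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]

# Enables the dynamic batching capability mentioned in the call
dynamic_batching { }
```

Swapping `platform` is how the “any framework” support works: the same server can host TensorFlow, PyTorch (TorchScript), ONNX, or TensorRT models side by side.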


DRAM content in AI systems is relevant to folks who supply DRAM chips, such as Micron and Samsung. It is completely irrelevant to PSTG fans, of which I am one.


@WillO2028 @wpr101 let me try to explain without going too deep into the weeds. I’m not an expert in AI so I would welcome anyone with more expertise to chime in.

Generative AI developers are writing their software to use PyTorch and Triton rather than just writing to CUDA.

“… the research community and most large tech firms have settled around PyTorch. This is exemplified by the fact that nearly every generative AI model that made the news being based on PyTorch.”

PyTorch and Triton work well with CUDA and with NVIDIA hardware, which is why you hear NVIDIA talk about them glowingly. Nevertheless they also work with other hardware and can readily be extended to support hardware from new entrants.

That’s the danger to NVIDIA. As with IBM PCs and MS-DOS, the software mediates and hides the hardware (and other, lower level software like CUDA), enabling other hardware (and software) to be substituted (like Dell or Compaq PCs).
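That mediation can be made concrete with a minimal PyTorch sketch (assuming PyTorch is installed; the tiny linear model is invented purely for illustration). Notice that the model code never names the hardware vendor:

```python
import torch

# The developer writes to PyTorch; PyTorch dispatches to CUDA kernels on
# NVIDIA hardware, or to another backend, without the model code changing.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4, 2).to(device)   # toy model, for illustration only
x = torch.randn(8, 4, device=device)
y = model(x)

print(y.shape)   # torch.Size([8, 2]) on either backend
```

Swap `"cuda"` for another accelerator's device string and, if a backend exists for it, the same script runs – that is exactly the substitutability the MS-DOS analogy describes.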

Dylan Patel’s post is about a year old so there are undoubtedly new developments. I hope we can get some more current perspective.



I disagree. I don’t think it changes anything with regard to NVDA’s moat.

PyTorch is what people call a “high-level framework” that interacts with “low-level frameworks” like CUDA. It is meant to let developers use the full power of the GPUs or CPUs of their choosing without having to know how to optimize code, so they can focus on developing their model quickly without wiring stuff under the hood. A lot of PyTorch code can be optimized for NVIDIA GPUs with one or two lines of code. There are even higher-level frameworks above PyTorch, like Hugging Face.
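The “one or two lines” point can be sketched with PyTorch 2.x’s `torch.compile` (a minimal illustration, assuming PyTorch 2.x is installed; the toy model is made up). Wrapping the model is the only change, and on NVIDIA GPUs the compiler emits optimized kernels via OpenAI Triton:

```python
import torch
import torch._dynamo

# Let the sketch fall back to eager execution if no compile toolchain
# is available, so it runs anywhere PyTorch does.
torch._dynamo.config.suppress_errors = True

# Toy model, for illustration only.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 8),
)

# The "one or two lines": same model, now routed through the compiler.
compiled = torch.compile(model)

x = torch.randn(2, 16)
y = compiled(x)          # first call triggers compilation
```

The developer never writes a kernel; the framework decides how to exploit whatever hardware is underneath.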

The same goes for Triton. In general, most people don’t want to learn CUDA. This would be similar to saying “you must know how to write C or Assembly to program,” and that’s what the higher-level frameworks aim to solve.

NVDA works closely with the PyTorch people and with developers of other high-level frameworks so they can use the full capacity of the new GPUs. The lead NVDA has on such frameworks is enormous. For example, AMD has nothing like that. Intel has tried to promote OpenVINO, but I haven’t heard of anything serious coming out of it. Of course competitors will try to eat into the compute share, but so far no one is even close.

P.S. TensorFlow lost out to PyTorch because it sucks to use, not because of eager execution as claimed in the article. Google also abandoned it in favor of JAX a few years ago, which accelerated its death. Not sure where the information in the blog post came from.


You guys are focusing on the technology while ignoring the economics.

About PyTorch

It is free and open-source software released under the modified BSD license.

The problem with open source is that it is very difficult to make a profit on it. Its usefulness is in what it enables – in this case, using NVIDIA hardware – no wonder NVIDIA loves it.

For historical reference, MySQL is (was?) a great database platform offered in both open-source and paid versions. The owners of the platform thought of going public but ended up selling out to Oracle. The paid version simply did not pay enough.

Denny Schlesinger

Wishing you all a Prosperous New Year 2024


@chang88 Strategically, there’s a big difference between
Previous architecture: Developer → CUDA → NVIDIA GPU and
Current architecture: Developer → high-level open source middleware → CUDA → NVIDIA GPU.

The alternative architecture – Developer → high-level open source middleware → low-level open source middleware → alternative GPU – may be deficient at the moment in comparison to what can be achieved with the NVIDIA/open source architecture, but the “field of battle,” if you will, is now shifting to the middleware, and that’s an enormous change.
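A toy sketch of why that shift matters (pure Python; every name here is invented for illustration): once developers call only the middleware’s API, the vendor path underneath becomes a lookup rather than a rewrite:

```python
# Toy model of the middleware layer: developers call matmul(); which
# vendor's low-level path runs is a registry lookup, not a code change.
backends = {}

def register_backend(name, fn):
    backends[name] = fn

def matmul(a, b, backend="reference"):
    return backends[backend](a, b)

# Pure-Python reference implementation, standing in for a vendor library.
def _reference_matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

register_backend("reference", _reference_matmul)

# A competing "vendor" only has to register a backend; callers don't change.
register_backend("vendor_x", _reference_matmul)

print(matmul([[1, 2]], [[3], [4]], backend="vendor_x"))  # [[11]]
```

That registration point is where the competitive battle moves once the middleware, not CUDA, is what developers write against.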

Programmers everywhere, especially at research universities and companies like META, now have an incentive to improve the open source middleware solutions on the alternative path, since it can enable them to access more affordable GPU systems, and will drive down their cost of ownership. In addition, since NVIDIA currently has orders for more chips than it can make, there’s an incentive just based on availability alone. [note to @captainccs: it is most certainly about economics, as well as technology!]

I am not arguing that NVIDIA is in trouble for 2024, which is to say, the timeframe that matters to most people on this board. In fact, I think they’ll probably still merit a premium to the market through 2025 at least.

However, many of us have been wondering how long they can continue to be, effectively, the “monopoly provider” of the infrastructure supporting generative AI. The thesis of this Dylan Patel post, which I find persuasive, is that we can foresee the day when there are in fact viable alternatives in widespread use.

I’m still an enthusiastic shareholder of NVDA, and expect to be one for many years to come. That doesn’t mean I have to see their business with an uncritical eye.



While I agree that it’s inevitable that CUDA will not remain a high-walled moat for Nvidia, making AI applications easier to program increases the TAM for AI, which helps Nvidia more than weakened software lock-in hurts it. This is especially true since, in the big picture, there is far more AI software that needs to be written than there is AI software that people want to port to other hardware.

That’s why Huang and Nvidia are embracing PyTorch and Triton. The easier it is to create AI applications, the more AI applications there will be, and those new AI applications will drive growth in the hardware needed to support AI. And, Nvidia inarguably makes the best and fastest AI hardware in the world. And not just compute, but networking and other aspects that are also needed.

The faster the AI market grows, the more Nvidia can lose some market share and yet still grow sales at a high pace. And if you really think about it, Nvidia is unlikely to retain a 95% (or whatever) dominance in AI hardware no matter what. Competition like AMD is too smart and resourceful for that. Nvidia is better served with a 75% dominance in a market that has grown 400%.

So, developers, go ahead and write in PyTorch. When running on Nvidia hardware, it’ll make the CUDA calls for you and run faster than on anyone else’s hardware. Now, tell me why you want to port to someone else? You want more lag in response to a query to generate that image or write that paper for you?


All good points, @Smorgasbord1 – the game is still NVIDIA’s to lose. But now it’s a race, and we can see the path for at least one serious contender, whether it’s AMD or someone else.

Having a real competitor doesn’t mean NVIDIA will necessarily lose – and it’s hard to see how they won’t continue to crush it in 2024 and probably well into 2025, at the least.

Beyond that? We’ll see how it goes.



For those of you long NVDA, first my congratulations as you are riding the wave.

Second, I just wanted to remind what happened when a massive event occurred which pulled forward tremendous revenue and what happened to the stocks involved. I am speaking of Zoom and Pfizer. They rode the wave up, up, and up. But at some point the growth stopped (went negative in the case of PFE). Investors that were in at any point during the huge wave up lost all those gains on the way back down.

I know AI is the current “big wave” but does it really translate to outsized growth for years? I will point out there was plenty of reason to believe ZM was going to keep growing too (nobody was going back in the office, etc).

Seems to me that it’s a relatively risky time to be buying in. But as a caveat, I’ve been 100% wrong on this one (actually more like 1000% wrong).



@rhill0123 when I set up my position in NVDA in Fall 2018, I didn’t know what form the wave would take or when it would arrive. I just believed that the next major advance in computing would come from parallel processing, a technology I’d been introduced to in 1977.

In contrast, when I bought into ZM in Summer 2019, I just knew that their offering was greatly superior to the competition, but the company was far less mature and so my position was much smaller.

As it happened, Zoom took off, but the barriers to entry for competitors turned out to be much lower than I understood, and I started exiting in early 2021. The CEO seemed very smart and they had a ton of cash, so I didn’t fully exit until year end of 2022, when it became apparent to me that they had a long road ahead to turn the business around.

The barriers to competitors are ever so much higher for NVDA than they are for Zoom. In the Zoom case, Microsoft could throw a bunch of programmers at the problem and come up with a solution that, while it’s not better than Zoom’s, is at least decent to use.

A competitor to NVIDIA needs expertise in at least a dozen different major skill areas, and many of them are quite rare skillsets – you can’t just put a “help wanted” ad out there to get the expertise you need.

It’s also worth noting that while Zoom sells direct, NVIDIA provides tools that then get built into useful services and products by others. That’s a much longer and more involved Go To Market process. Where Zoom’s GTM cycle is measured in days, NVIDIA’s is measured in years.

This is why I’ve been on the prowl for potential “holes in NVIDIA’s armor” – to understand what the future might hold and how long it might take before serious competition emerges, and NVDA’s incredible pricing power is lessened. Even then, I fully expect NVDA’s business to remain strong, in the same way that Intel trounced AMD and all other comers for many years before finally losing its way.

So for someone who has been on the sidelines but is now wondering if now is a good time to jump into a position with NVDA, I’d say:

  1. you should be wary of the advice from someone who’s already in – their perspective is likely to be warped by their experience;
  2. that said, I believe that the market for AI technology is “a big open mouth”, as one of my former colleagues would say – an enormous source of demand that will continue for years;
  3. I look at the forward P/E of NVDA today, and it reminds me of the long stretch post-iPhone when Apple’s P/E was well below the market. Can you tell I don’t believe in the “efficient market” hypothesis?



I believe it’s a mistake to use past other companies’ stock performance as some sort of predictor for Nvidia without looking at the underlying business and environment.

I struggle to find a proper comparison with Zoom:

  1. Zoom was going against major established competition from Microsoft and Cisco. Nvidia is already the leader by far in its market.
  2. Zoom rode a short-term wave of pandemic-driven retail customers to its products. From all accounts, including the McKinsey report I linked to in another thread, AI is a long-term sea change that will affect almost every aspect of how businesses are run.
  3. Zoom’s TAM was essentially limited to personal use as the entrenched competition for medium to large businesses had moats and more complete offerings (Office 365) to keep customers from switching.
  4. Zoom hit a high percentage of its TAM (Total Available Market) quickly, thanks to cloud scaling. The TAM for high powered AI chips, on the other hand, is still growing, and will likely continue to grow for at least a couple of years as businesses (and countries) adopt more AI into their processes.
  5. Unlike companies such as Alteryx, Nvidia is aggressively moving into cloud services.

Yes. Read the McKinsey report. The AI wave will easily be a decade long. The real questions are which companies will maintain or gain leadership, and in what areas. There have been comparisons to the spread of the internet, in which hardware companies like Cisco were the initial beneficiaries, and connectivity portal software (AOL, Yahoo) were next. Then database (Oracle), search (Google) and social (MySpace, Facebook), and then eCommerce (Amazon). Then back to hardware with phones and tablets, etc. And so on.

The best argument against Nvidia right now, IMO, is that it’s riding an early hardware aspect of AI, but that the next rising aspect will be something like AI software. Whether that’s some kind of infrastructure play (cloud), or software applications is hard to say right now. I personally would avoid software infrastructure plays, as we’ve seen in the past that the most successful products in software infrastructure all start out as Open Source, and only when they mature is there a possibility of building a profitable business around them - but that’s tough to achieve (look at MDB versus Confluent).

Back to AI, the aspects that see high growth and profits will certainly not continue to be the hardware forever. That said, we are early in this wave and there’s certainly another year or two of hardware growth in AI and no reason to believe Nvidia won’t continue to be dominant. And Nvidia is already moving into new areas, such as cloud services, software consulting, etc. They’re not going to get caught stuck in hardware land like Cisco did.


Hi Smorg,

I so appreciate your contributions here, I hesitate to ask for clarification.
However, when you wrote…

the next rising aspect will be something like AI software. Whether that’s some kind of infrastructure play (cloud), or software applications is hard to say right now. I personally would avoid software infrastructure plays, as we’ve seen in the past that the most successful products in software infrastructure all start out as Open Source, and only when they mature is there a possibility of building a profitable business around them - but that’s tough to achieve (look at MDB versus Confluent).

I’m puzzled a bit here. Are you suggesting that software application companies are somehow safer bets than software infrastructure companies, for investing purposes?

If so, where’s the line separating this as a consideration?

For example, Snowflake provides the platform, Snowpark, and many of the tools to build applications for developers. And, Snowflake charges for consumption of the software applications by the Enterprises utilizing that which is created by developers on the Snowflake platform.

Do I have that right by your understanding?

I do understand that LLMs will collapse much of the consumer facing App Layer; but, I don’t believe this is what you’re saying. Perhaps the opposite?




No, I’m saying that software infrastructure tends to be based on Open Source for the simple reason that developers want free, even when paid options are better (the little-known example that pops into my mind: even Tesla chose Linux over QNX). About the only thing as hard as selling something that competes with Open Source is selling something that is based on Open Source (only MDB pops to mind as successful there).

For me that’s both small potatoes and so-today. What I’m talking about is finding the next wave in AI after the hardware picks-and-shovels growth slows. What’s the AWS for AI? It’s not Nvidia’s DGX Cloud, which basically makes their HGX hardware available on the cloud. It’s maybe some kind of AI that writes, deploys, and tests AI software for you. If I knew what it was, I’d start a company to do that, lol. And beyond that, what’s the next Facebook or iPhone?

But to be clear, just as Cisco and Yahoo were great investments for years, so I believe will Nvidia be. I don’t see anything changing in the next 18-24 months to upend Nvidia’s total dominance. Whether it’s India’s little-known Yotta Data Services committing to buy $1B of Nvidia chips by itself, or completely-vertically-integrated-and-do-everything-themselves Tesla buying huge amounts of Nvidia chips for its Dojo computers despite announcing their own AI chip years ago, no one can compete with Nvidia for the best AI hardware. And since we’ve just seen the best that AMD and Intel have to offer via recent announcements, that isn’t changing for at least a chip development cycle (2 years), in my estimation.

I think the next wave might be some kind of software infrastructure to develop and deploy AI applications. As investors, the trick is finding the next AWS (proprietary) and not the next Linux or Apache (Open Source). The good news is we have time to see how things play out at the universities and in the market.