Long Form Interview with Nvidia CEO Jensen Huang

Definitely worth a watch for potential and current Nvidia investors.

For instance, take the section where they discuss the impact of inferencing today.

Huang points out that you want your fastest compute to train your models. But as you buy more and more compute, you end up with older compute that’s slower than the new stuff that’s come out. That old compute is great for inferencing, since Nvidia ensures compatibility (through CUDA) across its product line as it develops new processors. This appears to be one reason why Blackwell isn’t Osborning Hopper (i.e., killing Hopper sales by announcing its successor, the classic Osborne effect). But another reason is that CUDA keeps getting better, so the older Hoppers will actually perform better next year than they did this year!
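For the technically curious, here’s a minimal sketch (mine, not from the interview) of what that compatibility guarantee looks like at the code level. The idea: a kernel built with PTX embedded can be JIT-compiled by the driver for a GPU generation that didn’t exist at build time. The file name and sizes below are made up for illustration:

```cuda
// Hypothetical file scale.cu, illustration only.
// Compiling with PTX embedded (e.g. nvcc -gencode arch=compute_90,code=compute_90)
// lets the CUDA driver JIT-compile this same kernel for GPUs that didn't
// exist when the binary was built.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float* x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
    if (i < n) x[i] *= a;
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    // Reports e.g. compute capability 9.0 on Hopper; a newer card reports a
    // higher number, but the same binary still loads and runs.
    printf("%s (compute %d.%d)\n", prop.name, prop.major, prop.minor);

    const int n = 1 << 20;  // 1M floats, arbitrary size
    float* x;
    cudaMalloc(&x, n * sizeof(float));
    scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);
    cudaDeviceSynchronize();
    cudaFree(x);
    return 0;
}
```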

So the concern that some have had about Nvidia not leading in inferencing may be incorrect. Buy the chips you need for training today, and when you buy better chips tomorrow, re-purpose the older chips for inferencing - and they’ll perform even better than when you first bought them. Damn good sales pitch if you ask me.

40 Likes

BTW, the Huang episode is so information-dense that the BG2 guys just did a whole episode on the ramifications of what they heard:

Less technical, more investing-related.

20 Likes

It would be an exaggeration to say I am trying to wrap my head around this. It seems to boggle their minds, whereas I don’t even know where to begin…

Slowly, even the sceptics are starting to think that maybe, just maybe, Nvidia has a very long runway ahead of it.

6 Likes

In the first video, Huang talked a bit about Nvidia’s software prowess. In the second video, the BG2 guys put up a slide showing the company’s “Full Stack” (probably taken from an Nvidia presentation):

The topmost box contains Nvidia libraries: CUDA, cuDNN, TensorRT, etc. Huang talked about how Nvidia’s software isn’t just CUDA, but a number of algorithmic libraries that plug into other software offerings, like PyTorch and TensorFlow, to accelerate development and runtime performance on Nvidia hardware. This was/is needed since programming for parallel computing is quite different from single-threaded processing. The take-away for us investors, however, may simply be that efforts to replace CUDA may not be enough, given all the other accelerating software Nvidia has developed for its ecosystem.
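To make “plug into” a little more concrete, here’s an illustrative sketch (mine, not from the video) of the layer those frameworks sit on. PyTorch and TensorFlow don’t hand-roll their matrix multiplies; they dispatch to Nvidia libraries like cuBLAS, which means a library upgrade can speed up the very same call on the very same chip:

```cuda
// Illustrative only: a direct cuBLAS call, the kind of library routine
// PyTorch/TensorFlow dispatch to under the hood. Build: nvcc gemm.cu -lcublas
#include <cublas_v2.h>
#include <cuda_runtime.h>

int main() {
    const int n = 512;  // square matrices, for simplicity
    float *A, *B, *C;   // left uninitialized; we only care about the call path
    cudaMalloc(&A, n * n * sizeof(float));
    cudaMalloc(&B, n * n * sizeof(float));
    cudaMalloc(&C, n * n * sizeof(float));

    cublasHandle_t handle;
    cublasCreate(&handle);

    // C = 1.0*A*B + 0.0*C. cuBLAS picks the fastest kernel it knows for this
    // GPU, so a newer library version can make this same call run faster.
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                n, n, n, &alpha, A, n, B, n, &beta, C, n);
    cudaDeviceSynchronize();

    cublasDestroy(handle);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}
```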

I still go back to the bit about how, thanks to Nvidia’s advancing software, their older chips will run faster now than when originally purchased.

BTW, the Full Stack diagram also helps explain how the company’s hardware offerings fit together. Blackwell is their current GPU, Grace is their current CPU (based on ARM), and “Grace Blackwell”, aka the GB200, is both of those on one board. And then all the networking: one path for ConnectX/InfiniBand/NVLink, and one path for BlueField-3/Spectrum-X Ethernet/C2C&DSPs. And then Nvidia as a systems company, not just chips, with a purchasable box (the GB200 NVL72) ready to install, or reference designs for building your own custom liquid cooling and power budgeting.

Companies like AMD and Intel are trying to catch up to Nvidia on GPUs, but even if they do eventually succeed (hah!) in reaching that moving target, they don’t supply the other parts of the build-out needed. While Huang gives huge props to Elon for xAI’s rapid build-out, what made that possible was the complete turn-key systems Nvidia delivers today.

22 Likes

Right now, at a macro view, the bottlenecks I see for HPC are:

  1. Supply issues: even Huang could not anticipate the magnitude of GPU etc. demand or exactly when it would hit
  2. Liquid cooling: facilities, equipment, and expertise. It seems to me pretty much everyone was caught by surprise at the magnitude of demand and how quickly it hit.
  3. Network performance bottlenecks at the link and switch layers. This is where $ALAB fits in.

As a side note, my thesis for $ALAB in a nutshell is that they find and alleviate the (very) few HPC bottlenecks that Huang did not anticipate. There aren’t many such bottlenecks, but if they are severe enough and if $ALAB has uniquely effective solutions, then $ALAB should do well.

11 Likes

Hi Smorg,

I appreciate your breaking down the Nvidia ‘full stack’ a little above.

When you said…
The topmost box contains Nvidia libraries: CUDA, cuDNN, TensorRT, etc. Huang talked about how Nvidia’s software isn’t just CUDA, but a number of algorithmic libraries that plug into other software offerings, like PyTorch and TensorFlow, to accelerate development and runtime performance on Nvidia hardware.

What I’m most interested in at this time, as an investor, is the horizontal layer just under the one you mentioned.

You wrote here on 5/24/23 that your expectation of how the AI story unfolds is in three waves:
The first wave is chips, second wave is infrastructure & devices, and then the last wave, which is the largest wave, is the software and services sector.

While I believe this won’t map straightforwardly to the AI world (what new devices are needed? and infrastructure here is the server-side deployment of GPUs), I do think that we’re just starting to see what the software and services side of AI will be, and that it will eventually be bigger than the chips/infrastructure side.

That would mean: Nvidia today for Wave 1 (chips); infrastructure companies like SMCI, AWS, Azure, GCP, etc. for Wave 2 (possibly Crowdstrike over endpoints as infrastructure); and then the software companies. So, some patience might be warranted, or one could invest more in Wave 1 and Wave 2 companies today, being ready to move into the software side (Wave 3) in a year or so.

I’m thinking this horizontal layer (the grey layer under the top grey layer you spoke of) is where the primitives of the software wave may now sit, almost as a kind of infrastructure. Back in May of 2023, I don’t think we understood what was going to unfold as well as we do now. Would you agree? And if so, would you add to this layer Databricks, Snowflake, and perhaps Palantir (CRM is betting the company on agents, and the President of Product & Engineering at ServiceNow went to Cloudflare)? Because I believe we now understand that millions of AI agents are going to be automating companies, and the 3rd wave you referenced in 2023 is just that. Maybe?

Edit:
I believe Nvidia is providing the primitives for building AI agents here; see the Accenture partnership interview I wrote about recently.

With Accenture’s 30,000-employee GTM, Nvidia is going after Enterprise AI and Industrial AI with this AI agent development platform called ‘AI Refinery’.

Best

Jason

17 Likes

Huang describes it here:

(starts at 14:20).

After a brief example of how Nvidia’s GPUs differ from traditional CPUs (like Intel’s x86), Huang talks about replacing OpenGL, a library for rendering 2D and 3D graphics. Nvidia needed a replacement because OpenGL is single-threaded, not built for parallel processing. So, yes, the software in that box sits under PyTorch and the other frameworks and languages.
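If you want to see that difference in miniature, here’s a hypothetical “saxpy” written both ways (my illustration, not Nvidia’s code). The single-threaded version walks the array one element at a time; in the CUDA version the loop disappears and each element gets its own thread:

```cuda
#include <cuda_runtime.h>

// Single-threaded style: one core walks the array, one element per step.
void saxpy_cpu(int n, float a, const float* x, float* y) {
    for (int i = 0; i < n; ++i) y[i] = a * x[i] + y[i];
}

// Parallel style: no loop; each GPU thread computes exactly one element,
// and the GPU runs thousands of them at once.
__global__ void saxpy_gpu(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));  // unified memory, for brevity
    cudaMallocManaged(&y, n * sizeof(float));
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy_gpu<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);  // y = 2x + y
    cudaDeviceSynchronize();

    cudaFree(x); cudaFree(y);
    return 0;
}
```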

That’s different from the application layer, which is built on top with those tools. The applications are things like ChatGPT, for instance. Companies like Salesforce, Snowflake, Databricks, etc. are providing additional infrastructure (mostly database). The “agents” you talk about are the applications being run to do something useful. It seems to me that Accenture and Salesforce are trying to build infrastructure on which AI applications can be built, hopefully more easily than programming directly via CUDA or PyTorch.

It’s going to take time for such infrastructure efforts to become standardized, and I think that’s going to happen in parallel with applications (or agents I guess) being developed for people (and other agents) to use.

From that press release:

Accenture’s formation of a new NVIDIA Business Group will help clients lay the foundation for agentic AI functionality using Accenture’s AI Refinery™️, which uses the full NVIDIA AI stack—including NVIDIA AI Foundry, NVIDIA AI Enterprise and NVIDIA Omniverse—to advance areas such as process reinvention, AI-powered simulation and sovereign AI.

This is Accenture building a framework on which to develop AI applications. That framework is itself built on top of Nvidia’s software and hardware. Whether that framework or someone else’s becomes a standard, much less “the” standard, is very far from being settled.

And while ChatGPT is one useful AI application, there are going to be thousands of applications doing all sorts of things. Some will be doing things programmatic software does today, only better. Others will be doing things that are not being done by machine today, but by humans, and still others will be doing all new things.

Pretty exciting, but I don’t know how to know where all that is going to go, at least at this point in time.

23 Likes