What do you suppose Nvidia has in the lab that they’ve not made public? My guess is they already have, in prototype form, chips that perform 3x, 5x or more beyond what they have in production.
The issue is not that Nvidia can’t also build chips that perform as well as Habana Labs’. The issue is that the AI inference chip market looks to be developing in a way in which the market fractures and the chips become commoditized.
GPUs do not seem to be an area that has become totally commoditized. Nvidia can charge premium pricing in many areas in which GPUs get used because Nvidia has a narrow moat around GPUs thanks to CUDA and other technical advantages. Not only is Nvidia considered to make the best GPUs, but it often locks customers in to using Nvidia chips. It’s not impossible for customers to switch to AMD, but it can be a hassle…enough of a hassle for Nvidia to have a narrow moat.
Many people have assumed, and still assume, that Nvidia will become as dominant in AI chips as it is in GPUs, because they assume that the GPU will remain the dominant AI chip well into the future. The problem is that the GPU has deficiencies that make it likely only a short-term answer for AI applications, not a long-term answer.
GPUs have become dominant in AI training, but slowly and surely it’s becoming obvious that the GPU will likely not dominate inference. That’s not just me saying that…industry experts have been saying it, and the actions of various players that will use AI inference chips, from hyperscalers like Google, Facebook, and Microsoft to companies intending to play at the edge, have been indicating it.
If people truly look at the Facebook Glow project https://facebook.ai/developers/tools/glow , at its essence Glow does for AI inference chip makers what Android does for hardware manufacturers not named Apple. Android relieves manufacturers of having to build an operating system for their hardware; Glow, in similar fashion, relieves hardware manufacturers of the need to build their own compiler software, letting them instead rely on Glow, a community-driven open source compiler.
The end result for customers of AI chips is that it prevents, or at least makes it harder for, AI chip companies to lock in customers, because in theory it makes AI chips easier to replace.
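To make the Android analogy concrete, here is a toy sketch of the division of labor a shared compiler enables. All names and the tiny "IR" here are hypothetical for illustration; Glow's real intermediate representation and backend interface are considerably richer. The point is only that the framework lowers high-level ops to a small primitive set once, and each chip vendor supplies just a backend for those primitives, so chips become easier to swap:

```python
from typing import Callable, Dict, List

# Shared lowering pass: high-level graph ops -> primitive IR.
# This is written once, in the common compiler, for every vendor.
def lower(graph: List[str]) -> List[str]:
    lowering_rules = {
        "fully_connected": ["matmul", "add"],  # FC layer = matmul + bias add
        "relu": ["max0"],                      # ReLU = elementwise max with 0
    }
    ir: List[str] = []
    for op in graph:
        ir.extend(lowering_rules.get(op, [op]))
    return ir

# Each vendor implements only the small primitive set for its own silicon.
def make_backend(vendor: str) -> Dict[str, Callable[[], str]]:
    return {prim: (lambda p=prim: f"{vendor} executes {p}")
            for prim in ("matmul", "add", "max0")}

graph = ["fully_connected", "relu"]
ir = lower(graph)  # the same IR is handed to every backend
for vendor in ("vendor_a_chip", "vendor_b_chip"):
    backend = make_backend(vendor)
    print([backend[prim]() for prim in ir])
```

Swapping chips then means swapping the backend, not rewriting the model or the compiler, which is exactly the lock-in-reducing effect described above.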
Some news about Facebook Glow:
At Facebook’s 2018 @Scale conference in San Jose, California today, the company announced broad industry backing for Glow, its machine learning compiler designed to accelerate the performance of deep learning frameworks. Cadence, Esperanto, Intel, Marvell, and Qualcomm committed to supporting Glow in future silicon products.
Why would Facebook create Glow? I would guess that Facebook has no desire to be locked into buying chips from only one vendor. Currently Nvidia’s data center inference chip costs almost $9,000. What company in its right mind wants to be locked into paying Nvidia $9,000 per chip?? Totally not scalable.
Facebook and other hyperscalers desire an environment in which competition drives chip prices down into the hundreds of dollars instead of the thousands of dollars. Facebook and other hyperscalers want to easily switch to the newest chips as they become available and not depend on only one company to technologically advance the field. The hyperscalers do not want to be locked in.
With big companies like Cadence, Esperanto, Intel, Marvell, Nvidia, and Qualcomm, around 50 start-ups, and some “customers” (like Google, Apple, etc.) deciding to build their own inference chips, the AI inference market will likely fracture.
The problem with the GPU as an inference chip is that a pure GPU cannot compete with the TPU at inference. Nvidia says as much in so many words, because Nvidia’s inference chips now include Tensor Cores (an idea “borrowed” from Google, as TPU stands for Tensor Processing Unit):
NVIDIA® Tesla® GPUs are powered by Tensor Cores, a revolutionary technology that delivers groundbreaking AI performance. Tensor Cores can accelerate large matrix operations, which are at the heart of AI, and perform mixed-precision matrix multiply and accumulate calculations in a single operation. With hundreds of Tensor Cores operating in parallel in one NVIDIA GPU, this enables massive increases in throughput and efficiency
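The “mixed-precision matrix multiply and accumulate” in that quote can be sketched in NumPy. This is an illustrative software model of the arithmetic pattern, not Nvidia’s hardware API; the 4x4 tile size is an assumption mirroring the granularity of a single Volta Tensor Core operation, which computes D = A×B + C with half-precision inputs and full-precision accumulation in one hardware step:

```python
import numpy as np

def tensor_core_style_mma(a, b, c):
    """Model of a mixed-precision multiply-accumulate: D = A @ B + C.
    Inputs are stored in fp16, but products are accumulated in fp32
    to preserve accuracy. Real Tensor Cores fuse this into one op."""
    a16 = a.astype(np.float16)  # half-precision input matrices
    b16 = b.astype(np.float16)
    # upcast to fp32 for the multiply-accumulate, as the hardware does
    return a16.astype(np.float32) @ b16.astype(np.float32) + c.astype(np.float32)

# 4x4 tiles: the shape of one assumed Tensor Core operation
a = np.eye(4)                 # identity
b = np.full((4, 4), 2.0)      # all twos
c = np.ones((4, 4))           # fp32 accumulator holding ones
print(tensor_core_style_mma(a, b, c))  # every entry is 1*2 + 1 = 3.0
```

Halving the storage precision while keeping a full-precision accumulator is what lets the hardware pack many more of these units in parallel, which is where the throughput claim in the quote comes from.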
I have seen some call that the “TPU-ification of the GPU”. Now, while Nvidia might have a moat around its GPU technology, if and when it comes onto the field and competes with other players using technology similar to Google’s and Habana Labs’, it won’t have that same advantage. Nvidia will likely then be competing on a level playing field.
I think the scenario of Nvidia locking lots of customers into its AI inference platform and becoming the “Intel of AI chips” has a low likelihood of happening. The dreams of Nvidia selling AI chips for thousands of dollars at huge margins for long periods of time…ummmm…well, I would not model that into any projections about Nvidia, because I view it as speculative to the point of being unlikely.
Will Nvidia be a player in AI chips? Almost certainly, but Nvidia might not gain as much from AI as the hype suggests. The hype suggests little competition in AI far into the future, while the reality might be that in three to five years a vastly different competitive scenario is playing out…