NVDA: Cerebras, AI start-up

http://www.barrons.com/articles/cerebras-founder-feldman-con…

Cerebras Founder Feldman Contemplates the A.I. Chip Age

…Our discussion focused on how what he, and others, are working on will, he expects, move past the status quo of both microprocessors from Intel (INTC) and GPUs from Nvidia (NVDA).

The problem is that Intel CPUs have for decades been built to be optimal for “complex logic operations.” Machine learning, by contrast, consists of many relatively simple logic operations that need to interact. Connectivity, not the complexity of each arithmetic operation, is the critical factor, given that machine learning methods such as “convolutional neural networks” involve recursion, feedback, and many ways in which computations in one stage feed into computations elsewhere in the process.
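To make that concrete, here is a minimal sketch (my illustration, not from the article) of why a convolutional layer is simple on arithmetic but intense on communications: each output element is a trivial multiply-accumulate, and the whole array of results immediately becomes the input to the next stage.

    # Minimal sketch (illustration only, not from the article): a 2D
    # convolution is many trivially simple multiply-accumulates whose
    # results immediately feed the next layer.
    import numpy as np

    def conv2d(image, kernel):
        kh, kw = kernel.shape
        oh = image.shape[0] - kh + 1
        ow = image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                # Each output is a small sum of simple products; the hard
                # part is moving these results to wherever they go next.
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    layer1 = conv2d(np.random.rand(8, 8), np.random.rand(3, 3))
    layer2 = conv2d(layer1, np.random.rand(3, 3))  # outputs feed the next stage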

“CPUs are remarkably poor,” says Feldman, at communicating with one another, in terms of moving the results of logic operations from one CPU to another. Despite years of adding cache memory and widening the bus to memory, the CPU is still an overly complex part that doesn’t communicate well.

“You want something that is simple on compute, on arithmetic, and very intense on communications,” says Feldman.

GPUs, such as Nvidia’s “Volta,” improve on this in some ways: they consist of many compute units, sometimes thousands, stripping away complexity. But they were not designed principally to communicate between individual logic units within the GPU. Over time, Nvidia has added on-chip communications abilities, adapting a part that was designed for rendering pixels into something that can communicate between functional elements.
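As a rough illustration of that rendering heritage (my example, not Feldman’s): GPU throughput comes from applying the same simple operation to thousands of independent elements at once, with no communication between them required.

    # Sketch (illustration only): the dense, regular pattern GPUs were
    # built for, one tiny operation applied independently to every
    # element, with no cross-element communication.
    import numpy as np

    pixels = np.random.rand(1920 * 1080)  # one value per "pixel"
    shaded = pixels * 0.5 + 0.25          # same simple op on every element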

But the issue lies deeper. The principal mathematical problem for machine learning, says Feldman, is one of “sparse matrices,” meaning a matrix in which many elements are zero. A microprocessor wastes a lot of effort on a sparse matrix. “What is the point of multiplying by zero?” asks Feldman rhetorically. A GPU is a bit smarter, but it is really optimized for “dense” matrix manipulation, he says.
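A quick way to see the waste (my sketch, assuming SciPy’s standard sparse-matrix support): a dense matrix-vector product multiplies every element, zeros included, while a compressed-sparse-row product touches only the nonzeros.

    # Sketch (illustration only): dense math multiplies by every zero;
    # sparse (CSR) math stores and multiplies only the nonzero entries.
    import numpy as np
    from scipy.sparse import csr_matrix

    rng = np.random.default_rng(0)
    A = rng.random((1000, 1000))
    A[A < 0.99] = 0.0             # roughly 99% zeros: a sparse matrix
    x = rng.random(1000)

    dense_y = A @ x               # 1,000,000 multiplies, most by zero
    sparse_y = csr_matrix(A) @ x  # only the ~10,000 nonzeros are multiplied

    assert np.allclose(dense_y, sparse_y)  # same answer, far less work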

A new chip is needed to handle sparse matrix math, and to emphasize communications between inputs and outputs of calculations. What that means specifically in the case of Cerebras will become clearer in the new year. But the overriding point Feldman makes is that “for the first time in a long time, the workloads that need to be done are demanding lots of innovation in chips,” which means semiconductors are more interesting in many ways than they have been in years…


But the issue lies deeper. The principal mathematical problem for machine learning, says Feldman, is one of “sparse matrices,” meaning a matrix in which many elements are zero. A microprocessor wastes a lot of effort on a sparse matrix. “What is the point of multiplying by zero?” asks Feldman rhetorically. A GPU is a bit smarter, but it is really optimized for “dense” matrix manipulation, he says.

Thanks for the link. My research thesis was on parallel solvers for sparse matrices, back in the early 2000s. I will certainly need to take a deeper look.
