The SW26010 was rated at 3.06 teraflops peak and ran at 1.45 GHz. If you did a die shrink on this chip from 28 nanometers, increased the core count by 50 percent, and doubled the vector width to 512 bits, all while keeping the clock speed the same, then you would create a device that delivered 9.2 teraflops. But this SW26010-Pro chip delivers 14.03 teraflops, and so the clock speed must have increased by 52.7 percent to 2.22 GHz to reach that performance level.
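The back-of-the-envelope math above can be sketched in a few lines of Python. This is only an illustration, and it assumes that peak flops scale linearly with core count, vector width, and clock speed; the function and constant names are ours, not anything from Sunway.

```python
# Figures quoted in the text for the two chips.
SW26010_PEAK_TFLOPS = 3.06
SW26010_CLOCK_GHZ = 1.45
SW26010_PRO_PEAK_TFLOPS = 14.03

CORE_SCALE = 1.5    # 50 percent more cores
VECTOR_SCALE = 2.0  # vector width doubled to 512 bits


def scaled_peak_tflops(base_peak, core_scale, vector_scale):
    """Peak at an unchanged clock, assuming linear scaling (our assumption)."""
    return base_peak * core_scale * vector_scale


def implied_clock_ghz(base_clock, same_clock_peak, actual_peak):
    """Clock needed to close the gap between scaled and actual peak."""
    return base_clock * actual_peak / same_clock_peak


same_clock_peak = scaled_peak_tflops(SW26010_PEAK_TFLOPS, CORE_SCALE, VECTOR_SCALE)
clock_ghz = implied_clock_ghz(SW26010_CLOCK_GHZ, same_clock_peak, SW26010_PRO_PEAK_TFLOPS)
clock_increase = clock_ghz / SW26010_CLOCK_GHZ - 1.0

print(f"Peak at same clock: {same_clock_peak:.2f} teraflops")
print(f"Implied clock: {clock_ghz:.2f} GHz ({clock_increase:.1%} higher)")
```

Under that linear-scaling assumption, the result lands at roughly 2.2 GHz and a clock bump of about 53 percent, close to the figures quoted above; small differences come down to where the rounding happens.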
Add it all up, and the 105-cabinet system tested on the BaGuaLu training model, with its 107,520 SW26010-Pro processors, had a peak theoretical performance of 1.51 exaflops. We like base 2 numbers and think that the OceanLight system probably scales to 160 cabinets, which would be 163,840 nodes and just under 2.3 exaflops of peak FP64 and FP32 performance. If it is only 120 cabinets (also a base 2 number), OceanLight will come in at 1.72 exaflops peak. But these rack scales are, once again, just hunches.
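For anyone who wants to play with the cabinet counts, here is a minimal sketch of the arithmetic. It assumes 1,024 single-socket nodes per cabinet, which is the ratio implied by 163,840 nodes in 160 cabinets, and the 14.03 teraflops peak per SW26010-Pro quoted earlier; the names are ours and purely illustrative.

```python
# Assumption: 1,024 nodes per cabinet, implied by 163,840 nodes / 160 cabinets.
NODES_PER_CABINET = 1024
# Peak per SW26010-Pro node, as quoted in the text.
PEAK_TFLOPS_PER_NODE = 14.03
TFLOPS_PER_EXAFLOP = 1_000_000


def system_nodes(cabinets):
    """Node count for a given number of cabinets."""
    return cabinets * NODES_PER_CABINET


def system_peak_exaflops(cabinets):
    """Peak theoretical performance in exaflops for a given cabinet count."""
    return system_nodes(cabinets) * PEAK_TFLOPS_PER_NODE / TFLOPS_PER_EXAFLOP


for cabinets in (105, 120, 160):
    print(
        f"{cabinets} cabinets: {system_nodes(cabinets):,} nodes, "
        f"{system_peak_exaflops(cabinets):.2f} exaflops peak"
    )
```

Swapping in a different cabinet count or nodes-per-cabinet figure shows how sensitive the exaflops estimate is to those hunches.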
If the 160-cabinet scale is the maximum for OceanLight, then China could best the performance of the 1.5 exaflops “Frontier” supercomputer being tuned up at Oak Ridge National Laboratory today and also extend beyond the peak theoretical performance of the 2 exaflops “Aurora” supercomputer coming to Argonne National Laboratory later this year. It might even edge past the “El Capitan” supercomputer going into Lawrence Livermore National Laboratory in 2023, which is expected to be around 2.2 exaflops to 2.3 exaflops according to the scuttlebutt.
We would love to see the thermals and costs of OceanLight. The SW26010-Pro chip could burn very hot, to be sure, and run up the electric bill for power and cooling, but if SMIC can get good yield on 14 nanometer processes, the chip could be a lot less expensive to make than, say, a massive GPU accelerator from Nvidia, AMD, or Intel. (It’s hard to say.) Regardless, having indigenous parts matters more than power efficiency for China right now, and into its future, and we said as much last summer when contemplating China’s long road to IT independence. Imagine what China can do with a shrink to 7 nanometer processes when SMIC delivers them – apparently not even using extreme ultraviolet (EUV) light – many years hence. . . .
The bottom line is that NRCPC, working with SMIC, has had an exascale machine in the field for a year already. (There are two, in fact.) Can the United States say that right now? No, it can’t. The United States is counting on its exascale machines to be more energy efficient – Frontier and El Capitan for sure, we shall see with Aurora – but we have no idea how computationally efficient any of these future machines really are.