Power10: Still great for some workloads


Both the IBM Power and Intel Xeon SP roadmaps were delayed by foundry issues: IBM by the 10 nanometer and then 7 nanometer processes from GlobalFoundries, neither of which made it to market, and Intel by its own 10 nanometer problems. So things are a bit out of whack compared to historical trends. But, in terms of raw performance, the 2X rule for Power cores is holding, as is attested by the performance ratings for the Power E1050 shown above.

The Power E1050, with its Power10 cores running at 3.15 GHz, has a quad of Power10 dual chip modules (DCMs) for a total of 96 cores, and has a SPECint_rate_2017 integer throughput rating of 1,580. The four-socket Inspur machine equipped with four Xeon SP-8380H Platinum processors running at 2.9 GHz has a total of 112 cores, and is rated at only 846 on the SPEC integer throughput test. If you do the math, the Power E1050 offers 1.9X the performance at the system level (which is due entirely to shifting to DCMs) and each core has 2.2X the performance if you divide the integer throughput by the number of cores.
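The math above can be checked with a few lines of Python. This is only a sketch using the ratings and core counts quoted in the paragraph; the variable and function names are ours, not anything from IBM or SPEC.

```python
# Ratings and core counts as quoted in the text above.
power_e1050 = {"score": 1580, "cores": 96}    # four Power10 DCMs
xeon_4s = {"score": 846, "cores": 112}        # four Xeon SP-8380H sockets


def per_core(system):
    """Integer throughput divided by the number of cores."""
    return system["score"] / system["cores"]


# System-level ratio: total throughput of one box versus the other.
system_ratio = power_e1050["score"] / xeon_4s["score"]

# Per-core ratio: throughput per core of one box versus the other.
core_ratio = per_core(power_e1050) / per_core(xeon_4s)

print(f"system-level ratio: {system_ratio:.2f}X")
print(f"per-core ratio:     {core_ratio:.2f}X")
```

Rounded to one decimal place, the two ratios come out at the 1.9X and 2.2X figures cited.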

Interestingly, an eight-socket HPE Superdome Flex 280 server using the same Cooper Lake processors is rated at 1,620 on the SPECint_rate_2017 test (that’s a peak rating with tuning). The eight-socket Xeon SP server has 2.5 percent more performance, but it takes 2.3X as many cores to get there. If enterprises are using software that is priced by the core – as database and datastore software often is – then the Xeon SP server is no doubt cheaper, but the software is going to be more expensive unless vendors give the X86 architecture a 50 percent price break.
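To put rough numbers on the per-core pricing point, here is a small Python sketch. It assumes 28 cores per Xeon SP-8380H socket (which follows from the 112 cores in the four-socket machine) and, as a simplification of ours, that software cost scales linearly with core count at one list price per core. Real licensing schemes vary, so treat the discount figure as illustrative only.

```python
# Figures quoted in the text; 28 cores per socket is derived from
# the four-socket machine (112 cores / 4 sockets).
power_score, power_cores = 1580, 96
xeon_score, xeon_cores = 1620, 8 * 28

# How much more throughput and how many more cores the eight-socket box has.
perf_uplift = xeon_score / power_score - 1
core_multiple = xeon_cores / power_cores

# Assumption: software is licensed per core at the same list price.
# Cores needed per unit of throughput is then a proxy for software cost.
power_cores_per_unit = power_cores / power_score
xeon_cores_per_unit = xeon_cores / xeon_score

# Discount on the per-core price that would make the Xeon software
# bill equal to the Power bill for the same amount of work.
break_even_discount = 1 - power_cores_per_unit / xeon_cores_per_unit

print(f"performance uplift:  {perf_uplift:.1%}")
print(f"core multiple:       {core_multiple:.1f}X")
print(f"break-even discount: {break_even_discount:.0%}")
```

Under these assumptions the break-even discount lands a bit above 50 percent, which is in the neighborhood of the price break mentioned above.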

The Power E1050 stacks up similarly on the SPECjbb stock trading benchmark that is based roughly on the TPC-E benchmark. At the maximum throughput, the IBM Cirrus system offers 2.1X the performance per core and 1.8X more performance per system than a four-socket Intel Cooper Lake machine.

On the SAP Sales and Distribution (SD) benchmark test, which has been used for two decades to gauge relative transaction processing performance of enterprise systems, the performance gaps are a little bit bigger. Take a look:

This time, IBM pulled out benchmark tests from Dell on its PowerEdge R840, which was equipped with four “Cascade Lake” Xeon SP-8280 processors running at 2.7 GHz, and from HPE on the Superdome Flex 280 using eight “Cooper Lake” Xeon SP-8380H processors running at 2.9 GHz. The Dell machine was running Linux and the SAP ASE database and the HPE machine was running Windows Server 2016 and SQL Server 2012. These software releases are a little old, but those tests were done by the vendors, not by IBM.

In any event, on the SAP SD test, the Power E1050 had 1.9X the system throughput and 2.3X the per-core performance of the four-socket Xeon SP machine. (Obviously, an Ice Lake system would have closed some of that gap, and so will a Sapphire Rapids NUMA machine.) But against the big eight-way machine, even with its core improvements and clock speed boosts, the IBM machine is delivering 2.6X the performance per core, and 1.9X more system throughput on the SAP test.

Why IBM is not banging the drum about this is beyond us. . . . Probably because someone would bring up the performance of AMD Epyc processors. But the per-core pricing is even worse for AMD for systems software.