Intel took a meaningful step backwards when it introduced efficiency cores, which forced it to drop the years of advances that AVX-512 represents. From what I've read, a chip mixing cores where only a subset had full instruction set support proved impractical. I guess when you are in a tight spot you play the hand you dealt yourself, and I imagine they will bring these capabilities back when they can, but again this created an advantage for AMD. I don’t know enough to judge how big an advantage, but in, e.g., server products, to put out a product that will live five or more years in the field with a feature regression like this… Ouch.
The y-cruncher developer expresses disappointment in Intel's decision to remove the instructions, pointing to several resulting drawbacks and adding that the newer chips are difficult to optimize for:
I've been asked a number of times about why I haven't done any optimizations for recent Intel processors. The latest Intel processor which y-cruncher has optimizations for is Tiger Lake which is 2 generations behind the latest (Raptor Lake). And because Raptor Lake lacks AVX512, it can only run a binary going all the way back to Skylake client (circa 2015).
[…]
Removing AVX512 is a huge step back in more ways than just the instruction width. It also removes all the other (non-width) functionality exclusive to AVX512 such as masking, all-to-all permutes, and increased register count. From a developer perspective, this is very discouraging since most of the algorithms I've been working on since 2016 have been heavily influenced by (if not outright designed for) AVX512.
The lack of AVX512 is likely why Tiger Lake and Rocket Lake outperform Alder Lake in single-threaded benchmarks where memory bandwidth and core count are not a factor.
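For anyone unfamiliar with the "masking" mentioned in the quote, here is a minimal scalar sketch in C of what a merge-masked AVX-512 add does per lane. It models the documented behavior of the `_mm512_mask_add_epi32` intrinsic with a plain loop, so it runs on any CPU. The names `masked_add_model`, `tail_mask` and `LANES` are made up for illustration, and nothing here is taken from y-cruncher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16 lanes of 32 bits = one 512-bit register (hypothetical constant). */
#define LANES 16

/* Scalar model of a merge-masked add: for each lane i, write a[i] + b[i]
   if bit i of mask is set, otherwise keep src[i]. Illustrative only. */
static void masked_add_model(int32_t dst[LANES], const int32_t src[LANES],
                             uint16_t mask,
                             const int32_t a[LANES], const int32_t b[LANES])
{
    for (size_t i = 0; i < LANES; i++) {
        int active = (mask >> i) & 1u;
        /* Add as unsigned so overflow wraps, as the vector add does. */
        dst[i] = active ? (int32_t)((uint32_t)a[i] + (uint32_t)b[i]) : src[i];
    }
}

/* Build a mask that enables only the low n lanes, e.g. for the last
   partial chunk of an array whose length is not a multiple of 16. */
static uint16_t tail_mask(unsigned n)
{
    assert(n <= LANES);
    return (n == LANES) ? (uint16_t)0xFFFF : (uint16_t)((1u << n) - 1u);
}
```

With a mask from `tail_mask(3)`, only lanes 0 to 2 are updated and the other 13 keep whatever was in `src`. On AVX2, the same effect typically takes extra blend instructions or a scalar cleanup loop, which is part of why losing masking hurts beyond the narrower registers.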
With its latest version 0.8.x, y-cruncher notes some significant performance boosts on Ryzen 7000 when computing Pi. It says AVX-512 can provide anywhere from 23% to 31% improvement compared to the previous version.
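As a rough illustration of why the same program can behave so differently across these chips, here is a minimal sketch in C of runtime dispatch: check what the CPU reports and pick the widest code path available. It assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available, and falls back to a baseline path elsewhere. The `enum kernel` values and function names are hypothetical, and this is not a description of how y-cruncher actually selects its binaries.

```c
/* Hypothetical code paths a program might ship, for illustration only. */
enum kernel { KERNEL_BASELINE, KERNEL_AVX2, KERNEL_AVX512 };

/* Pick the widest path the running CPU reports support for. */
static enum kernel pick_kernel(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f"))
        return KERNEL_AVX512;   /* e.g. a Ryzen 7000 or Tiger Lake part */
    if (__builtin_cpu_supports("avx2"))
        return KERNEL_AVX2;     /* e.g. a part without AVX-512 */
#endif
    return KERNEL_BASELINE;     /* non-x86 or unknown compiler */
}

/* Human-readable label for logging which path was chosen. */
static const char *kernel_name(enum kernel k)
{
    switch (k) {
    case KERNEL_AVX512: return "avx512";
    case KERNEL_AVX2:   return "avx2";
    default:            return "baseline";
    }
}
```

A real AVX-512 kernel usually needs more than the foundation subset checked here (AVX-512 is split into several feature flags), so treat the single `"avx512f"` test as a simplification.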