ack_complete 5 months ago

AVX-512 also has a lot of wonderful facilities for autovectorization, but I suspect its initial downclocking effects plus getting yanked out of Alder Lake killed a lot of the momentum in improving compiler and library usage of it.

Even the Steam Hardware Survey, which is skewed toward upper-end hardware, shows only 16% availability of baseline AVX-512, compared to 94% for AVX2.
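
(For context, that gap is why AVX-512 paths still have to sit behind runtime dispatch rather than be assumed as a baseline. A minimal sketch with the GCC/Clang builtins -- the printed strings just stand in for real kernels:)

    #include <stdio.h>

    int main(void) {
        /* GCC/Clang builtin CPU feature detection; __builtin_cpu_init()
           populates the cached CPUID data the supports() checks read. */
        __builtin_cpu_init();

        if (__builtin_cpu_supports("avx512f"))
            puts("dispatch: AVX-512 kernel");
        else if (__builtin_cpu_supports("avx2"))
            puts("dispatch: AVX2 kernel");
        else
            puts("dispatch: scalar fallback");
        return 0;
    }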

adgjlsfhk1 5 months ago | parent | next [-]

It will be interesting to see what happens now that AMD is shipping good AVX-512. It really just makes Intel seem incompetent (especially since they're theoretically bringing AVX-512 back next year anyway).

ack_complete 5 months ago | parent [-]

No proof, but I suspect that AMD's AVX-512 support played a part in Intel dumping AVX10/256 and changing plans back to shipping a full 512-bit consumer implementation again (we'll see when they actually ship it).

The downside is that AMD also increased the latency of all the formerly cheap integer vector ops. This removes one of the main advantages over NEON, which historically has had richer operations but worse latencies. That's one thing I hope Intel doesn't follow.

Also interesting is that Intel's E-core architecture is improving dramatically compared to the P-core, even surpassing it in some cases. For instance, Skymont finally has no penalty for denormals, a long-standing Intel weakness. Would not be surprising to see the E-core architecture take over at some point.
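
(For anyone who hasn't run into it: the usual workaround on older Intel cores is to set the FTZ/DAZ bits in MXCSR so denormals stop triggering microcode assists. A minimal sketch -- note this trades away strict IEEE 754 semantics:)

    #include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE (FTZ) */
    #include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE (DAZ) */
    #include <stdio.h>

    int main(void) {
        /* FTZ: denormal results are flushed to zero.
           DAZ: denormal inputs are treated as zero.
           Both avoid the assist penalty older Intel cores take on denormals. */
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
        _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);

        volatile float tiny = 1e-40f;        /* denormal in single precision */
        volatile float half = tiny * 0.5f;   /* treated as 0 * 0.5f with DAZ on */
        printf("%g\n", (double)half);        /* prints 0 */
        return 0;
    }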

adgjlsfhk1 5 months ago | parent [-]

> For instance, Skymont finally has no penalty for denormals, a long-standing Intel weakness.

Yeah, that's crazy to me. Intel has been so completely dysfunctional for the last 15 years. I feel like you couldn't have a clearer sign of "we have two completely separate teams that are competing with each other and aren't allowed to/don't want to talk to each other." It's just such a clear sign that the company is running around like a headless chicken.

whizzter 5 months ago | parent [-]

Not really; to me it seems more like Pentium 4 vs. Pentium M/Core again.

The downfall of the Pentium 4 was that they had been stuffing things into longer and longer pipes to keep up in the frequency race (with horrible branch-misprediction latencies as a result). They walked it all back by "resetting" to the P3/Pentium M/Core architecture and scaling up from there again.

Pipes today are even _longer_, and if the E-cores have shorter pipes at a similar frequency, then "regular" JS, Java, etc. code will be far more performant, even if you lose a bit of perf in the "performance" cases where people vectorize. (Did the HPC crowd steer Intel into a ditch AGAIN? Wouldn't be surprising!)

ack_complete 5 months ago | parent [-]

Thankfully, the P-cores are nowhere near as bad as the Pentium 4 was. The Pentium 4 had such a skewed architecture that it was frustrating to optimize for. Not only was the branch misprediction penalty long, but all the common methods of doing branchless logic, like conditional moves, were also slow. It also had a slow shifter, such that small left shifts were actually faster as sequences of adds, which I hadn't needed to do since the 68000 and 8086. And it had an annoying L1 cache with 64K aliasing penalties (guess which popular OS allocates all virtual memory, particularly thread stacks, at 64K alignment...).
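
(To give a flavor of what that meant in practice, here's roughly the kind of thing P4-targeted code ended up doing -- purely illustrative, not something you'd write for a modern core:)

    #include <stdint.h>
    #include <stdio.h>

    /* The P4's shifter was slow enough that a small constant left shift
       could be cheaper as a chain of adds. */
    static uint32_t times4(uint32_t x) {
        uint32_t t = x + x;   /* x << 1 */
        return t + t;         /* x << 2 */
    }

    /* cmov was slow on the P4 too; one alternative was a comparison-mask
       select, though that wasn't always a win either. */
    static uint32_t select_nz(uint32_t cond, uint32_t a, uint32_t b) {
        uint32_t mask = 0u - (cond != 0);   /* all-ones if cond, else 0 */
        return (a & mask) | (b & ~mask);
    }

    int main(void) {
        printf("%u %u\n", times4(10), select_nz(1, 7, 9));  /* 40 7 */
        return 0;
    }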

The P-cores have their warts, but are still much more well-rounded than the P4 was.

ezekiel68 5 months ago | parent | prev [-]

You mentioned "initial downclocking effects", but (for posterity) I want to emphasize that on Ice Lake (2020, Sunny Cove cores) and later Intel processors, the downclocking is really a nothingburger. The fusing-off debacle in desktop CPU families like Alder Lake that you mentioned definitely killed the momentum, though.

I'm not sure why OS kernels couldn't have become partners in CPU capability queries (where a program starting execution could request a CPU core with capability 'X', such as AVX-512F, for example) -- but without that, the whole P-core/E-core hybrid concept was DOA for any capability that wasn't a least-common denominator. If I had to guess, marketing got ahead of engineering and testing on that one.
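
(Absent that kind of kernel support, the only way for userspace to cope was to probe each core itself. A rough Linux-only sketch of that dance, using GCC/Clang's cpuid.h helper and skipping the XGETBV/OS-enable check for brevity:)

    #define _GNU_SOURCE
    #include <sched.h>      /* sched_setaffinity, cpu_set_t */
    #include <unistd.h>     /* sysconf */
    #include <cpuid.h>      /* __get_cpuid_count (GCC/Clang) */
    #include <stdio.h>

    /* AVX-512F is reported in CPUID leaf 7, subleaf 0, EBX bit 16. */
    static int this_core_has_avx512f(void) {
        unsigned eax, ebx, ecx, edx;
        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
            return 0;
        return (ebx >> 16) & 1;
    }

    int main(void) {
        long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
        for (long i = 0; i < ncpu; i++) {
            cpu_set_t set;
            CPU_ZERO(&set);
            CPU_SET((int)i, &set);
            /* Pin to core i so CPUID reports that core's features. */
            if (sched_setaffinity(0, sizeof(set), &set) != 0)
                continue;
            printf("cpu %ld: AVX-512F %s\n", i,
                   this_core_has_avx512f() ? "yes" : "no");
        }
        return 0;
    }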

ack_complete 5 months ago | parent [-]

Sure, but any core-wide downclocking effect at all is annoying for autovectorization, since a small local win easily turns into a global loss. That's why compilers have "prefer vector width" tuning parameters, so autovectorization can be dialed down to avoid 512-bit or even 256-bit ops.
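
(Concretely, that's the -mprefer-vector-width knob in GCC/Clang -- a loop like the one below can be built with AVX-512 enabled but the autovectorizer capped at 256-bit vectors; the exact default width depends on the -mtune target:)

    /* Build with, e.g.:
     *   cc -O3 -march=skylake-avx512 -mprefer-vector-width=256 -c saxpy.c
     * AVX-512 instructions stay available, but the autovectorizer prefers
     * 256-bit (ymm) operations over 512-bit (zmm) ones. */
    void saxpy(float *restrict y, const float *restrict x, float a, int n) {
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }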

This is also the same reason that having AVX-512 only on the P-cores wouldn't have worked, even with thread director support. It would only take one small routine in a common location to push most threads off the P-cores.

I'm of the opinion that Intel's hybrid P/E-core architecture has been mostly useless anyway and only good for winning benchmarks. My current CPU has a 6P+4E configuration, and the scheduler hardly uses the E-cores at all unless forced; plus, performance was better and more stable with the E-cores disabled.