my123 2 hours ago

SME2 is restricted in scope to matrix multiply workloads and isn't really designed for anything else.

The point of streaming SVE is to have a way to pre/post process data on the way in or out of a matrix multiply.

A list I keep of chips that support the various levels of SVE:

SVE(1):

- Fujitsu A64FX
- AWS Graviton3

SVE2:

- Snapdragon X2, 8/8 Elite Gen 5 and later
- MediaTek Dimensity 9000 and later
- NVIDIA Tegra Thor and later, NVIDIA "N1" or later (GB10 is an "N1x" SKU)
- Samsung Exynos 2200 or later
- AWS Graviton4, Microsoft Cobalt 100, Google Axion (and newer chips)
- CIX P1

SME(1) instead of SME2:

- Snapdragon X2, 8/8 Elite Gen 5

SME2:

- Apple M4, A18 and later
- Samsung Exynos 2600
- MediaTek Dimensity 9500

Note that the Snapdragon 8/8 Elite Gen 5 and X2 support sve2 but not svebitperm.
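If you want to check which of these a given machine reports, here is a rough Python sketch for arm64 Linux. It assumes the kernel prints the flags on the "Features" line of /proc/cpuinfo under the names sve, sve2, svebitperm, sme and sme2. The helper names and the SAMPLE text are made up for illustration, and SAMPLE is not a dump from any real chip.

```python
from pathlib import Path

# Flag names as they appear on the "Features" line of /proc/cpuinfo
# on arm64 Linux (assumption: a kernel recent enough to know about SME2).
FLAGS_OF_INTEREST = ("sve", "sve2", "svebitperm", "sme", "sme2")


def parse_features(cpuinfo_text):
    """Return the set of flags from the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def report(flags):
    """Map each flag of interest to True/False depending on presence."""
    return {name: name in flags for name in FLAGS_OF_INTEREST}


# Hypothetical cpuinfo excerpt, used only when /proc/cpuinfo is unavailable.
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd sve sve2 sme\n"

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else SAMPLE
    for name, present in report(parse_features(text)).items():
        print(f"{name:12s} {'yes' if present else 'no'}")
```

Matching on whole whitespace-separated tokens matters here: a substring test for "sve2" would not tell you anything about svebitperm, which is reported as its own flag.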