Remix.run Logo
dataflow 3 days ago

> Wouldn’t that be suboptimal if/when CPUs that support 1024-bit vectors come along?

Is that likely or on anyone's roadmap? It makes a little less sense than 512 bits, at least for Intel, since their cache lines are 64 bytes i.e. 512 bits. Any more than that and they'd have to mess with multiple cache lines all the time, not just on unaligned accesses. And they'd have to support crossing more than 2 cache lines on unaligned accesses. They increase the cache line size too, but that seems terrible for compatibility since a lot of programs assume it's a compile time constant (and it'd have performance overhead to make it a run-time value). Somehow it feels like this isn't the way to go, but hey, I'm not a CPU architect.