CannotCarrot 4 days ago

I think there's a generic C fallback, which can also serve as a baseline. But for the big (targeted) architectures, there's one handwritten assembly version per arch.

faluzure 4 days ago

Yup.

On startup, it runs cpuid and assigns each operation the best-suited function pointer for that architecture.
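To make the startup dispatch concrete, here is a minimal sketch of the pattern, not ffmpeg's actual code. Every name in it (`DSPContext`, `add_bytes`, `CPU_FLAG_*`, `dsp_init`, `detect_cpu_flags`) is made up for illustration, and the "fast" variant is just unrolled C standing in for a handwritten assembly routine. Feature detection uses the GCC/Clang `__builtin_cpu_supports` builtin on x86 and reports no features elsewhere, so the C fallback gets picked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits for this sketch; not a real library's flag names. */
enum { CPU_FLAG_SSE4 = 1u << 0, CPU_FLAG_AVX = 1u << 1 };

/* One function pointer per operation, filled in once at startup. */
typedef struct DSPContext {
    void (*add_bytes)(uint8_t *dst, const uint8_t *src, size_t n);
} DSPContext;

/* Generic C fallback: always available, doubles as the baseline. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Stand-in for a handwritten assembly version: plain C, unrolled by four. */
static void add_bytes_unrolled(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = (uint8_t)(dst[i]     + src[i]);
        dst[i + 1] = (uint8_t)(dst[i + 1] + src[i + 1]);
        dst[i + 2] = (uint8_t)(dst[i + 2] + src[i + 2]);
        dst[i + 3] = (uint8_t)(dst[i + 3] + src[i + 3]);
    }
    for (; i < n; i++) /* leftover tail bytes */
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Query the CPU once. Returns 0 (no features) on non-x86 or other compilers. */
static unsigned detect_cpu_flags(void)
{
    unsigned flags = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1")) flags |= CPU_FLAG_SSE4;
    if (__builtin_cpu_supports("avx"))    flags |= CPU_FLAG_AVX;
#endif
    return flags;
}

/* Start from the C fallback, then override with the best match for the flags. */
static void dsp_init(DSPContext *c, unsigned flags)
{
    assert(c != NULL);
    c->add_bytes = add_bytes_c;
    if (flags & (CPU_FLAG_SSE4 | CPU_FLAG_AVX))
        c->add_bytes = add_bytes_unrolled;
}
```

At startup you'd call `dsp_init(&ctx, detect_cpu_flags())` once, and from then on every caller just goes through `ctx.add_bytes(dst, src, n)` without caring which variant was selected.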

In addition to feature checks like ‘supports avx’ or ‘supports sse4’, some operations even have more specific checks like ‘is a fifth-generation Celeron’. In that case the optimization was tuned around that CPU's cache architecture, iirc.
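For the "is this one exact CPU" kind of check, a rough sketch of the idea (again, not what ffmpeg actually does) is to decode the family and model out of the signature that CPUID leaf 1 returns in EAX and compare against a known pair. The helper names here are made up, the decoding follows the extended family/model rule Intel documents, and a real check would also compare the vendor string, which is skipped here.

```c
#include <stdint.h>

/* Decoded CPU signature; struct and field names are hypothetical. */
typedef struct CpuSignature {
    unsigned family;
    unsigned model;
} CpuSignature;

/* Decode the raw EAX value returned by CPUID leaf 1. */
static CpuSignature decode_signature(uint32_t eax)
{
    unsigned base_model  = (eax >> 4)  & 0xF;
    unsigned base_family = (eax >> 8)  & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    CpuSignature sig;

    /* The extended family is only added when the base family is 0xF. */
    sig.family = (base_family == 0xF) ? base_family + ext_family : base_family;

    /* The extended model only counts for base family 6 or 0xF. */
    sig.model = (base_family == 0x6 || base_family == 0xF)
                    ? ((ext_model << 4) | base_model)
                    : base_model;
    return sig;
}

/* Returns 1 if the signature matches one exact family/model pair, else 0. */
static int is_exact_model(uint32_t eax, unsigned family, unsigned model)
{
    CpuSignature sig = decode_signature(eax);
    return sig.family == family && sig.model == model;
}
```

A dispatcher could then pick a cache-tuned variant only when `is_exact_model(...)` returns 1 and otherwise fall back to the plain feature-flag choice.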

Source: I did some dirty things with Chrome's Native Client and ffmpeg 10 years ago.