adev_ 3 days ago

I have applied a subset of these techniques in a C++ tokenizer for a language syntactically similar to Swift: no inline assembly, no intrinsics, and no SWAR, but reduced branching, cache optimization, and SIMD parsing with explicit vectorization.
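To give an idea of what "reduced branching" can look like, here is a minimal sketch (C++17), not my actual tokenizer. The character classes, the table, and the `count_ident_runs` function are hypothetical names made up for illustration. It classifies bytes through a 256-byte lookup table instead of an if/else chain, and it does not attempt the SIMD part.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical character classes, for illustration only.
enum CharClass : std::uint8_t { kOther = 0, kIdent = 1, kSpace = 2 };

// Build a 256-entry classification table (256 bytes, so it stays cache-resident).
static std::array<std::uint8_t, 256> make_table() {
    std::array<std::uint8_t, 256> t{};  // everything starts as kOther
    for (int c = 'a'; c <= 'z'; ++c) t[c] = kIdent;
    for (int c = 'A'; c <= 'Z'; ++c) t[c] = kIdent;
    for (int c = '0'; c <= '9'; ++c) t[c] = kIdent;
    t['_'] = kIdent;
    t[' '] = kSpace;
    t['\t'] = kSpace;
    t['\n'] = kSpace;
    t['\r'] = kSpace;
    return t;
}

static const std::array<std::uint8_t, 256> kTable = make_table();

// Count runs of identifier-like characters. The loop body has no
// data-dependent if/else: the class comes from a table lookup and the
// "a run starts here" condition is added as a 0/1 value.
std::size_t count_ident_runs(std::string_view src) {
    std::size_t runs = 0;
    std::uint8_t prev = kOther;
    for (unsigned char c : src) {
        const std::uint8_t cur = kTable[c];
        runs += static_cast<std::size_t>((cur == kIdent) & (prev != kIdent));
        prev = cur;
    }
    return runs;
}
```

For example, `count_ident_runs("let x = foo_1 + 42")` counts `let`, `x`, `foo_1`, and `42`. A real tokenizer would emit tokens rather than count runs, but the idea of replacing per-character branches with table lookups and arithmetic is the same.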

I get:

- ~4 MLOC/sec/core on a laptop

- ~8-9 MLOC/sec/core on a modern AMD server-grade CPU with AVX-512.

So yes, it is definitely possible.