menaerus | a day ago
> LEB128 is slow and there’s no way around that.

Actually there is: you can exploit data parallelism through SIMD to replace the logarithmic complexity with near-constant complexity. The classic approach is indeed very slow and unfriendly to the CPU.
Const-me | a day ago
> Actually there is - you can exploit the data parallelism

That doesn’t help much when you’re streaming the bytes, like many parsers or de-serializers do. You have to read bytes from the source stream one by one, because each next byte might be the last one of the integer being decoded.

You could work around that with a custom buffering adapter. It's hard to do correctly, it costs performance, and in some cases it's impossible: when the encoded integers are arriving in real time from the network, trying to read the next few bytes into a RAM buffer may block.

With MKV variable integers, you typically know the encoded length from just the first byte. In the very rare case of huge uint64 numbers, you need 2 bytes for that. Once you know the encoded length, you can read the rest of the encoded bytes with a single read call.

Another thing: unless you are running on a modern AMD64 CPU which supports the PDEP/PEXT instructions (part of the BMI2 ISA extension), it’s expensive to split or merge the input bits to or from the groups of 7 bits in every encoded byte. MKV variable integers don’t do that. They only need a bit scan and a byte swap, and both instructions are very cheap and available on all modern mainstream processors, including ARM.
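To make the streaming difference concrete, here is a rough Python sketch of both decoders. The names `read_leb128`, `read_ebml_vint` and `CountingReader` are made up for illustration. The second function assumes the standard EBML-style layout, where the number of leading zero bits in the first byte gives the total length of 1 to 8 bytes; it does not cover the longer uint64 form mentioned above.

```python
import io


class CountingReader:
    """Hypothetical helper: wraps a byte stream and counts read() calls."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)
        self.calls = 0

    def read(self, n):
        self.calls += 1
        return self._stream.read(n)


def read_leb128(stream):
    """Decode an unsigned LEB128 integer; needs one read per encoded byte."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a LEB128 integer")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift  # merge the low 7 bits
        if byte & 0x80 == 0:              # continuation bit clear: last byte
            return result
        shift += 7


def read_ebml_vint(stream):
    """Decode an EBML-style variable integer of 1..8 bytes (assumed layout)."""
    first = stream.read(1)
    if not first:
        raise EOFError("stream ended before a variable integer")
    b0 = first[0]
    if b0 == 0:
        raise ValueError("lengths above 8 bytes are not handled in this sketch")
    length = 9 - b0.bit_length()          # leading zero bits + 1 (the bit scan)
    rest = stream.read(length - 1) if length > 1 else b""
    if len(rest) != length - 1:
        raise EOFError("stream ended inside a variable integer")
    marker = 1 << (8 - length)            # length marker bit in the first byte
    head = b0 & (marker - 1)              # keep only the bits below the marker
    return int.from_bytes(bytes([head]) + rest, "big")


leb = CountingReader(bytes([0xE5, 0x8E, 0x26]))
print(read_leb128(leb), leb.calls)        # 624485 3

vint = CountingReader(bytes([0x10, 0x00, 0x01, 0x00]))
print(read_ebml_vint(vint), vint.calls)   # 256 2
```

The LEB128 decoder issues one read per byte because it cannot know where the integer ends. The EBML-style decoder issues at most two reads regardless of length, and the value bytes stay contiguous and big-endian, so there is no 7-bit regrouping.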