Const-me a day ago
> Actually there is - you can exploit the data parallelism

That doesn’t help much when you’re streaming the bytes, as many parsers and deserializers do. You have to read bytes from the source stream one at a time, because any byte might be the last one of the integer being decoded. You could work around that with a custom buffering adapter, but that is hard to do correctly, costs performance, and in some cases is impossible: when the encoded integers arrive in real time from the network, trying to read the next few bytes into a RAM buffer may block.

With MKV variable integers, you typically know the encoded length from just the first byte. In the very rare case of huge uint64 numbers, you need 2 bytes for that. Once you know the encoded length, you can read the rest of the encoded bytes with a single read call.

Another thing: unless you are running on a modern AMD64 CPU that supports the PDEP/PEXT instructions (part of the BMI2 ISA extension), it’s expensive to split and merge the input bits to and from the groups of 7 bits in every encoded byte. MKV variable integers don’t do that. They only need bit scan and byte swap; both instructions are available on all modern mainstream processors, including ARM, and are very cheap.
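A rough Python sketch of the "length from the first byte" idea, assuming the standard EBML layout where the number of leading zero bits in the first byte gives the total encoded length (1 to 8 bytes). `read_vint` is a hypothetical name, `int.bit_length` stands in for the hardware bit scan, and the rare 2-byte length case mentioned above is deliberately left out:

```python
import io


def read_vint(stream):
    """Read one EBML-style variable-length integer from a binary stream.

    Assumes lengths of 1 to 8 bytes; longer encodings are rejected.
    """
    first = stream.read(1)
    if not first:
        raise EOFError("stream ended before the length byte")
    b0 = first[0]
    if b0 == 0:
        # The length marker is not in the first byte; out of scope for this sketch.
        raise ValueError("encoded length above 8 bytes is not supported here")

    # Leading zero bits in the first byte, plus one, is the total length.
    length = 9 - b0.bit_length()

    # One read for all remaining bytes. A real socket may return fewer
    # bytes than requested, so production code would loop here.
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("stream ended in the middle of an integer")

    # Clear the marker bit, then append the remaining bytes big-endian.
    value = b0 & ((1 << (8 - length)) - 1)
    return (value << (8 * len(rest))) | int.from_bytes(rest, "big")


stream = io.BytesIO(bytes([0x81, 0x40, 0x02]))
print(read_vint(stream), read_vint(stream))
```

The point of the sketch is that each integer costs at most two reads (the length byte, then the remainder), regardless of how long the encoding is.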
menaerus a day ago
But you never serialize byte by byte over the network. You encode, let's say, 1000 varints and then send them out on the wire. On the other end, you may or may not know how many varints there are to unpack, but since you certainly have to know the length of the packet, you can start decoding on a stream of bytes. The largest 64-bit LEB128 number occupies 10 bytes, whereas the largest 32-bit one occupies 5 bytes, so we know the upper bound.
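To make the packet case concrete, here is a minimal Python sketch, assuming unsigned LEB128 and a packet that is already fully in memory. `encode_uleb128` and `decode_all_uleb128` are hypothetical helper names, and this is a plain scalar loop rather than a data-parallel decoder:

```python
def encode_uleb128(n):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)         # last group, continuation bit clear
            return bytes(out)


def decode_all_uleb128(packet, max_len=10):
    """Decode every unsigned LEB128 value in a packet of known length.

    max_len is the upper bound per value: 10 bytes for 64-bit, 5 for 32-bit.
    """
    values = []
    pos = 0
    while pos < len(packet):
        value = 0
        for i in range(max_len):
            if pos + i >= len(packet):
                raise ValueError("truncated varint at end of packet")
            byte = packet[pos + i]
            value |= (byte & 0x7F) << (7 * i)
            if byte < 0x80:
                pos += i + 1
                break
        else:
            raise ValueError("varint longer than max_len bytes")
        values.append(value)
    return values


numbers = [0, 127, 128, 300, 2**32 - 1, 2**64 - 1]
packet = b"".join(encode_uleb128(n) for n in numbers)
print(decode_all_uleb128(packet) == numbers)
```

Because the packet length is known, the decoder never has to guess whether another byte is coming; it only has to enforce the per-value upper bound.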