| ▲ | boricj 20 minutes ago | |
I'm working on a C++ library at work that binds wire data and application data through token and model layers, which includes among other things a fair amount of tokenizers/composers for various formats (JSON, CBOR, BSON, CSV...). This looks neat, but if encoding/decoding performance is important, payload size isn't and the integer is bounded, I would just put a fixed-size integer into the payload as-is. LEB128 (and JSON for that matter) can encode integer values of arbitrary length. This doesn't, which may or may not be important but it's different. I'll admit that I do not do any cryptographic work with my library and therefore canonical representations aren't a huge concern in my use-cases. I merely provide various configurable limits (max value length, max depth, max items per collection) in an effort to prevent infinitely long documents from hogging my tokenizers indefinitely. | ||