k_bx 2 days ago

To piggyback on this, is anyone using flatbuffers? They solve the same problem as Cap'n Proto and are a basis of Arrow.

I've used them, but too long ago to know their current state.

plasticeagle 2 days ago | parent | next [-]

We use flatbuffers extensively in our message passing framework, and they are extremely fast if you take care with your implementation. They have a few features that make them especially useful for us:

1) The flatbuffer parser can be configured at runtime from a schema file, so our message passing runtime does not need to know about any schemas at build time. It reads the schema files at startup, and is henceforth capable of translating messages to and from JSON when required. It's also possible to determine at runtime whether two schemas are compatible.

2) Messages can be re-used. For our high-rate messages, we build a message and then modify it to send again, rather than building it from scratch each time.

3) Zero decode overhead - there is often no need to deserialise messages - so we can avoid copying the data therein.

The flatbuffer compiler is also extremely fast, which is nice at build time.
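
As a rough illustration of points 2 and 3, here is a minimal Python sketch of reusing one buffer for repeated sends and reading a field straight out of it. The layout, field names and offsets are assumptions made up for this example; this is not the FlatBuffers wire format or API, only the general idea of patching and reading a flat buffer in place.

```python
import struct

# Hypothetical fixed layout (NOT the FlatBuffers wire format):
# uint32 sequence, float64 timestamp, float32 x, float32 y.
# "<" means little-endian with no padding, so the size is 4 + 8 + 4 + 4 = 20.
LAYOUT = struct.Struct("<Idff")
SEQ_OFFSET = 0
X_OFFSET = 12  # 4 bytes (uint32) + 8 bytes (float64)


def build_message(seq, timestamp, x, y):
    """Build the message once into a mutable buffer."""
    buf = bytearray(LAYOUT.size)
    LAYOUT.pack_into(buf, 0, seq, timestamp, x, y)
    return buf


def update_in_place(buf, seq, x):
    """Patch only the fields that changed instead of rebuilding the message."""
    struct.pack_into("<I", buf, SEQ_OFFSET, seq)
    struct.pack_into("<f", buf, X_OFFSET, x)


def read_x(buf):
    """Read one field directly from the buffer, without decoding the rest."""
    return struct.unpack_from("<f", buf, X_OFFSET)[0]


msg = build_message(1, 1000.0, 0.5, 2.0)
view = memoryview(msg)        # a view onto the same bytes, no copy
update_in_place(msg, 2, 1.5)  # reuse the buffer for the next send
print(read_x(view))           # the view sees the patched value
```

The receiver side is the same `read_x` call on whatever bytes arrive, which is where the "no deserialise step" saving comes from.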

elcritch 2 days ago | parent | prev | next [-]

Used them before and they're ok. They were missing some important features like sum types. The code output was a pain, but targeted a few languages. My suspicion is that Cap'n Proto would be technically superior but less documented than flatbuffers.

However, my preference is to use something like MsgPack or CBOR with compile-time reflection to do serde directly into types. You can design the types to require minimal allocations and to parse messages within a few dozen nanoseconds. That means doing things like using static char arrays for strings. It wastes a bit of space but it can be very fast. Also, the space you save by skipping 64-bit pointers can hold a lot of shorter text fields.
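
A small Python sketch of the "static char arrays for strings" idea, using `ctypes` to describe a fixed-size record. The `Reading` type and its fields are hypothetical, and this shows only the layout trick (inline strings, no pointers, no per-field allocation); it does not do MsgPack/CBOR decoding or compile-time reflection, and Python will not get anywhere near the nanosecond figures above.

```python
import ctypes


class Reading(ctypes.Structure):
    """Hypothetical record: every field lives inline, strings included.

    Fields are ordered so natural alignment adds no padding:
    8 (double) + 4 (uint32) + 12 (char array) = 24 bytes, native byte order.
    """
    _fields_ = [
        ("value", ctypes.c_double),
        ("sensor_id", ctypes.c_uint32),
        ("name", ctypes.c_char * 12),  # fixed-size string, NUL padded
    ]


def encode(sensor_id: int, name: str, value: float) -> bytes:
    """Serialise by copying the struct's raw bytes."""
    record = Reading(value, sensor_id, name.encode("utf-8"))
    return bytes(record)


def decode(payload: bytes) -> Reading:
    """Map a payload of exactly the right size onto the struct layout."""
    if len(payload) != ctypes.sizeof(Reading):
        raise ValueError("unexpected payload size")
    return Reading.from_buffer_copy(payload)


payload = encode(7, "temp", 21.5)
reading = decode(payload)
print(reading.sensor_id, reading.name, reading.value)
```

The cost is the one mentioned above: a 4-byte name still occupies all 12 bytes.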

That said, I should wrap or port this UPB to Nim. It'd be a nice alternative if it's really as fast as claimed. Though how it handles allocating would be the clincher.

discreteevent 2 days ago | parent [-]

> They were missing some important features like sum types.

They support unions now. I haven't had any trouble representing anything including recursive structures.

> The code output was a pain, but targeted a few languages.

You do need a day or so to get used to the code. It's a pointer-based system with a 'flat memory'. Order of operations matters. If you have a parent and a child you need to write the child first, obtain a pointer to it and only then create/write the parent containing the pointer to the child. Once you get used to this it goes quickly. The advantage is that you don't have to create an intermediate copy in memory when reading (as you do with protobuf) and you can read any particular part by traversing pointers without having to load the rest of the data into memory.
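
The child-first ordering can be sketched with a toy builder in Python. `ToyBuilder` and its record layout are invented for this illustration and are much simpler than what FlatBuffers actually emits; the point is only that the parent needs the child's offset, so the child has to be written first.

```python
import struct


class ToyBuilder:
    """Append-only buffer where objects refer to each other by offset.

    Hypothetical layout for illustration only, not the FlatBuffers format.
    """

    def __init__(self):
        self.buf = bytearray()

    def write_child(self, value: int) -> int:
        """Write a child (one int32) and return its offset in the buffer."""
        offset = len(self.buf)
        self.buf += struct.pack("<i", value)
        return offset

    def write_parent(self, tag: int, child_offset: int) -> int:
        """Write a parent: an int32 tag plus a uint32 'pointer' to the child."""
        offset = len(self.buf)
        self.buf += struct.pack("<iI", tag, child_offset)
        return offset


def read_child_value(buf, parent_offset: int):
    """Follow the parent's pointer to the child; nothing else is decoded."""
    tag, child_offset = struct.unpack_from("<iI", buf, parent_offset)
    (value,) = struct.unpack_from("<i", buf, child_offset)
    return tag, value


builder = ToyBuilder()
child = builder.write_child(42)        # child first, so its offset is known
parent = builder.write_parent(7, child)
print(read_child_value(builder.buf, parent))
```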

crabbone 2 days ago | parent | prev | next [-]

Neither of them is well designed or well thought through. They address some cosmetic issues of Protobuf, but don't really deal with the major issues. So you could say they are slightly better, but the original was bad enough that being slightly better doesn't stop them from being disqualified too.

vvanders 2 days ago | parent [-]

Flatbuffers lets you directly mmap from disk, and that trick alone makes it really good for use cases that can take advantage of it (fast access to read-only data). If you're clever enough to tune the ordering of fields you can give it good cache locality and really make it fly.

We used to store animation data in mmapped flatbuffers at a previous gig and it worked really well. The kernel would happily prefetch on access and page out under pressure, so we could have tens of MB of animation data and only pay a couple hundred KB based on access patterns.
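
For anyone who hasn't used the mmap trick, a minimal Python sketch follows. The record format (a `uint32` frame id plus a `float32` value) and the file name are assumptions for the example, not the animation format described above and not the FlatBuffers layout. Only the pages that are actually touched need to be resident.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: uint32 frame id, float32 value (8 bytes).
RECORD = struct.Struct("<If")


def write_records(path, count):
    """Write `count` records back to back into a file."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, i * 0.5))


def read_record(path, index):
    """Map the file read-only and decode a single record at its offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # unpack_from reads straight from the mapping; the rest of the
            # file is never copied into a Python object.
            return RECORD.unpack_from(mm, index * RECORD.size)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frames.bin")
    write_records(path, 1000)
    print(read_record(path, 750))
```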

pantalaimon 2 days ago | parent | prev [-]

To piggyback on this, is anyone using CBOR? It solves the same problem as Protobuf, but is more like JSON in that a schema is not required.
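
To show what "no schema required" means in practice, here is a minimal Python sketch of a CBOR encoder for a small subset of types (integers, booleans, None, byte and text strings, lists, dicts). It is written for illustration and is not a library API; floats, tags and indefinite-length items are left out. Each item carries its own type and length in the leading byte, so a reader needs no schema to walk the data.

```python
import struct


def _head(major: int, n: int) -> bytes:
    """Encode a major type (0-7) and an unsigned argument n."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    if n < 2**64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", n)
    raise ValueError("integer too large for this sketch")


def encode(obj) -> bytes:
    """Encode a subset of Python values as CBOR."""
    if obj is False:
        return b"\xf4"
    if obj is True:
        return b"\xf5"
    if obj is None:
        return b"\xf6"
    if isinstance(obj, int):
        # Major type 0 for non-negative, major type 1 stores -1 - n.
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, bytes):
        return _head(2, len(obj)) + obj
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(encode(k) + encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError(f"unsupported type: {type(obj).__name__}")


print(encode({"a": 1, "b": [2, 3]}).hex())
```

In real code you would reach for an existing CBOR library rather than hand-rolling this.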

vamega 2 days ago | parent [-]

Amazon uses CBOR extensively. Most AWS services by now should support being called using CBOR. The protocol they're using is publicly documented at: https://smithy.io/2.0/additional-specs/protocols/smithy-rpc-...

The services serve both CBOR and other protocols simultaneously.