AceJohnny2 3 days ago

Off-topic, but is anyone using CapnProto, the follow-up project from ProtoBuf's former maintainer (kentonv around here)?

https://capnproto.org

If so, how does it compare in practice?

(what does Cloudflare Workers use?)

necubi 3 days ago

Cloudflare is almost entirely run on Cap'n Proto, including the entire workers platform

kentonv 3 days ago

The Workers platform uses Cap'n Proto extensively, as one might expect (with me being the author of Cap'n Proto and the lead dev on Workers). Some other parts of Cloudflare use it (including the logging pipeline, which used it before I even joined), but there are also many services using gRPC or just JSON. Each team makes their own decisions.

I have it on my TODO list to write a blog post about Workers' use of Cap'n Proto. By far the biggest wins for us are from the RPC system -- the serialization honestly doesn't matter so much.

That said, the ecosystem around Cap'n Proto is obviously lacking compared to protobuf. For the Cloudflare Workers Runtime team specifically, the fact that we own it and can make any changes we need to balances this out. But I'm hesitant to recommend it to people at other companies, unless you are eager to jump into the source code whenever there's a problem.

anonymoushn 2 days ago

A while ago I talked to some team that was planning a migration to GraphQL, long after this was generally thought to be a bad idea. The lead seemed really attached to the "composable RPCs" aspect of the thing, and at the time it seemed like nothing else offered this. It would be quite cool if capnproto became a more credible option for this sort of situation. At the time users could read about the rpc composition/promise passing/"negative latency" stuff, but it was not quite implemented.

stouset 2 days ago

This makes me really sad. Protobufs are not all that great, but they were there first and “good enough”.

It’s frustrating when we can’t have nice things because a mediocre Google product has sucked all the air out of the room. I’m not only talking about Protobufs here either.

motorest 2 days ago

Thank you for posting here. Always insightful, always a treat.

k_bx 2 days ago

To piggyback on this, is anyone using flatbuffers? They solve the same problem as CapnProto and are a basis of Arrow.

I've used them, but too long ago to know their current state.

plasticeagle 2 days ago

We use flatbuffers extensively in our message passing framework, and they are extremely fast if you take care with your implementation. They have a few features that make them especially useful for us:

1) The flatbuffer parser can be configured at runtime from a schema file, so our message passing runtime does not need to know about any schemas at build time. It reads the schema files at startup, and is henceforth capable of translating messages to and from JSON when required. It's also possible to determine at runtime whether two schemas will be compatible.

2) Messages can be re-used. For our high-rate messages, we build a message and then modify it to send again, rather than building it from scratch each time.

3) Zero decode overhead - there is often no need to deserialise messages - so we can avoid copying the data therein.

The flatbuffer compiler is also extremely fast, which is nice at build time.
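
A rough illustration of point 2 (re-using a message instead of rebuilding it), written as a toy Python sketch. This is not the FlatBuffers API or wire format: the fixed layout and the names `LAYOUT`, `build_message`, `update_in_place`, and `read_message` are hypothetical, chosen only to show a buffer being allocated once and then overwritten in place before each send.

```python
import struct

# Hypothetical fixed layout: u32 sequence number, f64 value, little-endian,
# no padding. This stands in for a schema-defined message.
LAYOUT = struct.Struct("<Id")


def build_message(seq, value):
    """Allocate the buffer once and fill in the initial field values."""
    buf = bytearray(LAYOUT.size)
    LAYOUT.pack_into(buf, 0, seq, value)
    return buf


def update_in_place(buf, seq, value):
    """Overwrite the fields of an existing message without reallocating."""
    LAYOUT.pack_into(buf, 0, seq, value)


def read_message(buf):
    """Read the fields straight out of the buffer."""
    return LAYOUT.unpack_from(buf, 0)


if __name__ == "__main__":
    msg = build_message(1, 0.5)
    for seq in range(2, 5):
        update_in_place(msg, seq, seq * 0.5)  # same bytearray every time
        print(read_message(msg))
```

This only works because every field has a fixed size and position; variable-length fields would need more care.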

elcritch 2 days ago

Used them before and they're ok. They were missing some important features like sum types. The code output was a pain, but targeted a few languages. My suspicion is that Cap'n Proto would be technically superior but less documented than flatbuffers.

However, my preference is to use something like MsgPack or CBOR with compile time reflection to do serde directly into types. You can design the types to require minimal allocations and to parse messages within a few dozen nanoseconds. That means doing things like using static char arrays for strings. It wastes a bit of space but it can be very fast. Also, the space that would otherwise go to 64-bit pointers can hold a lot of shorter text fields.

That said, I should wrap or port this UPB to Nim. It'd be a nice alternative if it's really as fast as claimed. Though how it handles allocating would be the clincher.

discreteevent 2 days ago

> They were missing some important features like sum types.

They support unions now. I haven't had any trouble representing anything including recursive structures.

> The code output was a pain, but targeted a few languages.

You do need a day or so to get used to the code. It's a pointer-based system with a 'flat memory'. Order of operations matters. If you have a parent and a child you need to write the child first, obtain a pointer to it and only then create/write the parent containing the pointer to the child. Once you get used to this it goes quickly. The advantage is that you don't have to create an intermediate copy in memory when reading (like protobuf) and you can read any particular part by traversing pointers without having to load the rest of the data into memory.
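
To make the child-first ordering concrete, here is a toy sketch in Python. It is not the real FlatBuffers format or builder API (the real format uses vtables and relative offsets, among other differences); `FlatWriter`, `write_string`, `write_person`, and the reader functions are hypothetical names. The point is only that the parent record stores an offset to the child, so the child has to exist in the buffer before the parent is written, and that a reader can follow offsets to one field without decoding anything else.

```python
import struct


class FlatWriter:
    """Toy append-only buffer where records refer to each other by offset."""

    def __init__(self):
        self.buf = bytearray()

    def write_string(self, text):
        """Write a length-prefixed UTF-8 string and return its offset."""
        offset = len(self.buf)
        data = text.encode("utf-8")
        self.buf += struct.pack("<I", len(data)) + data
        return offset

    def write_person(self, name_offset, age):
        """Write a parent record holding the child's offset, then the age."""
        offset = len(self.buf)
        self.buf += struct.pack("<II", name_offset, age)
        return offset


def read_person_age(buf, person_offset):
    """Read one field directly; the name string is never touched."""
    return struct.unpack_from("<I", buf, person_offset + 4)[0]


def read_person_name(buf, person_offset):
    """Follow the stored offset from the parent to the child string."""
    (name_offset,) = struct.unpack_from("<I", buf, person_offset)
    (length,) = struct.unpack_from("<I", buf, name_offset)
    start = name_offset + 4
    return bytes(buf[start:start + length]).decode("utf-8")


if __name__ == "__main__":
    writer = FlatWriter()
    name = writer.write_string("Ada")        # child first
    person = writer.write_person(name, 36)   # then the parent that points to it
    print(read_person_name(writer.buf, person), read_person_age(writer.buf, person))
```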

crabbone 2 days ago

Neither of them is well designed or well thought through. They address some cosmetic issues of Protobuf, but don't really deal with the major issues. So, you could say they are slightly better, but the original was bad enough that being slightly better doesn't qualify them either.

vvanders 2 days ago

Flatbuffers lets you directly mmap from disk; that trick alone makes it really good for use cases that can take advantage of it (fast access of read-only data). If you're clever enough to tune the ordering of fields you can give it good cache locality and really make it fly.

We used to store animation data in mmapped flatbuffers at a previous gig and it worked really well. The kernel would happily prefetch on access and page out under pressure, so we could have tens of MB of animation data and only pay a couple hundred KB based on access patterns.
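
A minimal sketch of the mmap idea in Python, assuming a file of fixed-size records rather than an actual flatbuffer. The record layout and the names `RECORD`, `write_records`, and `read_record` are hypothetical. Only the pages that are actually touched need to be resident; nothing is parsed up front.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record: u32 frame id followed by an f32 value, little-endian.
RECORD = struct.Struct("<If")


def write_records(path, records):
    """Write the records back to back so record i starts at i * RECORD.size."""
    with open(path, "wb") as f:
        for frame_id, value in records:
            f.write(RECORD.pack(frame_id, value))


def read_record(path, index):
    """Map the file read-only and decode a single record at its offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return RECORD.unpack_from(mapped, index * RECORD.size)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "frames.bin")
        write_records(path, [(i, i * 0.25) for i in range(1000)])
        print(read_record(path, 500))  # touches only the page holding record 500
```

In a long-running process you would keep the mapping open instead of re-mapping per read; it is reopened here only to keep the sketch short.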

pantalaimon 2 days ago

To piggyback on this, is anyone using CBOR? It solves the same problem as Protobuf, but is more like JSON in that a schema is not required.
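
For a feel of what "no schema required" means here, below is a minimal hand-rolled encoder sketch in Python covering only a small subset of CBOR (integers, booleans, text strings, arrays, maps). The function names `_head` and `encode` are my own; in practice you would use an existing CBOR library. Each value carries its own type tag on the wire, which is why no schema is needed to decode it.

```python
import struct


def _head(major, n):
    """Encode a CBOR initial byte (major type in the top 3 bits) plus argument."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def encode(value):
    """Encode a small subset of Python values as CBOR bytes."""
    if isinstance(value, bool):          # must come before the int check
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        if value >= 0:
            return _head(0, value)       # major type 0: unsigned integer
        return _head(1, -1 - value)      # major type 1: negative integer
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data  # major type 3: text string
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return _head(5, len(value)) + body  # major type 5: map
    raise TypeError(f"unsupported type: {type(value).__name__}")


if __name__ == "__main__":
    print(encode({"a": 1}).hex())  # a1616101
```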

vamega 2 days ago

Amazon uses CBOR extensively. Most AWS services by now should support being called using CBOR. The protocol they're using is publicly documented at: https://smithy.io/2.0/additional-specs/protocols/smithy-rpc-...

The services serve both CBOR and other protocols simultaneously.

karel-3d 2 days ago

Yes, I am. It's fast and great, but the UX in Go is a bit annoying since you need to constantly check for errors on literally every set (to catch possible arena errors). So your code has even more `if err != nil {return nil, err}` than usual Go code.