jitl 7 days ago

Not widely used but I like Typical's approach

https://github.com/stepchowfun/typical

> Typical offers a new solution ("asymmetric" fields) to the classic problem of how to safely add or remove fields in record types without breaking compatibility. The concept of asymmetric fields also solves the dual problem of how to preserve compatibility when adding or removing cases in sum types.

rkagerer 6 days ago

More direct link to the juicy bit: https://github.com/stepchowfun/typical?tab=readme-ov-file#as...

An asymmetric field in a struct is considered required for the writer, but optional for the reader.
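A rough way to picture that, sketched in Python. The names `RequestOut`, `RequestIn`, `to`, and `subject` are made up for illustration and this is not Typical's generated code; the only point is that the writer-side type demands the field while the reader-side type tolerates its absence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestOut:
    """Writer-side view (hypothetical): the asymmetric field is mandatory."""
    to: str
    subject: str  # asymmetric: writers must always set it

    def serialize(self) -> dict:
        return {"to": self.to, "subject": self.subject}


@dataclass
class RequestIn:
    """Reader-side view (hypothetical): the same field is optional."""
    to: str
    subject: Optional[str]  # asymmetric: readers must handle None

    @classmethod
    def deserialize(cls, wire: dict) -> "RequestIn":
        # A missing "subject" is not an error on the reading side.
        return cls(to=wire["to"], subject=wire.get("subject"))
```

Constructing `RequestOut(to="a")` without `subject` fails with a `TypeError`, while `RequestIn.deserialize({"to": "a"})` succeeds with `subject=None`.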

sdenton4 6 days ago

That's a nice idea... But I believe the design direction of Protocol Buffers was to make everything `optional`, because `required` tends to bite you later when you realize it should actually have been optional.

bilkow 6 days ago

My understanding is that asymmetric fields provide a migration path in case that happens, as stated in the docs:

> Unlike optional fields, an asymmetric field can safely be promoted to required and vice versa.

> [...]

> Suppose we now want to remove a required field. It may be unsafe to delete the field directly, since then clients might stop setting it before servers can handle its absence. But we can demote it to asymmetric, which forces servers to consider it optional and handle its potential absence, even though clients are still required to set it. Once that change has been rolled out (at least to servers), we can confidently delete the field (or demote it to optional), as the servers no longer rely on it.
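The quoted migration path can be checked mechanically with a small model. This is only a sketch under my own simplifying assumptions: a field is one of `"required"`, `"asymmetric"`, `"optional"`, or `"absent"` (deleted), and during a rollout old and new code can each act as writer or reader. The function names are hypothetical.

```python
# Kinds for which the writer is forced to set the field.
WRITER_ALWAYS_SETS = {"required", "asymmetric"}
# Kinds for which the reader fails if the field is missing.
READER_NEEDS = {"required"}


def compatible(writer_kind: str, reader_kind: str) -> bool:
    """A reader is safe unless it needs a field the writer may omit."""
    return reader_kind not in READER_NEEDS or writer_kind in WRITER_ALWAYS_SETS


def safe_transition(old: str, new: str) -> bool:
    """During a rollout, every mix of old/new writer and reader must work."""
    kinds = (old, new)
    return all(compatible(w, r) for w in kinds for r in kinds)


for step in [("required", "asymmetric"), ("asymmetric", "absent"),
             ("asymmetric", "required"), ("required", "absent"),
             ("optional", "required")]:
    print(step, safe_transition(*step))
```

In this model, required to asymmetric and asymmetric to absent are both safe, while jumping straight from required to absent (or from optional to required) is not, which matches the two-step demotion the docs describe.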

yencabulator 4 days ago

> My understanding is that asymmetric fields provide a migration path in case that happens, as stated in the docs:

Only if you can assume you can churn through a generation of fresh data soonish and never read the old data again. For RPC, sure, but someone like Google has petabytes of stored protobufs, so they don't pretend they can upgrade all the writers.

sdenton4 4 days ago

....or we can just say that everything is optional always, and leave it to the servers instead of the protocol to handle irregularities.

summerlight 7 days ago

This seems interesting. Still not sure if `required` is a good thing to have (for persistent data like logs, you cannot really guarantee a field's presence without schema versioning baked into the file itself), but for intermediate wire use cases this will help.

cornstalks 7 days ago

I've never heard of Typical but the fact they didn't repeat protobuf's sin regarding varint encoding (or use leb128 encoding...) makes me very interested! Thank you for sharing, I'm going to have to give it a spin.

zigzag312 7 days ago

It looks similar to how vint64 lib encodes varints. Total length of varint can be determined via the first byte alone.
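For anyone who hasn't seen the prefix style, here is a rough Python sketch of one such scheme. The layout is an assumption for illustration (length stored as the number of trailing zero bits in the first byte, little-endian payload, a zero first byte meaning eight raw bytes follow) and is not guaranteed to match Typical's or vint64's exact byte format. The function names are made up.

```python
def prefix_varint_length(first_byte: int) -> int:
    """Total encoded length, read from the first byte alone."""
    if first_byte == 0:
        return 9  # marker byte followed by 8 raw payload bytes
    # Isolate the lowest set bit; its position (1-based) is the length.
    return (first_byte & -first_byte).bit_length()


def encode_prefix_varint(value: int) -> bytes:
    if not 0 <= value < 1 << 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    bits = value.bit_length()
    for n in range(1, 9):
        if bits <= 7 * n:  # n bytes carry 7*n payload bits
            tagged = (value << n) | (1 << (n - 1))
            return tagged.to_bytes(n, "little")
    return b"\x00" + value.to_bytes(8, "little")


def decode_prefix_varint(buf: bytes):
    """Return (value, bytes_consumed)."""
    n = prefix_varint_length(buf[0])
    if n == 9:
        return int.from_bytes(buf[1:9], "little"), 9
    return int.from_bytes(buf[:n], "little") >> n, n
```

The decoder never has to inspect continuation bits byte by byte: one look at `buf[0]` tells it how many bytes to load.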

haberman 7 days ago

I advocated for PrefixVarint (which seems equivalent to vint64) for WebAssembly, but it was decided against, in favor of LEB128: https://github.com/WebAssembly/design/issues/601

The recent CREL format for ELF also uses the more established LEB128: https://news.ycombinator.com/item?id=41222021

At this point I don't feel like I have a clear opinion about whether PrefixVarint is worth it, compared with LEB128.

zigzag312 7 days ago

Just remember that XML was more established than JSON for a long time.

kannanvijayan 4 days ago

Varint encoding is something I've peeked at in various contexts. My personal bias is towards the prefix-style, as it feels faster to decode and the segregation of the meta-data from the payload data is nice.

But the thing that tends to tip the scales is the fact that in almost all real-world cases, small numbers dominate, as a comment in the GitHub thread you linked points out.

The LEB128 fast-path is a single conditional with no data-dependencies:

  if ! (x & 0x80) { x }

Modern CPUs will characterize that branch really well and you'll pay almost zero cost for the fastpath which also happens to be the dominant path.

It's hard to beat.
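Spelled out as a full decoder, the fast path sits in front of the general loop. This is a minimal Python sketch of unsigned LEB128 decoding; the function name and the `(value, next_position)` return shape are my own choices, and it does no bounds or overflow checking.

```python
def decode_uleb128(buf: bytes, pos: int = 0):
    """Decode one unsigned LEB128 integer; return (value, next_position)."""
    x = buf[pos]
    # Fast path: high bit clear means the value fits in this single byte.
    if not (x & 0x80):
        return x, pos + 1

    # Slow path: accumulate 7 payload bits per byte, least significant first.
    result = x & 0x7F
    shift = 7
    pos += 1
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):  # high bit clear marks the last byte
            return result, pos
        shift += 7
```

For example, `decode_uleb128(bytes([0xE5, 0x8E, 0x26]))` returns `(624485, 3)`, and any byte below `0x80` returns immediately through the first branch.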

yencabulator 4 days ago

SQLite format equivalent:

  if x <= 240 { x }

while strictly improving all other aspects (at least IMHO)

https://sqlite.org/src4/doc/trunk/www/varint.wiki
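For reference, here is a Python sketch of that format as I read the linked page: a first byte of 0 to 240 is the value itself, 241 to 248 start a two-byte form, 249 starts a three-byte form, and 250 to 255 are followed by a 3 to 8 byte big-endian integer. The function names are made up, and this is an illustration rather than SQLite's own code.

```python
def sqlite4_varint_encode(v: int) -> bytes:
    if not 0 <= v < 1 << 64:
        raise ValueError("value must fit in an unsigned 64-bit integer")
    if v <= 240:
        return bytes([v])  # fast path: the byte is the value
    if v <= 2287:
        return bytes([(v - 240) // 256 + 241, (v - 240) % 256])
    if v <= 67823:
        return bytes([249, (v - 2288) // 256, (v - 2288) % 256])
    n = max(3, (v.bit_length() + 7) // 8)  # 3..8 payload bytes
    return bytes([247 + n]) + v.to_bytes(n, "big")


def sqlite4_varint_decode(buf: bytes):
    """Return (value, bytes_consumed)."""
    a0 = buf[0]
    if a0 <= 240:
        return a0, 1
    if a0 <= 248:
        return 240 + 256 * (a0 - 241) + buf[1], 2
    if a0 == 249:
        return 2288 + 256 * buf[1] + buf[2], 3
    n = a0 - 247  # 250 -> 3 payload bytes ... 255 -> 8 payload bytes
    return int.from_bytes(buf[1:1 + n], "big"), 1 + n
```

As with the prefix style, the first byte alone determines the total length, and the encoded bytes sort in the same order as the numbers they represent.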

zigzag312 7 days ago

This actually looks quite interesting.

sevensor 6 days ago

Seems like a lot of effort to avoid adding a message version field. I’m not a web guy, so maybe I’m missing the point here, but I always embed a schema version field in my data.

vouwfietsman 6 days ago

I get that.

The point is that it's hard to prevent asymmetry in message versions if you are working with many communicating systems. Let's say four services inter-communicate with some protocol. It is extremely annoying to impose a deployment order where the producer of a message type is the last to upgrade the message schema, as this causes unnecessary dependencies between the release trains of these services. At the same time, one cannot simply say: "I don't know this message version, I will disregard it" because in live systems this will mean the systems go out of sync, data is lost, stuff breaks, etc.

There are probably more issues I haven't mentioned, but long story short: in live, interconnected systems, it becomes important to have intelligent message versioning, i.e., a version number is not enough.

kiitos 2 days ago

> Let's say four services inter-communicate with some protocol, it is extremely annoying to impose a deployment order where the producer of a message type is the last to upgrade the message schema

i don't know how you arrived at this conclusion

the protocol is the unifying substrate, it is the source of truth, the services are subservient to the protocol, it's not the other way around

also it's not just like each service has a single version, each instance of each service can have separate versions as well!

what you're describing as "annoying" is really just "reality", you can't hand-wave away the problems that reality presents

1718627440 2 days ago

> one cannot simply say: "I don't know this message version, I will disregard it" because in live systems this will mean the systems go out of sync, data is lost, stuff breaks, etc.

You already need to deal with lost messages and rejected messages, so just treat this case the same way. If you have versions, surely you have code to deal with mismatches and, e.g., fall back to the older version.

sevensor 6 days ago

I think I see what you’re getting at? My mental model is client and server, but you’re implying a more complex topology where no one service is uniquely a server or a client. You’d like to insert a new version at an arbitrary position in the graph without worrying about dependencies or the operational complexity of doing a phased deployment. The result is that you try to maintain a principled, constructive ambiguity around the message schema, hence asymmetrical fields? I guess I’m still unconvinced and I may have started the argument wrong, but I can see a reasonable person doing it that way.

vouwfietsman 6 days ago

Yes, that's a big part, but even bigger is just the alignment of teams.

Imagine Team A is building features X, Y, and Z, and Team B is building T, U, and V.

One of those features in each team deals with messages; the others are unrelated. At some point in time, both teams have to deploy.

If you have to sync them up just to get the protocol to work, that's extra complexity in the already complex work of the teams.

If you can ignore this, great!

It becomes even more complex with rolling updates though: not all deployments of a service will have the new code immediately, because you want multiple instances online to scale on demand. This creates an immediate, necessary ambiguity in the question "which version does this service accept?", because it's not about the service anymore, but about the deployments.

sevensor 6 days ago

Ah, I see. Team A would like to deploy a new version of a service. It used to accept messages with schema S, but the new version accepts only S’ and not S. So the only thing you can do is define S’ so that it is ambiguous with S. Team B uses Team A’s service but doesn’t want to have to coordinate deployments with Team A.

I think the key source of my confusion was Team A not being able to continue supporting schema S once the new version is released. That certainly makes the problem harder.

vouwfietsman 5 days ago

Exactly!

vineyardmike 6 days ago

Idk, I generally think “magic numbers” are just extra effort. The main annoyance is adding if statements on the version number everywhere instead of checking whether the data field you need is present.

It also really depends on the scope of the issue. Protos really excel at “rolling” updates and continuous changes instead of fixed APIs. For example, MicroserviceA calls MicroserviceB, but the teams do deployments at different times of the week. Constantly bumping the version number for each change is annoying compared with just checking for the new feature, especially if you could have several active versions at a time.

It also frees you from actually propagating a single version number everywhere. If you own a bunch of API endpoints, you either need to put the version in the URL, which impacts every endpoint at once, or you need to put it in the request/response of every one.
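To make the contrast concrete, here is a tiny Python sketch with plain dicts standing in for decoded messages. The field names `schema_version` and `retry_after_ms` and the default value are invented for the example.

```python
DEFAULT_BACKOFF_MS = 1000  # hypothetical fallback


def backoff_by_version(msg: dict) -> int:
    # Version-gated: the reader has to know which version introduced
    # the field, and every sender has to propagate the version number.
    if msg.get("schema_version", 1) >= 2:
        return msg["retry_after_ms"]
    return DEFAULT_BACKOFF_MS


def backoff_by_presence(msg: dict) -> int:
    # Presence-gated: the reader only asks whether the field is there.
    return msg.get("retry_after_ms", DEFAULT_BACKOFF_MS)


old_sender = {"schema_version": 1}
new_sender = {"schema_version": 2, "retry_after_ms": 250}
print(backoff_by_version(old_sender), backoff_by_version(new_sender))
print(backoff_by_presence(old_sender), backoff_by_presence(new_sender))
```

Both functions agree on these two messages, but only the presence check keeps working for a sender that sets `retry_after_ms` without carrying any version number at all.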

sevensor 6 days ago

I think this is only a problem if you’re using a weak data interchange library that can’t use the schema number field to discriminate a union. Because you really shouldn’t have to write that if statement yourself.
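A minimal sketch of what that could look like in Python, with the version field selecting a decoder once at the boundary. The message types `OrderV1` and `OrderV2`, the `schema_version` key, and the `DECODERS` table are all hypothetical; a real interchange library would generate or derive this dispatch for you.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class OrderV1:
    item: str


@dataclass
class OrderV2:
    item: str
    quantity: int


Order = Union[OrderV1, OrderV2]

# One decoder per schema version; the version number is the union tag.
DECODERS: Dict[int, Callable[[dict], Order]] = {
    1: lambda wire: OrderV1(item=wire["item"]),
    2: lambda wire: OrderV2(item=wire["item"], quantity=wire["quantity"]),
}


def decode_order(wire: dict) -> Order:
    version = wire["schema_version"]
    try:
        decoder = DECODERS[version]
    except KeyError:
        raise ValueError(f"unsupported schema version: {version}") from None
    return decoder(wire)
```

Downstream code then matches on the concrete type (`isinstance(order, OrderV2)`) instead of scattering version comparisons around.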

atombender 6 days ago

I'm really hoping Typical will catch on, as I quite like the design. One important gap right now is the lack of Go and Python support.