Remix.run Logo
cubefox 3 hours ago

Most people don't know that there are fundamentally two different kinds of "union" types: "tagged unions" and "untagged unions". Now .NET is introducing tagged unions, but unfortunately they stick to the popular tradition of calling them just "unions", which greatly adds to the confusion.

To clarify, these tagged unions are fundamentally different from the untagged unions that can be found in languages like Typescript or Scala 3.

Tagged unions (also called "discriminated unions" or "sum types") are algebraic data types. They act as wrappers, via special constructors, for two (or more) disjoint types and values. So a tagged union acts like a bag that has two (or more) labelled ("tagged") slots, where each slot has exactly one type and exactly one of these slot can take a value.

Untagged unions are set theoretic (rather than algebraic) data types. They don't require wrapping via a special constructor. They behave like the logical OR. They are not disjoint slots of a separate construct. A variable or function x with the type "Foo | Bar" can be of type Foo, or Bar, or both. To access a Foo method on x, one has to first perform a typecheck for Foo, otherwise the compiler will refuse the method call (since x might only have the type Bar which would produce an exception). If a variable is of type A, it is also of type A|B ("A or B"). There are also intersection types (A&B) which indicate that something has both types rather than at least one, and complement/negation types (~A indicates that something is of any type except A). Though the latter are not implemented in any major language so far.

troad 2 hours ago | parent [-]

Underlyingly, all unions are the same concept, exposed by C's `union` keyword, namely overlapping memory (as opposed to `struct`s, which are sequential memory). You can add a discriminator tag to your unions, your call. If it is impossible for there to be ambiguity (e.g. you track the state somewhere else, and you're dealing with a large array), it may be redundant / memory inefficient to tag every item.

Having these tags be a language construct is just a DX feature on top of unions. A very handy one, but it doesn't make tagged and untagged unions spring from different theoretical universes. I enjoy the ADT / set-theoretic debate as much as the next PL nerd, but theory ought to conform to reality, not vice versa.