crdrost 2 days ago

Approximate expansion of the original claim, without direct endorsement:

Suppose you have an Order object that needs to track where some larger process stands in relation to three subtasks. We could imagine, say, that the Inventory department needs to set a physical object aside, then the Billing department needs to successfully charge for it, then the Shipping department needs to retrieve that physical object and ship it.

You start from a description of this as "one Order contains three Subtasks" where a Subtask contains the Go-style (Optional[Result], Optional[Error]) type. This architecture almost fits into a relational database, except that foreign key constraints are a bit funky if you shove everything into one nullable Result column. But let's just have the Result be some random JSON in the Subtasks table and let our relational purists weep.

Then you read this advice and you start to see that this allows for a lot of illegal states: a Subtask could contain both a result AND an error, or neither. You eventually decide that neither is an allowed state. So these are two optional fields with four possible combinations representing only 3 legal states, and they need to be factored into an enum: the enum is "Pending | Success[Result] | Failure[Error]".
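A minimal sketch of that refactoring, in Python for concreteness. The class names, the `from_pair` helper, and the use of `str` for errors are my own illustrative choices, not anything from the original claim. Python only enforces this at construction time rather than at compile time, so treat it as a picture of the shape of the type, not of the guarantee a stricter language would give you.

```python
from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Pending:
    """Neither a result nor an error yet."""


@dataclass(frozen=True)
class Success:
    result: Any


@dataclass(frozen=True)
class Failure:
    error: str


# The three legal cases; "both result and error" has no constructor at all.
SubtaskState = Union[Pending, Success, Failure]


def from_pair(result: Optional[Any], error: Optional[str]) -> SubtaskState:
    """Convert a Go-style (result, error) pair, rejecting the illegal combination."""
    if result is not None and error is not None:
        raise ValueError("illegal state: both result and error are set")
    if error is not None:
        return Failure(error)
    if result is not None:
        return Success(result)
    return Pending()
```

Once a value has passed through `from_pair`, downstream code only ever sees one of the three cases.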

Well, except the problem is a bit more nuanced because the pending-states also need to be consistent among the different subtasks: there is a dependency graph among them. So you should actually have an enum that says:

    Inventory_Pending
    Inventory_Failure[Error]
    Inventory_OK_Billing_Pending[InventoryData]
    Inventory_OK_Billing_Failure[InventoryData, Error]
    Inventory_OK_Billing_OK_Shipping_Pending[InventoryData, BillingData]
    Inventory_OK_Billing_OK_Shipping_Failure[InventoryData, BillingData, Error]
    Inventory_OK_Billing_OK_Shipping_OK[InventoryData, BillingData, ShippingData]

See, before this you would have had 3x3x3 = 27 representable states for the Order, but we have reduced that to only the 7 legal states. Yay!
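The 27-versus-7 count can be checked by brute force. This sketch assumes the dependency rule is exactly "a subtask may leave Pending only after the previous one succeeded"; the `is_legal` function and the status strings are hypothetical names for illustration.

```python
from itertools import product

STATUSES = ("Pending", "OK", "Failure")


def is_legal(inventory: str, billing: str, shipping: str) -> bool:
    # A subtask may leave Pending only once the one before it has succeeded.
    if billing != "Pending" and inventory != "OK":
        return False
    if shipping != "Pending" and billing != "OK":
        return False
    return True


# Every (inventory, billing, shipping) combination of three independent statuses.
all_states = list(product(STATUSES, repeat=3))
legal_states = [state for state in all_states if is_legal(*state)]

print(len(all_states))    # 27
print(len(legal_states))  # 7
```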

But now consider, e.g., the following change in requirements. For Failure cases, the executives at our company mandate that we never return a failed Order to a Pending status; rather, we must always create a separate Order. This new Order might skip inventory and/or billing, and those cases need to be represented separately, as InventorySkipped[OrderID] or InventoryAndBillingSkipped[OrderID]. So now our list of states, following the "no illegal state is representable" logic, should really be:

    [... the 7 above, plus ...]
    Inventory_Skipped_Billing_Pending[OrderID]
    Inventory_Skipped_Billing_Failure[OrderID, Error]
    Inventory_Skipped_Billing_OK_Shipping_Pending[OrderID, BillingData]
    Inventory_Skipped_Billing_OK_Shipping_Failure[OrderID, BillingData, Error]
    Inventory_Skipped_Billing_OK_Shipping_OK[OrderID, BillingData, ShippingData]
    Inventory_And_Billing_Skipped_Shipping_Pending[OrderID]
    Inventory_And_Billing_Skipped_Shipping_Failure[OrderID, Error]
    Inventory_And_Billing_Skipped_Shipping_OK[OrderID, ShippingData]

Now someone else wants to add remediation actions, but only to remediate the exact error in the failure state. So _Failure is going to mean "no remediation taken", and we need to add a _Remediation case carrying a boolean that says whether that process has completed or not. So we add:

    Inventory_Remediation[Error, Bool, Array[RemediationEvent]]
    Inventory_OK_Billing_Remediation[InventoryData, Error, Bool, Array[RemediationEvent]]
    Inventory_OK_Billing_OK_Shipping_Remediation[InventoryData, BillingData, Error, Bool, Array[RemediationEvent]]
    Inventory_Skipped_Billing_Remediation[OrderID, Error, Bool, Array[RemediationEvent]]
    Inventory_Skipped_Billing_OK_Shipping_Remediation[OrderID, BillingData, Error, Bool, Array[RemediationEvent]]
    Inventory_And_Billing_Skipped_Shipping_Remediation[OrderID, Error, Bool, Array[RemediationEvent]]

We're only up to 21 total states so far (7 + 8 + 6), which is still probably manageable? But these changes do demonstrate exponential growth, which is a technical term meaning that the growth at each step is some fixed fraction of the total size reached up until that point. Because everything depends on Inventory (it's at the root of the tree), when we add a new state that the Inventory can be in (Skipped) we have to add enum cases for all of the downstream states, and we pay a cost proportional to the size of the tree. Similarly, because every subtask can have an error (at the leaves of the tree), when we add a new uniform requirement for errors we have to add new leaves all throughout the tree, and again we pay a cost proportional to the size of the tree. (Another thing to notice about the Remediation state is that it is really a Pending state for another Subtask, one that could have been added to the original Order whenever something moves into Failure mode.)
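To make the counting concrete, here is a rough sketch that regenerates the enum case names from the rules. The `enumerate_states` function and its two flags are made up for this illustration, and the prefix lists are hand-transcribed from the listings above rather than derived from any general principle.

```python
def enumerate_states(allow_skip: bool, allow_remediation: bool) -> list[str]:
    # Statuses at which the chain stops at a given stage.
    stopping = ["Pending", "Failure"]
    if allow_remediation:
        stopping.append("Remediation")

    # For each stage, the ways everything before it can have gone fine.
    stages = [
        ("Inventory", [""]),
        ("Billing", ["Inventory_OK_"]),
        ("Shipping", ["Inventory_OK_Billing_OK_"]),
    ]
    if allow_skip:
        stages[1][1].append("Inventory_Skipped_")
        stages[2][1].extend(
            ["Inventory_Skipped_Billing_OK_", "Inventory_And_Billing_Skipped_"]
        )

    states = []
    for stage, prefixes in stages:
        for prefix in prefixes:
            for status in stopping:
                states.append(f"{prefix}{stage}_{status}")
            if stage == "Shipping":
                # Only the last stage has a terminal OK case of its own.
                states.append(f"{prefix}Shipping_OK")
    return states


for skip, remediation in [(False, False), (True, False), (True, True)]:
    print(len(enumerate_states(skip, remediation)))  # 7, then 15, then 21
```

Each new requirement is a one-line change to the rules here, but it multiplies through every prefix in the generated list, which is the maintenance cost being described.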

You get something powerful by reducing the 256ish-or-whatever representable states (4 subtasks with 4 statuses each is 4^4 = 256) to the 21 legal states: you have a compile-time assurance that no bug in your code has created a weird state that can propagate its weirdness throughout the system. But you also have to maintain the 21 legal states all at once, instead of maintaining 4 subtasks each having one of 4 statuses.