Remix.run Logo
evelant 4 days ago

For a text document a normal CRDT is perfect. They're very good for that specific case. What I tried to solve is eventual consistency that _also_ preserves application semantics. For example a task tracker:

* first update sets task cancelled_at and cancellation_reason

* second update wants the task to be in progress, so sets started_at

CRDT's operate only at the column/field level. In this situation you'd have a task with cancelled_at, cancellation_reason, status in progress, and started_at. That makes no sense semantically, a task can't both be cancelled and in progress. CRDTs do nothing to solve this. My solution is aimed at exactly this kind of thing. Since it replicates _intentions_ instead of just data it would work like this:

action1: setCancelled(reason) action2: setInProgress

When reconciling total order of actions using logical clocks the app logic for setCancelled runs first then setInProgress runs second on every client once they see these actions. The app logic dictates what should happen, which depends on the application. You could have it discard action2. You could also have it remove the cancellation status and set in_progress. It depends on the needs of the application but the application invariants / semantics are preserved and user intentions are preserved maximally in a way that plain CRDTs cannot do.

josephg 4 days ago | parent [-]

Yes; I get all that from the readme. You pick an arbitrary order for operations to happen in. What I don't understand is how that helps when dealing with conflicts.

For example, lets say we have a state machine for a task. The task is currently in the IN_PROGRESS state - and from here it can transition to either CANCELLED or COMPLETE. Either of those states should be terminal. That is to say, once a task has been completed it can't be cancelled and vice versa.

The problem I see with your system is - lets say we have a task in the IN_PROGRESS state. One peer cancels a task and another tries to mark it complete. Lets say a peer sees the COMPLETE message first, so we have this:

    IN_PROGRESS -> COMPLETE
But then a peer sees the CANCEL message, and decides (unambiguously) that it must be applied before the completion event. Now we have this:

    IN_PROGRESS -> CANCELLED (-> COMPLETE ignored)
But this results in the state of the task visibly moving from the COMPLETE to CANCELLED state - which we said above the system should never do. If the task was complete, it can't be cancelled. There are other solutions to this problem, but it seems like the sort of thing your system cannot help with.

In general, CRDTs never had a problem arbitrarily picking a winner. One of the earliest documented CRDTs was a "Last-writer wins (LWW) register" which is a register (ie variable) which stores a value. When concurrent changes happen, the register chooses a winner somewhat arbitrarily. But the criticism is that this is sometimes not the application behaviour what we actually want.

You might be able to model a multi-value (MV) register using your system too. (Actually I'm not sure. Can you?) But I guess I don't understand why I would use it compared to just using an MV register directly. Specifically when it comes to conflicts.

evelant 4 days ago | parent [-]

It does not pick an arbitrary order for operations. They happen in total (known at the time, eventually converging) order across all clients thanks to hybrid logical clocks. If events arrive that happened before events a client already has locally it will roll back to that point in time and replay all of the actions forward in total ordering.

As for the specific scenario, if a client sets a task as COMPLETE and another sets it as CANCELLED before seeing the COMPLETE from the other client here's what would happen.

Client1: { id: 1, action: completeTask, taskId: 123, clock: ...}

Client1: SYNC -> No newer events, accepted by server

Client2: { id: 2, action: cancelTask, taskId: 123, clock: ...}

Client2: SYNC -> Newer events detected.

Client2: Fetch latest events

Client2: action id: 1 is older than most recent local action, reconcile

Client2: rollback to action just before id: 1 per total logical clock ordering

Client2: Replay action { id: 1, action: completeTask, taskId: 123, clock: ...}

Client2: Replay action { id: 2, action: cancelTask, taskId: 123, clock: ...} <-- This is running exactly the same application logic as the first cancelTask. It can do whatever you want per app semantics. In this case we'll no-op since transition from completed -> cancelled is not valid.

Client2: SYNC -> no newer actions in remote, accepted

Client1: SYNC -> newer actions in remote, none local, fetch newer actions, apply action { id: 2, action: cancelTask, ...}

At this point client1, client2, and the central DB all have the same consistent state. The task is COMPLETE. Data is consistent and application semantics are preserved.

There's a little more to it than that to handle corner cases and prevent data growth, but that's the gist of it. More details in the repo.

The great thing is that state is reconciled by actually running your business logic functions -- that means that your app always ends up in a valid state. It ends up in the same state it would have ended up in if the app was entirely online and centralized with traditional API calls. Same outcome but works totally offline.

Does that clarify the idea?

You could argue that this would be confusing for Client2 since they set the task to cancelled but it ended up as complete. This isn't any different than a traditional backend api where two users take incompatible actions. The solution is the same, if necessary show an indicator in the UI that some action was not applied as expected because it was no longer valid.

edit: I think I should improve the readme with a written out example like this since it's a bit hard to explain the advantages of this system (or I'm just not thinking of a better way)