bsaul 7 hours ago

I don't understand the last point about the UDF. Either you need the state to be updated atomically across different systems or you don't. But writing a row in one system in order to update the second one at some arbitrary time in the future isn't really much different from enqueuing a job in a queue.

mrkeen 6 hours ago

Your intuition sounds right to me.

This sounds a lot like reinventing a message queue. Someone trying this in the future might learn painful lessons about ordering, commits, partitioning, dead-letter-queues, replayability, don't-call-me-I'll-call-you, and anything else a Kafka-like comes with out of the box.

KraftyOne 7 hours ago

The key is that the UDF's enqueue is transactional with the database update. Let's say the database update is inserting a new order. This provides the guarantee that if a new order is inserted, a job to process the order is also enqueued. It's impossible for a new order to be inserted without its processing job also being enqueued. Then the durable workflow/queue system is responsible for making sure the processing job, once enqueued, actually executes.
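To make the guarantee concrete, here is a minimal sketch of an insert and an enqueue sharing one transaction. It uses Python's built-in sqlite3 in place of Postgres, and the `orders` and `jobs` tables and all function names are hypothetical, made up for illustration. It is not the DBOS API or the UDF from the article.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical tables: one for the business data, one acting as the job queue.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL, status TEXT NOT NULL)"
    )


def insert_order_and_enqueue(conn, item, fail_before_enqueue=False):
    # `with conn` commits if the block succeeds and rolls back if it raises,
    # so the order row and the job row are written together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        if fail_before_enqueue:
            # Simulate a crash between the two writes.
            raise RuntimeError("simulated crash before enqueue")
        conn.execute(
            "INSERT INTO jobs (order_id, status) VALUES (?, 'pending')",
            (order_id,),
        )
    return order_id


def counts(conn):
    # Return (number of orders, number of jobs).
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    jobs = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    return orders, jobs
```

If the simulated crash fires, the order insert is rolled back along with the missing job, so the two counts never diverge. With an external queue the two writes go to different systems and no single transaction can cover both.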

LgWoodenBadger 5 hours ago

And if that job never runs? Or if that job runs and then fails to commit, in Postgres, that it ran?

halfcat 4 hours ago

The job will run the next time a worker runs (in both cases).

And doesn’t that mean the job potentially runs twice? Yes.

In DBOS there are two kinds of “things that run”: workflows, and steps (workflows are made of steps).

Workflows must be deterministic (so it’s fine if they run twice). Steps don’t have to be deterministic, but they have at-least-once execution (so it’s best if they are idempotent).
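A rough sketch of why idempotent steps make at-least-once execution safe. Everything here is hypothetical (the `PaymentService` class, the job dict, the key format) and none of it is DBOS's actual API. The step runs, the worker "crashes" before recording completion, and the retry repeats the step without charging twice because the downstream call is keyed by an idempotency key.

```python
class PaymentService:
    """Hypothetical downstream system that deduplicates by idempotency key."""

    def __init__(self):
        self.charges = {}

    def charge(self, idempotency_key, amount):
        # A repeated call with the same key is a no-op that returns the
        # originally recorded amount.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount
        return self.charges[idempotency_key]


def run_job(job, service, crash_before_ack=False):
    # The step: charge for the order, keyed by the order id.
    service.charge(f"order-{job['order_id']}", job["amount"])
    if crash_before_ack:
        # The side effect happened, but completion was never recorded,
        # so the job stays pending and will be picked up again.
        raise RuntimeError("simulated crash before recording completion")
    job["status"] = "done"


def total_charged(service):
    # Sum of all distinct charges the service has recorded.
    return sum(service.charges.values())
```

Running `run_job` once with `crash_before_ack=True` and then again normally executes the step twice, yet the service holds a single charge. Without the key, the same retry would bill the order twice.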