lelandbatey 3 days ago

That's a very appealing approach from a developer ergonomics perspective; it'd be very nice to only have to deploy your own application and not also deploy a coordinator.

You mention that you don't have to rewrite your application to work around how DBOS operates. That seems somewhat true, but I think DBOS still requires folks to rewrite their code around a custom runtime. Looking at the Python code on your home page, it seems like you're leveraging Python's decorators to make the "glue code" less prominent (registering functions with the async executor, telling the async system to invoke certain registered functions), but the glue code is still there. If I go look at the DBOS library for Golang[1], for example, since Golang doesn't have decorators the way Python does, we still end up writing code in the "manual callback" style I mentioned:

    // code is massively paraphrased for brevity, err checks removed
    func workflow(dbosCtx dbos.DBOSContext, _ string) (string, error) {
        _, err := dbos.RunAsStep(dbosCtx, func(ctx context.Context) (string, error) { return stepOne(ctx) })
        return dbos.RunAsStep(dbosCtx, func(ctx context.Context) (string, error) { return stepTwo(ctx) })
    }
    func main() {
        // Initialize a DBOS context
        dctx, err := dbos.NewDBOSContext(dbos.Config{ DatabaseURL: "...", AppName: "myapp", })
        // Register a workflow
        dbos.RegisterWorkflow(dctx, workflow)

        // Launch DBOS
        err = dctx.Launch()
        defer dctx.Cancel()

        // Run a durable workflow and get its result
        handle, err := dbos.RunWorkflow(dctx, workflow, "")
        res, err := handle.GetResult()
        fmt.Println("Workflow result:", res)
    }
I don't think that's a bad thing though; I think that's a good thing. Positioning DBOS as a _library_ feels like an excellent choice, and a huge ergonomics improvement. The choices so far suggest you're trying to make DBOS easy to adopt via appropriate amounts of convenience features, but not so much automagic that we-the-devs can't reason about what's going on. With developer reasoning in mind, I have some more questions for you!

In the architecture page you linked[2], you talk about versioning. Versioning with durable workflows is one of those super-annoying things which affect the entire paradigm, albeit only once you've already adopted the tech and start having to change/evolve/maintain workflows. In that doc, you say that with DBOS each application will only work on workflows started by application versions which match the current application version. For completing long-running workflows, the page says:

> To safely recover workflows started on an older version of your code, you should start a process running that code version.

Since one of the killer apps of durable workflows is, as I mentioned, typically long-running jobs, do you have any products/advice/documentation for this pattern of running multiple application versions, and how one might approach implementing this practice? If we're writing code which takes a week to complete and may exit and recover many times before finally completing, do you have advice on how to keep each version deployed until all the work for that version is completed? Looking at Temporal, when using their Worker versioning scheme they offer ways for users to look this information up in Temporal, but not much guidance on actually implementing the pattern. Looking at the DBOS docs about versioning, I see information about getting this information via e.g. Conductor, but I also do not see any info about actually implementing multiple-concurrent-worker-version deployment (which Temporal calls "rainbow deployments"). Is version management something y'all are thinking about improving the ergonomics of, in the same way you improved ergonomics by bringing the executor in-process?

Speaking of versioning, how does DBOS handle bugfix versions? Say you deploy version A, but A has a bug in it. You would like to make the fix, deploy it as version B, and then ideally run the remaining Version A workflows using the Version B code. It seems like "version forking"[3] is the only way to do this, but it also seems like a special operation that cannot be done via a code change; it must be done via the Conductor administration UI. Is there no way to do in-code version patching[4] like is done in Temporal?

Finally, what are the limits to usage of DBOS? As in, where does DBOS start to fall down? Are there guidelines on the maximum number of steps in a workflow before things start to get tricky? What about the maximum serialized size of the workflow/step parameters? I've been unable to find any of that information on your website.

Thanks for making such an interesting piece of technology, and thanks for answering questions!

[1] - https://github.com/dbos-inc/dbos-transact-golang

[2] - https://docs.dbos.dev/architecture

[3] - https://docs.dbos.dev/production/self-hosting/workflow-manag...

[4] - https://docs.temporal.io/develop/go/versioning#patching

qianli_cs 3 days ago

Those are great questions!

For versioning, we recommend keeping each version running until all workflows on that version are done. It's similar to a blue-green deployment: each process is tagged with one version, and all workflows in it share that version. You can list pending/enqueued workflows on the old version (UI or list_workflow programmatic API), and once that list drains, you can shut down the old processes. DBOS Cloud automates this, and we'll add more guidance for self-hosting.
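To make the drain-then-retire pattern concrete, here's a minimal sketch in Go. The `workflowStatus` struct and `canRetire` helper are hypothetical stand-ins; in a real deployment the statuses would come from DBOS's workflow-listing API (the UI or the programmatic list API mentioned above), whose actual types and names may differ:

```go
package main

import "fmt"

// workflowStatus is a hypothetical stand-in for a listed workflow record.
type workflowStatus struct {
	ID         string
	AppVersion string
	Status     string // e.g. "PENDING", "ENQUEUED", "SUCCESS"
}

// canRetire reports whether processes pinned to oldVersion can be shut down:
// true once no pending or enqueued workflows remain on that version.
func canRetire(statuses []workflowStatus, oldVersion string) bool {
	for _, s := range statuses {
		if s.AppVersion == oldVersion && (s.Status == "PENDING" || s.Status == "ENQUEUED") {
			return false
		}
	}
	return true
}

func main() {
	// In practice, fetch these from the workflow-listing API on a timer.
	statuses := []workflowStatus{
		{ID: "wf-1", AppVersion: "v1", Status: "SUCCESS"},
		{ID: "wf-2", AppVersion: "v2", Status: "PENDING"},
	}
	fmt.Println("retire v1:", canRetire(statuses, "v1")) // v1 has drained
	fmt.Println("retire v2:", canRetire(statuses, "v2")) // v2 still has pending work
}
```

A deployment script would poll this check for each retiring version and tear down the old processes only once it returns true, mirroring a blue-green cutover.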

For bugfixes, DBOS supports programmatic forking and other workflow management tools [1]. We deliberately don't support code patching because it's fragile and hard to test. For example, patches can pile up on long-running workflows and make debugging painful.

The main limit is the database (whose size you control). DBOS writes workflow inputs, step outputs, and workflow outputs to it. There's no step limit beyond disk space. Postgres/SQLite allow up to 1 GB per field, but keeping inputs/outputs under ~2 MB helps performance. We'll add clearer guidelines to the docs.

Thanks again for all the thoughtful questions!

[1] https://docs.dbos.dev/python/reference/contexts#fork_workflo...