herpderperator 9 hours ago

Does this help with DuckDB concurrency? My main gripe with DuckDB is that you can't write to it from multiple processes at the same time. If you open the database in write mode with one process, you cannot modify it at all from another process without the first process completely releasing it. In fact, you cannot even read from it from another process in this scenario.

So if you typically use a file-backed DuckDB database in one process and want to quickly modify something in that database using the DuckDB CLI (like you might connect SequelPro or DBeaver to make changes to a DB while your main application is 'using' it), then it complains that it's locked by another process and doesn't let you connect to it at all.

This is unlike SQLite, which supports and handles this in a thread-safe manner out of the box. I know it's DuckDB's explicit design decision[0], but it would be amazing if DuckDB could behave more like SQLite when it comes to this sort of thing. DuckDB has incredible quality-of-life improvements with many extra types and functions supported, not to mention all the SQL dialect enhancements allowing you to type much more concise SQL (they call it "Friendly SQL"), which executes super efficiently too.

[0] https://duckdb.org/docs/current/connect/concurrency
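To make the contrast concrete, here is a minimal sketch of the SQLite behavior the comment describes: a second connection can read and write the same database file while the first connection still holds it open. (This uses two connections in one process for brevity; SQLite's file locking behaves the same across separate OS processes. The file path is a placeholder.)

```python
import os
import sqlite3
import tempfile

# Placeholder path for the shared database file.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# First "process": open the database and enable WAL mode,
# which allows readers to proceed while a writer is active.
writer = sqlite3.connect(path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
writer.execute("INSERT INTO kv VALUES ('a', '1')")
writer.commit()

# Second "process": a separate connection to the same file can
# read AND write without the first connection releasing anything.
other = sqlite3.connect(path)
rows = other.execute("SELECT v FROM kv WHERE k = 'a'").fetchall()
print(rows)  # [('1',)]
other.execute("INSERT INTO kv VALUES ('b', '2')")
other.commit()

writer.close()
other.close()
```

With DuckDB's default single-writer locking, the second `connect` in the equivalent scenario would fail with a lock error instead.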

szarnyasg 9 hours ago | parent | next [-]

Hi, DuckDB DevRel here. To have concurrent read-write access to a database, you can use our DuckLake lakehouse format and coordinate concurrent access through a shared Postgres catalog. We released v1.0 yesterday: https://ducklake.select/2026/04/13/ducklake-10/

I updated your reference [0] with this information.
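For readers unfamiliar with the setup being described: DuckLake is attached from DuckDB with a catalog connection string, and multiple DuckDB processes pointing at the same Postgres catalog can read and write concurrently. A sketch along the lines of the DuckLake docs (database name, host, and data path here are placeholders):

```sql
INSTALL ducklake;
LOAD ducklake;
-- Attach a DuckLake catalog whose metadata lives in Postgres;
-- concurrent DuckDB processes coordinate through that catalog.
ATTACH 'ducklake:postgres:dbname=my_catalog host=localhost' AS lake
    (DATA_PATH 's3://my-bucket/lake/');
USE lake;
CREATE TABLE events (id INTEGER, payload VARCHAR);
INSERT INTO events VALUES (1, 'hello');
```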

nrjames 5 hours ago | parent | next [-]

Regarding documentation, I think the DuckLake docs would benefit from a relatively simple “When should I consider using DuckLake?” type FAQ entry. You have sections for what, how, and why, essentially, and a few simple use cases and/or case studies could help provide the aha moment to people in data jobs who are inundated with marketing from other companies. It would help folks like me understand under which circumstances I would stand to benefit most from using DuckLake.

citguru 2 hours ago | parent | prev [-]

Hi,

DuckLake is great for the lakehouse layer and it's what we use in production. But there's a gap, and that's what I'm trying to address with OpenDuck. DuckLake does solve concurrent access at the lakehouse/catalog level and table management.

But the moment you need to fall back to DuckDB's own compute for things DuckLake doesn't support yet, you're back to a single .duckdb file with exclusive locking. One process writes, nobody else reads.

OpenDuck sits at a different layer. It intercepts DuckDB's file I/O and replaces it with a differential storage engine: append-only layers with snapshot isolation.
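OpenDuck's actual engine isn't shown in the thread; the following is only a toy illustration of the general pattern named here (append-only layers plus snapshot isolation), with every class and method name made up for the example:

```python
# Toy differential store: each write batch becomes a new immutable
# layer, and a snapshot pins the layer count at open time, so a
# reader never observes writes that land after its snapshot.
class LayeredStore:
    def __init__(self):
        self.layers = []  # append-only list of immutable dicts

    def write(self, updates):
        # Writers never modify existing layers; they append.
        self.layers.append(dict(updates))

    def snapshot(self):
        return Snapshot(self, len(self.layers))


class Snapshot:
    def __init__(self, store, depth):
        self.store = store
        self.depth = depth  # number of layers visible to this reader

    def read(self, key):
        # Newest visible layer wins; later layers are invisible.
        for layer in reversed(self.store.layers[: self.depth]):
            if key in layer:
                return layer[key]
        return None


store = LayeredStore()
store.write({"x": 1})
snap = store.snapshot()
store.write({"x": 2})  # a concurrent writer appends a new layer
print(snap.read("x"))  # 1: the old snapshot still sees the old value
print(store.snapshot().read("x"))  # 2: a fresh snapshot sees the new one
```

Because writers only ever append, readers need no locks at all, which is what lets one process write while others read the same database.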

jeadie 6 hours ago | parent | prev | next [-]

This is exactly what we found. Ingest rates were tough. We partitioned and ran over multiple DuckDB instances too (and wrangled the complexity).

We ended up building a SQLite + Vortex file alternative for our use case: https://spice.ai/blog/introducing-spice-cayenne-data-acceler...

wenc 4 hours ago | parent | prev | next [-]

Try DuckLake. They just released a prod version.

You can read/write a Parquet folder on your local drive, managed by DuckLake. Supports schema evolution and versioning too.

Basically SQLite for parquet.
