conradev 2 days ago
> After a client pulls a graft, it knows exactly what’s changed. It can use that information to determine precisely which pages are still valid and which pages need to be fetched.

Curious how this compares to Cloud-Backed SQLite’s manifest: https://sqlite.org/cloudsqlite/doc/trunk/www/index.wiki

It’s similar to your design (sending changed pages), but doesn’t need any compute on the server, which I think is a huge win.
carlsverre 2 days ago
Thanks for bringing that up! Cloud-Backed SQLite (CBS) is an awesome project and, perhaps even more importantly, a lot more mature than Graft. But here is my overview of what's different:

CBS uses manifests and blocks, as you point out. This allows readers to pull a manifest and know which blocks can be reused and which need to be pulled. So from that perspective it's very similar.

The write layer is pretty different, mainly because CBS writes blocks directly from the client, while Graft leverages an intermediate PageStore to handle persistence.

The first benefit of using a middleman is that the PageStore is able to collate changes from many Volumes into larger segments in S3, and soon will compact and optimize those segments over time to improve query performance and eliminate tombstones.

The second benefit is fairly unique to Graft: the written pages are "floating" until they are pinned to an LSN by committing to the MetaStore. This matters when write concurrency increases. If a client's commit is rejected (because it wasn't based on the latest snapshot), it may attempt to rebase its local changes on the latest snapshot. When it does so, Graft's model allows it to reuse any subset of its previously attempted commit in the new commit, in the best case completely eliminating any additional page uploads. I'm excited to experiment with using this to dramatically improve concurrency for non-overlapping workloads.

The third benefit is permissions. When you roll out Graft, you are able to enforce granular write permissions in the PageStore and MetaStore. In comparison, CBS requires clients to have direct access to blob storage. This might work in a server-side deployment, but isn't suited to edge and device use cases where you'd like to embed replicas in the application.
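To make the rebase point a bit more concrete, here is a rough sketch of how a client might decide which already-uploaded pages it can reuse after a rejected commit. This is not Graft's actual code: `PageIdx`, `RebasePlan`, and `plan_rebase` are hypothetical names, and treating "the same page index changed remotely" as the conflict test is a simplifying assumption.

```rust
use std::collections::BTreeSet;

/// Hypothetical page index within a volume (not a Graft type).
type PageIdx = u32;

/// Outcome of planning a rebase after a rejected commit.
#[derive(Debug, PartialEq)]
struct RebasePlan {
    /// Pages already uploaded that can be referenced again as-is.
    reusable: BTreeSet<PageIdx>,
    /// Pages that overlap with remote changes and must be rebuilt and uploaded.
    needs_upload: BTreeSet<PageIdx>,
}

/// Split the pages of a rejected commit into reusable and conflicting sets.
/// Assumption: a page conflicts only if the same index changed remotely.
fn plan_rebase(
    attempted: &BTreeSet<PageIdx>,
    changed_remotely: &BTreeSet<PageIdx>,
) -> RebasePlan {
    RebasePlan {
        reusable: attempted.difference(changed_remotely).copied().collect(),
        needs_upload: attempted.intersection(changed_remotely).copied().collect(),
    }
}
```

If the two sets don't overlap at all, `needs_upload` comes back empty, which corresponds to the best case of zero additional page uploads.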
On the manifest side of the equation: it's true that in CBS a client can simply pull the latest manifest, but when you scale up to many clients and a high-change workload, Graft's compressed bitset approach dramatically reduces how much data clients need to pull. You can think of this as pulling a log vs a snapshot, except for metadata.

Hope that helps clarify the differences!

Oh, and one more petty detail: I really like Rust. :)
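For the "log vs snapshot" idea, a minimal sketch of the client side might look like the following. It is an illustration only: `CommitEntry`, `changed_since`, and `split_cache` are made-up names, and a plain `BTreeSet` stands in for the compressed bitset.

```rust
use std::collections::BTreeSet;

type Lsn = u64;
type PageIdx = u32;

/// One metadata log entry: the pages changed by the commit at `lsn`.
/// Hypothetical type; a BTreeSet stands in for a compressed bitset.
struct CommitEntry {
    lsn: Lsn,
    changed: BTreeSet<PageIdx>,
}

/// Union the changed-page sets of every commit newer than `last_synced`.
fn changed_since(log: &[CommitEntry], last_synced: Lsn) -> BTreeSet<PageIdx> {
    log.iter()
        .filter(|entry| entry.lsn > last_synced)
        .flat_map(|entry| entry.changed.iter().copied())
        .collect()
}

/// Split cached pages into (still valid, stale and needing a fetch).
fn split_cache(
    cached: &BTreeSet<PageIdx>,
    changed: &BTreeSet<PageIdx>,
) -> (BTreeSet<PageIdx>, BTreeSet<PageIdx>) {
    let valid = cached.difference(changed).copied().collect();
    let stale = cached.intersection(changed).copied().collect();
    (valid, stale)
}
```

The client only pulls the entries after its last synced LSN instead of a full manifest, so the amount of metadata transferred scales with what changed rather than with the size of the volume.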
hamandcheese 2 days ago
Woah, hadn't seen this before, but this is really cool! I was recently looking for a way to do a low-scale serverless DB in gcloud, and this might be better than any of their actual offerings. Cloud Firestore seems like the obvious choice, but I couldn't figure out a way to make it work with the existing gcloud credentials that are ubiquitous in our dev and CI environments. Maybe a skill issue.