Remix.run Logo
tomconnors 3 hours ago

Cool stuff as usual, Anders. One of the nice things about running a networked DB is that it makes redeploying the application a bit simpler. You can spin up a new EC2 instance or whatever and once it's online kill the old one. That gets 0 or close to 0 downtime. If the DB is on the same instance, replacing it requires loading up the DB onto the new instance, which seems more error prone than just restarting the app on the original instance, but in my experience that typically incurs downtime or some complicated handoff logic. Have you had to deal with anything like that running sqlite in prd?

andersmurphy 3 hours ago | parent [-]

Oh there are a bunch of considerations.

You're going to want persistent storage on your server, not ephemeral. You'll also want NVME. A lot of the time you're going to end up on bare metal running a single server anyway.

You're going to have down time for migrations unless you're very clever with your schema and/or replicas.

Litestream for me at least is what makes SQLite viable for a web app as prior to that there wasn't a good replication story.

With litestream it's much easier to have a backup on standby. That being said where I have used it in production some amount of downtime has been acceptable so mileage may vary.

kiitos 2 hours ago | parent [-]

> You're going to have down time for migrations unless you're very clever with your schema and/or replicas.

probably worth stating these kinds of design considerations/assumptions up-front

i'm sure lots of applications are fine with "downtime for [database] migrations" but lots more are definitely not, especially those interested in synthetic metrics like TPS

andersmurphy an hour ago | parent [-]

I'd argue the opposite most applications are fine with an hour of downtime a month and arguably much more downtime then that. The recent AWS and Cloudflare outages have proven that.

You can achieve zero downtime with Sqlite if you really need to.

TPS is not a synthetic metric when you cap out at 100 TPS because of Amdahl's law and your users having a power distribution.