kopirgan 3 days ago

As a backend database that's not multi-user, how many web connections that do writes can it realistically handle? Assuming writes are small, say 100+ rows each?

Any mitigation strategy for larger use cases?

Thanks in advance!

loxs 3 days ago | parent | next [-]

After 2 years in production with a small (but write heavy) web service... it's a mixed bag. It definitely does the job, but not having a DB server has not only benefits but also drawbacks, the biggest being the lack of caching of the file/DB in RAM. As a result I have to do my own read caching, which is fine in Rust using the moka caching library, but it's still something you have to do yourself, which would otherwise come for free with Postgres. This of course also makes it impossible to share the cache between instances; doing so would require employing redis/memcached, at which point it would be better to just use Postgres.
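
Roughly, the read cache is just a moka cache sitting in front of the SQLite lookups. A minimal sketch of the idea; the row type, key, and capacity/TTL numbers are all made up:

    use std::time::Duration;
    use moka::sync::Cache;

    // Hypothetical row type; values must be cheap to clone (or wrapped in Arc).
    #[derive(Clone)]
    struct UserRow {
        id: u64,
        name: String,
    }

    fn main() {
        // Illustrative sizing: ~100k entries, each expiring after 5 minutes.
        let cache: Cache<u64, UserRow> = Cache::builder()
            .max_capacity(100_000)
            .time_to_live(Duration::from_secs(300))
            .build();

        let row = cache.get(&42).unwrap_or_else(|| {
            // Cache miss: read the row from SQLite (elided here) and remember it.
            let fresh = UserRow { id: 42, name: "example".into() };
            cache.insert(42, fresh.clone());
            fresh
        });
        println!("{}", row.name);
    }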

It has been OK so far, but I will definitely have to migrate to Postgres at some point, rather sooner than later.

TekMol 3 days ago | parent | next [-]

How would caching on the db layer help with your web service?

In my experience, caching makes the most sense at the CDN layer, which caches not only the DB requests but also the result of the rendering and everything else. So most requests do not even hit your server, and those that do need fresh data anyhow.

loxs 2 days ago | parent [-]

As I said, my app is write heavy. There are several separate processes that constantly write to the database, but before writing they often need to read in order to decide what/where to write. Currently they need their own read cache in order not to clog the database.

The "web service" is only the user facing part which bears the least load. Read caching is useful there too as users look at statistics, so calculating them once every 5-10 minutes and caching them is needed, as that requires scanning the whole database.

A CDN is something I don't even have. It's not needed for the amount of users I have.

If I were using Postgres, these writer processes + the web service would share the same read cache for free (coming from Postgres itself). The difference wouldn't be huge if I migrated right now, since I already have the custom caching.

kopirgan 2 days ago | parent | prev [-]

I am no expert, but SQLite does have an in-memory store, right? At least for tables that need it... of course, syncing the writes from this store back may need more work.
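
Something along these lines is what I had in mind: ATTACH an in-memory database next to the on-disk file for the hot tables. A rough sketch; rusqlite and the session table are just assumptions for illustration:

    use rusqlite::{Connection, Result};

    fn main() -> Result<()> {
        // Regular on-disk database, plus an attached in-memory one for hot tables.
        let conn = Connection::open("app.db")?;
        conn.execute_batch(
            "ATTACH DATABASE ':memory:' AS hot;
             CREATE TABLE hot.session (id INTEGER PRIMARY KEY, data TEXT);",
        )?;

        // Writes to hot.session never touch disk; syncing them back is up to you.
        conn.execute(
            "INSERT INTO hot.session (data) VALUES (?1)",
            rusqlite::params!["example"],
        )?;
        Ok(())
    }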

WJW 3 days ago | parent | prev | next [-]

A couple thousand simultaneous connections should be fine, depending on total system load, whether you're running on spinning disks or SSDs, and your p50/p99 latency demands. And of course you'd need to enable the WAL pragma to allow simultaneous writes in the first place. Run an experiment to be sure about your specific situation.
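
For reference, flipping the journal mode (plus a couple of related knobs) is a one-time setup per connection. A sketch using rusqlite, which is just one way to drive it; the pragma values are the commonly suggested ones, not gospel:

    use std::time::Duration;
    use rusqlite::{Connection, Result};

    fn main() -> Result<()> {
        let conn = Connection::open("app.db")?;

        // journal_mode returns the resulting mode as a row, so read it back.
        let mode: String =
            conn.query_row("PRAGMA journal_mode = WAL;", [], |row| row.get(0))?;
        assert_eq!(mode, "wal");

        // Fewer fsyncs per commit; commonly paired with WAL for write throughput.
        conn.execute_batch("PRAGMA synchronous = NORMAL;")?;

        // Wait up to 5 seconds for the write lock instead of failing immediately.
        conn.busy_timeout(Duration::from_secs(5))?;
        Ok(())
    }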

laurencerowe 2 days ago | parent [-]

You also need BEGIN CONCURRENT to allow simultaneous write transactions.

https://www.sqlite.org/src/doc/begin-concurrent/doc/begin_co...
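
For reference, it looks like an ordinary transaction with a different opener. It only exists in the begin-concurrent branch linked above (not in stock SQLite releases), it requires WAL mode, and conflicts are detected at COMMIT, which then fails with a busy error and has to be retried. A sketch with a made-up accounts table, assuming rusqlite on top of such a build:

    use rusqlite::{Connection, Result};

    fn transfer(conn: &Connection) -> Result<()> {
        // Page locks are taken optimistically: transactions touching disjoint
        // pages can run in parallel, while a conflict makes COMMIT fail with a
        // busy error, after which the caller retries the whole transaction.
        conn.execute_batch(
            "BEGIN CONCURRENT;
             UPDATE accounts SET balance = balance - 10 WHERE id = 1;
             UPDATE accounts SET balance = balance + 10 WHERE id = 2;
             COMMIT;",
        )
    }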

TekMol 3 days ago | parent | prev [-]

Why have multiple connections in the first place?

If your writes are fast, doing them serially does not cause anyone to wait.

How often does the typical user write to the DB? Often it is like once per day or so (for example on hacker news). Say the write takes 1/1000s. Then you can serve

    1000 * 60 * 60 * 24 = 86,400,000 ≈ 86 million users per day

And nobody has to wait longer than a second when they hit the "reply" button, as I'm doing right now...
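
In practice "doing them serially" can be as simple as one writer thread owning the SQLite connection and draining a channel. A minimal sketch; the WriteJob type and the apply_write step are placeholders:

    use std::sync::mpsc;
    use std::thread;

    // Placeholder for whatever a request handler wants to persist.
    struct WriteJob {
        sql: String,
    }

    fn apply_write(job: &WriteJob) {
        // Imagine the single SQLite connection executing job.sql here.
        let _ = &job.sql;
    }

    fn main() {
        let (tx, rx) = mpsc::channel::<WriteJob>();

        // One writer owns the database; requests just enqueue and move on.
        let writer = thread::spawn(move || {
            for job in rx {
                apply_write(&job);
            }
        });

        // Any request handler can clone `tx` and queue a write without blocking.
        tx.send(WriteJob { sql: "INSERT ...".into() }).unwrap();

        drop(tx); // close the channel so the writer thread exits
        writer.join().unwrap();
    }
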
frje1400 3 days ago | parent | next [-]

> If your writes are fast, doing them serially does not cause anyone to wait.

Why impose such a limitation on your system when you don't have to? You could just use some other database actually designed for multi-user systems (Postgres, MySQL, etc.).

TekMol 3 days ago | parent [-]

Because development and maintenance are faster and the system is easier to reason about, increasing the chances you actually get to 86 million daily active users.

frje1400 2 days ago | parent [-]

So in this solution, you run the backend on a single node that reads/writes from an SQLite file, and that is the entire system?

withinboredom 2 days ago | parent [-]

That's basically how the web started. You can serve a ridiculous number of users from a single physical machine. It isn't until you get into the hundreds-of-millions-of-users ballpark that you need to actually create architecture. The "cloud" lets you rent a small slice of a physical machine, so it actually feels like you need more machines than you do. But a modern server? Easily 16-32+ cores, 128+ GB of RAM, and hundreds of TB of space, all for less than $2k per month (amortized). Yeah, you need an actual (small) team of people to manage that, but it will get you so far that it is utterly ridiculous.

Assuming you can accept 99% uptime (that's ~3.5 days a year of downtime), and if you were on a single cloud in 2025, that's basically what last year gave you anyway.

kopirgan 2 days ago | parent [-]

I agree...there is scale and then there is scale. And then there is scale like Facebook.

We need not assume internet/FB-level scale for typical business apps where one instance may support a few hundred users max, or even a few thousand. Over-engineering under such assumptions is likely cost-ineffective and may even increase the surface area of risk. $0.02

downsplat a day ago | parent [-]

It goes much further than that... a single moderately sized VPS web server can handle millions of hard-to-cache requests per day, all hitting the DB.

Most will want to use a managed DB, but for a really basic setup you can just run Postgres or MySQL on the same box. And running your own DB on a separate VPS is not hard either.

kopirgan 2 days ago | parent | prev | next [-]

That depends on the use case; HN is not a good example. I am referring to business applications where users submit data. Of course in these cases we are looking at hundreds, not millions, of users. The answer is good enough.

nijave 2 days ago | parent | prev [-]

>How often does the typical user write to the DB

Turns out it's a lot when you have things like "last accessed" timestamps on your models.

Really depends on the app

I also don't think that calculation is valid. Your users aren't going to access the app uniformly over the course of a day. Invariably you'll hit queuing delays at a significantly smaller user count (though maybe the delays are acceptable).