MatthiasPortzel 4 hours ago

One key thing to understand about TigerBeetle is that it's a file-system-backed database. Static allocation means they limit the number of resources in memory at once (number of connections, number of records that can be returned from a single query, etc). One of the points is that these things are limited in practice anyways (MySQL and Postgres have a simultaneous connection limit, applications should implement pagination). Thinking about and specifying these limits up front is better than having operations time out or OOM. On the other hand, TigerBeetle does not impose any limit on the amount of data that can be stored in the database.

=> https://tigerbeetle.com/blog/2022-10-12-a-database-without-d...

It's always bad to use O(N) memory if you don't have to. With a FS-backed database, you don't have to. (Whether you're using static allocation or not. I work on a Ruby web-app, and we avoid loading N records into memory at once, using fixed-sized batches instead.) Doing allocation up front is just a very nice way of ensuring you've thought about those limits, and making sure you don't slip up, and avoiding the runtime cost of allocations.
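A minimal sketch of the fixed-size-batch idea, in Python rather than Ruby. `in_batches` is a hypothetical helper for illustration, not part of TigerBeetle or any particular ORM; it assumes the records arrive as a lazy iterable (say, a database cursor), so only one batch is held in memory at a time.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most `batch_size` records from a lazy source."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(records)
    while True:
        # Pull at most batch_size items; memory use is bounded by
        # batch_size, not by the total number of records.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Stand-in for a cursor over N rows: a generator, never materialized.
    rows = (i for i in range(10))
    for batch in in_batches(rows, 4):
        print(len(batch), batch)
```

The batch size plays the same role as a statically chosen limit: it is picked up front, and peak memory stays the same however large N grows.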

This is totally different from OP's situation, where they're implementing an in-memory database. This means that 1) they've had to impose a limit on the number of kv-pairs they store, and 2) they're paying the cost for all kv-pairs at startup. This is only acceptable if you know you have a fixed upper bound on the number of kv-pairs to store.
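To make the two costs concrete, here is a rough sketch of a fixed-capacity in-memory store. This is not OP's implementation; `FixedCapacityKV` and `StoreFull` are hypothetical names, and Python lists only pre-allocate the slot references, so it models the shape of the trade-off rather than true static allocation.

```python
class StoreFull(Exception):
    """Raised when every pre-allocated slot is already in use."""


_EMPTY = object()  # sentinel marking an unused slot


class FixedCapacityKV:
    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Cost 2: every slot is paid for at startup, used or not.
        self._keys = [_EMPTY] * capacity
        self._values = [None] * capacity

    def _slot(self, key):
        """Linear probing: return the key's slot, the first empty slot,
        or None if the table is full and the key is absent."""
        i = hash(key) % self.capacity
        for _ in range(self.capacity):
            if self._keys[i] is _EMPTY or self._keys[i] == key:
                return i
            i = (i + 1) % self.capacity
        return None

    def put(self, key, value) -> None:
        i = self._slot(key)
        if i is None:
            # Cost 1: a hard limit on the number of kv-pairs.
            raise StoreFull(f"capacity of {self.capacity} reached")
        self._keys[i] = key
        self._values[i] = value

    def get(self, key, default=None):
        i = self._slot(key)
        if i is None or self._keys[i] is _EMPTY:
            return default
        return self._values[i]
```

Once `capacity` distinct keys are stored, `put` with a new key raises `StoreFull`, while updates to existing keys still succeed. Deletion is left out to keep the sketch short.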

matklad 3 hours ago

Yes, very good point, thanks!

As a tiny nit, TigerBeetle isn't a _file system_ backed database: we intentionally limit ourselves to a single "file", and can work with a raw block device or partition, without file system involvement.

fsckboy 2 hours ago

>we intentionally limit ourselves to a single "file", and can work with a raw block device or partition, without file system involvement

those features all go together as one thing. and it's the unix way of accessing block devices (and their interchangeability with streams from the client software perspective)

you're right, it's not the file system.

levkk 4 hours ago

That makes sense. For example, your Redis instance will have fixed RAM, so you might as well pre-allocate it at boot and avoid fragmentation.

Memcached works similarly (slabs of fixed size), except they are not pre-allocated.

If you're sharing hardware with multiple services, e.g. web, database, cache, the kind of performance this is targeting isn't a priority.