andersmurphy 2 days ago

With the trend towards immutable, single-writer databases, mmap seems like a massive win.

mtndew4brkfst 2 days ago | parent [-]

Andy is very critical of using mmap in database implementations.

hyc_symas 14 hours ago | parent | next [-]

Andy's critiques are only valid on dedicated database servers.

https://www.symas.com/post/are-you-sure-you-want-to-use-mmap...

LMDB uses mmap and Andy recommends LMDB, in the very article this thread is about.

andersmurphy 2 days ago | parent | prev [-]

Why? SQLite and LMDB make fantastic use of it. For anyone doing a single-writer DB it's a no-brainer. It does so much for you and it does it very well. All the things you don't have to implement because it does them for you:

- Reading the data from disk

- Concurrency between different threads reading the same data

- Caching and buffer management

- Eviction of pages from memory

- Playing nice with other processes in the machine

Why would you not leverage it? It's such a great fit for scaling reads.
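
To make the list above concrete, here is a minimal sketch of a read path built on mmap, using Python's standard `mmap` module. The file name, the 4096-byte `PAGE` size, and the `read_page` helper are assumptions for illustration only; this is not how SQLite or LMDB lay out their files.

```python
import mmap
import os
import tempfile

PAGE = 4096  # assumed page size for this illustration


def read_page(mm, page_no):
    """Return one page as bytes; the OS faults it in from disk on first touch."""
    start = page_no * PAGE
    return mm[start:start + PAGE]


# Build a small throwaway file of 4 pages, each filled with its own page number.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
with open(path, "wb") as f:
    for i in range(4):
        f.write(bytes([i]) * PAGE)

# Map the whole file read-only. There is no explicit read() call, no
# user-space buffer pool, and no eviction logic: the kernel page cache
# handles all of that.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    page2 = read_page(mm, 2)
    mm.close()
```

Any number of reader threads or processes could map the same file this way and share the same cached pages.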

hyc_symas 14 hours ago | parent | next [-]

Fun footnote: SQLite only got on board with mmap after I demonstrated how slow their code was without it. I.e., getting a 22x speedup by replacing SQLite's btree code with LMDB https://github.com/LMDB/sqlightning

cmrdporcupine a day ago | parent | prev | next [-]

The strongest argument as far as I can see it is... the problem is you now lose control over all those things. It's a black box with effectively no knobs.

Anyway, read for yourself. Pavlo & Leis get into it in detail, and there are benchmarks:

https://db.cs.cmu.edu/papers/2022/cidr2022-p13-crotty.pdf

https://db.cs.cmu.edu/mmap-cidr2022/

andersmurphy 18 hours ago | parent [-]

What am I missing? The transactional safety problem (the bulk of the paper) is solved simply with a single writer. Which is where you want to be anyway for efficient batching throughput (and isolation).
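
A rough sketch of the single-writer idea, assuming an in-memory dict as a stand-in for on-disk pages: one thread owns all writes, drains whatever is queued into a single batch, and publishes the result by swapping a reference. The `SingleWriterStore` class and its methods are hypothetical names for illustration, not the API of LMDB or any other database mentioned here.

```python
import queue
import threading


class SingleWriterStore:
    """Hypothetical store: many readers, exactly one writer thread."""

    def __init__(self):
        self._snapshot = {}  # readers only ever read this; never mutated in place
        self._inbox = queue.Queue()
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def get(self, key):
        # Readers take no lock: they see whichever snapshot is current.
        return self._snapshot.get(key)

    def put(self, key, value):
        # Writes are queued; the returned event fires once the write is committed.
        done = threading.Event()
        self._inbox.put((key, value, done))
        return done

    def _run(self):
        while True:
            batch = [self._inbox.get()]  # block until at least one write arrives
            while True:
                try:
                    batch.append(self._inbox.get_nowait())  # drain the rest
                except queue.Empty:
                    break
            new = dict(self._snapshot)  # copy-on-write of the current state
            for key, value, _ in batch:
                new[key] = value
            self._snapshot = new  # one reference swap commits the whole batch
            for _, _, done in batch:
                done.set()


store = SingleWriterStore()
pending = [store.put(f"k{i}", i) for i in range(100)]
for ev in pending:
    ev.wait(timeout=5)
```

Because only one thread ever builds a new snapshot, there are no write-write conflicts to resolve, and a burst of queued writes is committed together.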

The other concerns seem to imply there are no other programs running on the same machine as the database. The minute that's not true (is it ever true?), the OS will do a better job (as seen with LMDB etc.).

I think it's telling that the paper focuses on MongoDB, not LMDB.

alexpadula 2 days ago | parent | prev [-]

“It's such a great fit for scaling reads.”

And losing them.

andersmurphy a day ago | parent [-]

How so? LMDB, boltdb/bbolt and SQLite (with mmap) are all rock solid. Just because MongoDB used mmap badly does not make it any less valuable.