toast0 3 days ago

Overcommit is subtle. If you allocate a bunch of address space and don't touch it, that's one thing.

If you allocate and touch everything, and then try to allocate more, it's better to get an allocation error than an unsatisfiable page fault later.
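To make the "allocate" versus "touch" distinction concrete, here is a minimal sketch, assuming Linux or another system with anonymous `mmap()`. The helper names `reserve` and `touch_pages` are made up for illustration. Reserving the range only hands out address space; on a typical default configuration, physical pages are assigned when each page is first written.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve `len` bytes of anonymous, zero-filled memory.
   This only claims address space; nothing is written here.
   Returns NULL if the kernel refuses the reservation. */
void *reserve(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte to every page in the first `len` bytes of `base`,
   forcing the kernel to find a real page for each one.
   Returns the number of pages written. */
size_t touch_pages(void *base, size_t len) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = base;
    size_t touched = 0;
    for (size_t off = 0; off < len; off += page) {
        p[off] = 1;
        touched++;
    }
    return touched;
}
```

With overcommit, `reserve()` can succeed for far more than the machine can back, and the failure only shows up somewhere inside `touch_pages()`.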

My understanding (which could very well be wrong) is Linux overcommit will continue to allocate address space when asked regardless of memory pressure; but FreeBSD overcommit will refuse allocations when there's too much memory pressure.

I'm pretty sure I've seen FreeBSD's OOM killer, but it needs a specific pattern of memory use. It's much more likely for an application to get a failed allocation and exit, freeing memory, than for all the applications to have unused allocations that they then start using.

All that said, I prefer to run with a small swap, somewhere around 0.5-2GB. Memory pressure is hard to measure (although recent Linux has a measure that I haven't used), but swap % and swap I/O are easy to measure. If your swap grows quickly, you might not have time to do any operations to fix it, but your stats should tell the tale. If your swap grows slowly enough, you can set thresholds and analyze the situation. If you have a lot of swap I/O, that provides a measure of urgency.
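As a rough sketch of the "swap %" measurement, assuming Linux and its `/proc/meminfo` format (FreeBSD exposes this differently), something like the following would do. The function names are hypothetical, and the parser takes the text as an argument so it can be checked against a canned sample.

```c
#include <stdio.h>
#include <string.h>

/* Find "Key:   12345 kB" in /proc/meminfo-style text and return the number.
   Returns -1 if the key is missing or malformed. */
long meminfo_kb(const char *text, const char *key) {
    size_t klen = strlen(key);
    const char *p = text;
    while (p != NULL && *p != '\0') {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            long value;
            return sscanf(p + klen + 1, "%ld", &value) == 1 ? value : -1;
        }
        p = strchr(p, '\n');   /* advance to the start of the next line */
        if (p != NULL)
            p++;
    }
    return -1;
}

/* Percentage of swap in use, or -1.0 if there is no swap or the
   SwapTotal/SwapFree fields are missing or inconsistent. */
double swap_used_percent(const char *text) {
    long total = meminfo_kb(text, "SwapTotal");
    long free_kb = meminfo_kb(text, "SwapFree");
    if (total <= 0 || free_kb < 0 || free_kb > total)
        return -1.0;
    return 100.0 * (double)(total - free_kb) / (double)total;
}

/* Linux-specific: read /proc/meminfo into buf. Returns 0 on success. */
int read_meminfo(char *buf, size_t cap) {
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL || cap == 0) {
        if (f != NULL)
            fclose(f);
        return -1;
    }
    size_t n = fread(buf, 1, cap - 1, f);
    buf[n] = '\0';
    fclose(f);
    return n > 0 ? 0 : -1;
}
```

Sampling `swap_used_percent()` on a timer and alerting on a threshold is the easy part; the growth rate between samples is what tells you whether you have time to react.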

jcalvinowens 3 days ago | parent [-]

> If you allocate and touch everything, and then try to allocate more, it's better to get an allocation error than an unsatisfiable page fault later.

It depends, but generally speaking I'd disagree with that.

The only time you actually want to see the allocation failures is if you're writing high reliability software where you've gone to the trouble to guarantee some sort of meaningful forward progress when memory is exhausted. That is VERY VERY hard, and quickly becomes impossible when you have non-trivial library dependencies.

If all you do is raise std::bad_alloc or call abort(), handling NULL return from malloc() is arguably a waste of icache: just let it crash. Dereferencing NULL is guaranteed to crash on Linux; only root can mmap() the lowest page.

Admittedly I'm anal, and I write the explicit code to check for it and call abort(), but I know very experienced programmers I respect who don't.
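For reference, the explicit check is only a few lines. This is a generic sketch of the pattern, not code from any particular project; `xmalloc` is just the conventional name for such a wrapper.

```c
#include <stdio.h>
#include <stdlib.h>

/* malloc() that never returns NULL for a nonzero request:
   on failure it reports the size and aborts. */
void *xmalloc(size_t n) {
    void *p = malloc(n);
    if (p == NULL && n != 0) {
        fprintf(stderr, "xmalloc: failed to allocate %zu bytes\n", n);
        abort();
    }
    return p;
}
```

The practical gain over letting the NULL dereference crash is mostly the diagnostic: the abort happens at the allocation site with the requested size, rather than wherever the pointer is first used.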

toast0 3 days ago | parent [-]

> If all you do is raise std::bad_alloc or call abort(), handling NULL return from malloc() is arguably a waste of icache: just let it crash. Dereferencing NULL is guaranteed to crash on Linux; only root can mmap() the lowest page.

If you don't care to handle the error, which is a totally reasonable position, there's not a whole lot of difference between the allocator returning a pointer that will make you crash on use because it's zero, and a pointer that will make you crash on use because there are no pages available. There is some difference because if you get the allocation while there are no pages available, the fallible allocator has returned a permanently dead pointer and the unfailing allocator has returned a pointer that can work in the future.

But if you do want to respond to errors, it is easier to respond to a NULL return rather than to a failed page fault. I certainly agree it's not easy to do much other than abort in most cases, but I'd rather have the opportunity to try.
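One simple way to "try", sketched here under the assumption that the program can make do with a smaller buffer: retry with half the size until something succeeds. `alloc_with_fallback` is a hypothetical helper, and it only helps when the allocator actually reports failure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Try to allocate `want` bytes, halving the request on each failure
   until it drops below `min`. On success, stores the size actually
   obtained in *got. On failure, returns NULL and sets *got to 0. */
void *alloc_with_fallback(size_t want, size_t min, size_t *got) {
    if (min == 0)
        min = 1;   /* never loop on zero-sized requests */
    for (size_t size = want; size >= min; size /= 2) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
    }
    *got = 0;
    return NULL;
}
```

Nothing comparable is available at page-fault time: by then the program is in the middle of a store instruction, not at a call site that can choose a different size.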

jcalvinowens 2 days ago | parent [-]

> But if you do want to respond to errors, it is easier to respond to a NULL return rather than to a failed page fault.

It's just inherently incompatible with overcommit, isn't it? Like you can mmap() directly and use MAP_POPULATE|MAP_LOCKED to get what you want*, but that defeats overcommit entirely.

I guess I can imagine a syscall that takes a pointer and says "fault this page please but return an error instead of killing me if you can't", but there's an unavoidable TOCTOU problem in that it could be paged out again before you actually touch it.

A zany idea is to write a custom malloc() that uses userfaultfd to allow overcommit in userspace with it disabled in the kernel. The benefit being that userspace gets to decide what to do if a fault can't be satisfied instead of getting killed. But that would be pretty complex, and I don't know what the performance would look like.

* EDIT: Actually the manpage implies some ambiguity about whether MAP_LOCKED|MAP_POPULATE is guaranteed to avoid the first major fault, it might need mmap()+mlock(), I'd have to look more carefully...
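A minimal sketch of the mmap()+mlock() variant, assuming Linux. As I read the mmap(2) manpage, MAP_LOCKED does not make mmap() fail with ENOMEM if populating the range fails, and it suggests mmap() plus mlock() when later major faults are unacceptable. `map_resident` is a made-up name.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of anonymous memory and pin it in RAM.
   Returns NULL with errno set if the pages cannot be mapped or locked. */
void *map_resident(size_t len) {
    /* MAP_POPULATE asks the kernel to prefault the range, but mmap()
       does not report a failure to do so. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* mlock() does report failure (for example ENOMEM, EPERM or EAGAIN),
       typically because of RLIMIT_MEMLOCK or a lack of memory. */
    if (mlock(p, len) != 0) {
        int saved = errno;   /* munmap() may overwrite errno */
        munmap(p, len);
        errno = saved;
        return NULL;
    }
    return p;
}
```

This gives an error return instead of a fatal fault, but as noted above it opts the mapping out of overcommit (and out of swap), so it is only reasonable for memory you know you will use.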

toast0 a day ago | parent [-]

> It's just inherently incompatible with overcommit, isn't it?

It's true that if overcommit is enabled, you can't guarantee you won't end up with a page fault that can't be satisfied.

But my experience on FreeBSD, which has overcommit enabled by default and returns NULL when asked for allocations that can't be (currently) satisfied, is that most of the time you get a NULL allocation rather than an unsatisfied page fault.

What typically happens is a program grows to use beyond available memory (and swap), and it does so by allocating large but manageable chunks, using them, and then repeating. At a certain point the OS struggles but is typically able to find a page for each fault; the next large allocation looks too big, though, so the allocation fails and the program aborts.

But sometimes a program changes its usage pattern and starts using allocations that had been unused. In that case, you can still trigger the fatal page faults, because overcommit let you allocate more than is there.

If you don't want to have both scenarios, you can choose to eliminate the possibility of NULL by strictly allowing all allocations (although you could run out of address space and get a NULL at that point) or you can choose to eliminate the possibility of an unsatisfied page fault by strictly disallowing overcommit. I prefer having NULL when possible, and unsatisfied page faults when not.
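On Linux, the three choices roughly correspond to the `vm.overcommit_memory` sysctl: 0 is the default heuristic, 1 always allows, 2 does strict accounting. FreeBSD has its own knob and I'm not showing it here. A small sketch for checking which mode a machine is in, with hypothetical function names:

```c
#include <stdio.h>

/* Linux-specific: read vm.overcommit_memory from procfs.
   Returns 0, 1 or 2, or -1 if the file is missing or unparsable. */
int read_overcommit_mode(void) {
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Short label for each documented mode. */
const char *describe_overcommit(int mode) {
    switch (mode) {
    case 0:  return "heuristic: refuse only obvious overcommit";
    case 1:  return "always: never refuse for lack of memory";
    case 2:  return "never: strict accounting, no overcommit";
    default: return "unknown";
    }
}
```

Mode 1 is the "no NULL" end of the trade-off and mode 2 is the "no unsatisfied page fault" end; the heuristic default sits in between and can produce either failure.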