Remix.run Logo
inkyoto 2 days ago

> […] because after the fork writes to basically any page in either process will trigger memory commitment.

This is largely not true for most processes. For a child process to start writing into its own data pages en masse, there has to exist a specific code path that causes such behaviour. Processes do not randomly modify their own data space – it is either a bug or a peculiar workload that causes it.

You would have a stronger case if you mentioned, e.g., the JVM, which has a high complexity garbage collector (rather, multiple types of garbage collectors – each with its own behaviour), but the JVM ameliorates the problem by attempting to lock in the entire heap size at startup or bailing if it fails to do so.

In most scenarios, forking a process has a negligible effect on the overall memory consumption in the system.

minitech 2 days ago | parent [-]

> This is largely not true for most processes.

> In most scenarios, forking a process has a negligible effect on the overall memory consumption in the system.

Yes, that’s what they’re getting at. It’s good overcommitment. It’s still overcommitment, because the OS has no way of knowing whether the process has the kind of rare path you’re talking about for the purposes of memory accounting. They said that disabling overcommit is wasteful, not that fork is wasteful.

barchar 2 days ago | parent | next [-]

Yep. If you aren't overcommitting on fork it's quite wasteful, and if you are overcommitting on fork then you've already given up on not having to handle oom conditions after malloc has returned.

inkyoto a day ago | parent | prev [-]

> […] the purposes of memory accounting.

This is a crucial distinction and I agree when the problem is framed this way.

The original statement by another GP, however, was that fork(2) is wasteful (it is not).

In fact, I have mentioned it in a sister thread that the OS does not have a way to know of the kind of behaviour the parent or the child will exhibit after forking[0].

Generally speaking, this is in line with the foundational ethos of the UNIX philosophy where UNIX gives its users a wide array of tools tantamount to shotguns that shoot both forward and backward simultaneously and the responsibility for with the number of deaths and permanent maimings ultimately lies with its users. In comparison, memory management in operating systems that run mainframes is substantially more complex and sophisticated.

[0] In a separate thread, somebody else has mentioned a valid reverse scenario where the child idles by after forking and it is the parent that makes its data pages dirty causing the physical memory consumption to baloon.