hparadiz 2 hours ago

Maybe tangentially related, but I always think it's silly that every Linux process has the same libgcc_s.so.1 loaded into its memory even though the raw binary for the library is exactly the same, so you end up with like 800 copies of libgcc_s.so.1 in memory.

I mean maybe this has been optimized for already and I don't know what I'm talking about but maybe someone with more knowledge about the kernel knows? Is this something we simply can't optimize for because of security implications?

201984 2 hours ago | parent | next [-]

Shared libraries (and mmapped files in general) are deduplicated; it's nowhere near as bad as you think. The kernel loads a .so into memory once and then maps that memory into every process that mmaps it.

Editing to add: this deduplication is one of the greatest upsides to dynamic linking. Common libs like libgcc and libc only have to exist in memory once and can stay in CPU caches, whereas if they were statically linked into every binary, each binary would have a copy of that library that wouldn't be shared with anything else and you'd waste a lot of memory.
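
For anyone who wants to see these mappings for themselves, here is a minimal Python sketch, assuming a Linux system with /proc mounted. The helper name `parse_maps` is made up for this illustration. It groups the lines of a `/proc/<pid>/maps` listing by backing file and collects the permission string of each segment.

```python
import os
from collections import defaultdict


def parse_maps(text):
    """Group /proc/<pid>/maps lines by backing shared object.

    Returns {path: [perms, ...]} for every mapping whose path contains ".so".
    parse_maps is a hypothetical helper name, not part of any library.
    """
    by_file = defaultdict(list)
    for line in text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no backing path
        perms, path = fields[1], fields[5].strip()
        if ".so" in path:
            by_file[path].append(perms)
    return dict(by_file)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as f:
        for path, perms in parse_maps(f.read()).items():
            print(path, perms)
```

The trailing `p` in a permission string such as `r-xp` means the mapping is private (copy-on-write), not that the pages are duplicated. Read-only and executable segments are backed by the page cache and stay shared until something writes to them.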

sjmulder an hour ago | parent [-]

Doesn't the loaded code have to be patched for relocations?

ptspts an hour ago | parent | next [-]

It does, so not 100% is reused. The patched parts are in different sections though, so the entire .text (code) section ends up being reused.

monocasa an hour ago | parent | prev | next [-]

Not on modern archs that provide decent support for PIE (position independent executables).

201984 21 minutes ago | parent [-]

How do you think position independent code can call functions from other .so's without being patched with their addresses?

It can't, so even PIC still needs a table of addresses (the GOT) that gets patched at load time. It's in a different page than the code though, so code does still get reused.
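
One way to check the "patched pages are separate from code pages" claim is to compare the per-mapping counters in `/proc/<pid>/smaps`. The sketch below assumes Linux and the usual smaps layout, where a header line is followed by `Key: value kB` lines. `summarize_smaps` is a hypothetical helper name.

```python
def summarize_smaps(text, needle):
    """Sum Shared_Clean and Private_Dirty (in kB) per (path, perms) mapping.

    Only mappings whose path contains `needle` are kept. Paths containing
    spaces are truncated at the first space, which is fine for a sketch.
    """
    wanted = ("Shared_Clean", "Private_Dirty")
    totals = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if not fields[0].endswith(":"):
            # Header line: address range, perms, offset, device, inode, path
            path = fields[5] if len(fields) > 5 else ""
            current = (path, fields[1]) if needle in path else None
            if current is not None:
                totals.setdefault(current, dict.fromkeys(wanted, 0))
        elif current is not None:
            key = fields[0][:-1]
            if key in wanted:
                totals[current][key] += int(fields[1])
    return totals


if __name__ == "__main__":
    try:
        with open("/proc/self/smaps") as f:
            for mapping, counters in summarize_smaps(f.read(), "libc").items():
                print(mapping, counters)
    except OSError:
        print("no readable /proc/self/smaps on this system")
```

On a typical system I would expect the `r-xp` segment to show no `Private_Dirty` pages and the `rw-p` segment to show a few, but the exact numbers depend on the machine. Also note that pages mapped by only one process are reported under `Private_Clean` rather than `Shared_Clean`.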

t-3 an hour ago | parent | prev [-]

Not if it's position-independent.

saidinesh5 2 hours ago | parent | prev | next [-]

Typically libgcc_s.so is loaded by the dynamic linker, which uses an mmap call to map the binary into the address space.

> The kernel keeps track of which file is mapped where, and can detect when a request is made to map an already mapped file again, avoiding physical memory allocation if possible.

Relevant stack overflow answer: https://stackoverflow.com/questions/61950951/linux-shared-li...

mlaretallack 2 hours ago | parent | prev | next [-]

In Linux, when a shared lib is loaded by multiple processes, it's loaded once and not duplicated in RAM. A memory page is duplicated only if a process modifies it (copy-on-write). (Hope I have explained that correctly)
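
The copy-on-write behaviour can be seen with a private file mapping. This is a small Python sketch, not how the dynamic linker itself is implemented. It uses the standard `mmap` module with `ACCESS_COPY`, which requests a private copy-on-write mapping. The function name is made up for this example.

```python
import mmap
import os
import tempfile


def private_write_leaves_file_untouched():
    """Write through a copy-on-write mapping and compare with the file on disk."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"shared page contents")
        # ACCESS_COPY: writes affect this process's memory only, never the file
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as m:
            m[0:6] = b"SHARED"
            in_mapping = bytes(m[:6])
        with open(path, "rb") as f:
            on_disk = f.read(6)
        return in_mapping, on_disk
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(private_write_leaves_file_untouched())
```

The write is visible through the mapping but never reaches the file, so any other process mapping the same file keeps seeing the original bytes. Only the page that was written gets its own private copy.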

monocasa 2 hours ago | parent | prev | next [-]

Those mappings by default all go to the same shared memory.

Unices have been sharing executable memory between processes for longer than mmap has existed to let user space do the same thing itself. I remember seeing it in the 2BSD kernel for instance.

BoingBoomTschak 2 hours ago | parent | prev | next [-]

Eh? Aren't shared libraries actually shared in memory?

1718627440 an hour ago | parent [-]

Yeah, that's kind of the point.

sirsinsalot 2 hours ago | parent | prev [-]

I have a rule for myself. If I think something is silly or stupid, I assume I don't understand it. I usually find I do not understand it, and it no longer seems silly when I do understand it.

In this case too, you think it is silly because you don't understand it. Your assumptions are wrong, making it seem silly.