nneonneo 15 hours ago

To be fair, this is with debug symbols. Debug builds of Chrome were in the 5GB range several years ago; no doubt that’s increased since then. I can remember my poor laptop literally running out of RAM during the linking phase due to the sheer size of the object files being linked.

Why are debug symbols so big? For C++, they include detailed type information for every instantiation of every type in your program, including the types of every field (recursively) and every method signature, along with the types and locations of local variables in every method (updated on every spill and move) and line number data, all repeated for every specialization of every function. This produces a lot of data even for “moderate”-sized projects.

Worse: for C++, you don’t win much through dynamic linking because dynamically linking C++ libraries sucks so hard. Templates defined in header files can’t easily be put in shared libraries; ABI variations mean that dynamic libraries generally have to be updated in sync; and duplication across modules is bound to happen (thanks to inlined functions and templates). A single “stuck” or outdated .so might completely break a deployment too, which is a much worse situation than deploying a single binary (either you get a new version or an old one, not a broken service).

yjftsjthsd-h 15 hours ago | parent | next [-]

Can't debug symbols be shipped as separate files?

loeg 5 hours ago | parent | next [-]

Yes, absolutely. Debuginfo doesn't impact .text section distances either way, though.

bregma 11 hours ago | parent | prev | next [-]

The problem is that when a final binary is linked, everything goes into it. Then, after the link step, all the debug information gets stripped out into the separate symbols file. That means at some point during the build the target binary file will contain everything. I cannot, for example, build clang in debug mode on my work machine because I have only 32 GB of memory and the OOM killer comes out during the final link phase.

Of course, separate debug files make no difference at runtime, since only the LOAD segments get loaded (by either the kernel or the dynamic loader, depending). The size of a binary on disk has little to do with the size of a binary in memory.
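To make the disk-versus-memory point concrete, here is a minimal sketch that sums the in-memory size of the PT_LOAD segments in an ELF image. It assumes a 64-bit little-endian ELF and ignores corner cases such as the extended program-header count; the function name `loaded_size` is made up for illustration.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments (from the ELF spec)

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, little-endian


def loaded_size(elf: bytes) -> int:
    """Sum p_memsz over all PT_LOAD segments of a 64-bit little-endian ELF image."""
    fields = EHDR.unpack_from(elf, 0)
    ident = fields[0]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian data.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    total = 0
    for i in range(e_phnum):
        p_type, _flags, _off, _vaddr, _paddr, _filesz, p_memsz, _align = (
            PHDR.unpack_from(elf, e_phoff + i * e_phentsize)
        )
        if p_type == PT_LOAD:
            total += p_memsz  # only loadable segments occupy address space
    return total
```

Debug sections are not covered by any PT_LOAD segment, so they add to the file size without changing this total. For a real binary only the headers need to be read, so slurping the whole file into `bytes` is just a convenience for the sketch.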

jcelerier 9 hours ago | parent [-]

> The problem is that when a final binary is linked everything goes into it

I don't think that's the case on Linux. With -gsplit-dwarf, the debug info is put in separate files at the object-file level, and it is never linked into the binary.

yablak 11 hours ago | parent | prev [-]

Yes, but keeping track of the binary/symbol pairs can be more of a pain. In production, though, this is what's done: given a fault, the matching debug binary can be looked up in a database and loaded into gdb along with the core. You do have to limit certain inlining optimizations in order to have useful tracebacks.

This also requires careful tracking of prod builds and their symbol files... A kind of symbol db.
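One common way to key such a symbol db is the ELF build ID. The sketch below maps a build ID to a debug-file path using the `.build-id/xx/yyyy.debug` layout that GDB searches under its debug directory. Whether the setup described above uses this scheme is an assumption, and `debug_file_path` is a hypothetical helper name.

```python
from pathlib import PurePosixPath

HEX_DIGITS = set("0123456789abcdef")


def debug_file_path(build_id_hex: str, root: str = "/usr/lib/debug") -> PurePosixPath:
    """Map a hex build ID to <root>/.build-id/<first 2 hex digits>/<rest>.debug."""
    bid = build_id_hex.strip().lower()
    if len(bid) < 3 or not set(bid) <= HEX_DIGITS:
        raise ValueError(f"not a usable hex build ID: {build_id_hex!r}")
    # The first byte (two hex digits) becomes the directory name,
    # the remaining digits become the file name.
    return PurePosixPath(root) / ".build-id" / bid[:2] / (bid[2:] + ".debug")
```

Because the build ID is embedded in both the stripped binary and the core's notes, the lookup does not depend on file names or deployment timestamps, which is what makes the pairing tractable.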

tempay 15 hours ago | parent | prev | next [-]

I’ve seen LLVM-dependent builds hit well over 30GB. At that point it started breaking several package managers.

01HNNWZ0MV43FF 15 hours ago | parent | prev [-]

I've hit the same thing in Rust, probably for the same reasons.

Isn't the simple solution to use detached debug files?

I think Windows and Linux both support them. That's how mobile platforms like Android and iOS get useful crash reports out of small binaries: they just upload the stack trace, and some service like Sentry translates that back into source line numbers. (It's easy to do manually too.)
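The translation step is, at its core, a sorted-table lookup. Here is a toy sketch of it: the addresses and file names in `LINE_TABLE` are invented for illustration, and a real symbolizer would read the table from the detached debug file rather than hard-code it.

```python
import bisect

# Hypothetical line table: (start address, source location), sorted by address.
LINE_TABLE = [
    (0x1000, "main.cpp:10"),
    (0x1010, "main.cpp:12"),
    (0x1040, "util.cpp:7"),
    (0x1080, "util.cpp:9"),
]
TABLE_END = 0x10A0  # first address past the last entry (also made up)
STARTS = [start for start, _ in LINE_TABLE]


def symbolize(addr: int) -> str:
    """Return the source location whose address range contains addr, or '??'."""
    if not (STARTS[0] <= addr < TABLE_END):
        return "??"
    # Find the last entry whose start address is <= addr.
    index = bisect.bisect_right(STARTS, addr) - 1
    return LINE_TABLE[index][1]


def symbolize_trace(addresses: list[int]) -> list[str]:
    """Translate a whole uploaded stack trace, frame by frame."""
    return [symbolize(a) for a in addresses]
```

The device only has to ship raw addresses (plus something identifying the build), so the binary it runs can stay stripped.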

I'm surprised the author didn't mention it first. A 25 GB exe might be 1 GB of code and 24 GB of debug crud.

nicoburns 8 hours ago | parent | next [-]

> Isn't the simple solution to use detached debug files?

It should be. But the tooling for this kind of thing (anything to do with executable formats including debug info and also things like linking and cross-compilation) is generally pretty bad.

dwattttt 11 hours ago | parent | prev [-]

> I think Windows and Linux both support them.

Detached debug files have been the default (only?) option in MS's compiler since at least the 90s.

I'm not sure at what point it became hip to do that around Linux.

kvemkon 3 hours ago | parent [-]

Since at least October 2003 on Debian:

[1] "debhelper: support for split debugging symbols"

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=215670

[2] https://salsa.debian.org/debian/debhelper/-/commit/79411de84...