ori_b 19 hours ago

To quote a friend: "Glibc is a waste of a perfectly good stable kernel ABI."

derefr 15 hours ago | parent | next [-]

Kind of funny to realize, the NT kernel ABI isn’t even all that stable itself; it is just wrapped in a set of very stable userland exposures (Win32, UWP, etc.), and it’s those exposures that Windows executables are relying on. A theoretical Windows PE binary that was 100% statically linked (and so directly contained NT syscalls) wouldn’t be at all portable between different Windows versions.

Linux with glibc is the complete opposite; there really does exist old Linux software that static-links in everything down to libc, just interacting with the kernel through syscalls—and it does (almost always) still work to run such software on a modern Linux, even when the software is 10-20 years old.

I guess this is why Linux containers are such a thing: you’re taking a dynamically-linked Linux binary and pinning it to a particular entire userland, such that when you run the old software, it calls into the old glibc. Containers work, because they ultimately ground out in the same set of stable kernel ABI calls.

(Which, now that I think of it, makes me wonder how exactly Windows containers work. I’m guessing each one brings its own NTOSKRNL, that gets spun up under HyperV if the host kernel ABI doesn’t match the guest?)

easton 13 hours ago | parent | next [-]

IIRC, Windows containers require that the container be built with a base image that matches the host for it to work at all (like, the exact build of Windows has to match). Guessing that’s how they get a ‘stable ABI’.

…actually, looks like it’s a bit looser these days. Version matrix incoming: https://learn.microsoft.com/en-us/virtualization/windowscont...

my123 11 hours ago | parent [-]

The ABI was stabilised for backwards compatibility since Windows Server 2022, but is not stable for earlier releases.

senfiaj 15 hours ago | parent | prev | next [-]

> Kind of funny to realize, the NT kernel ABI isn’t even all that stable itself

This is not a big problem if it's hard or unlikely enough to write code that accidentally relies on raw syscalls. At least MS's dev tooling doesn't provide an easy way to bypass the standard DLLs.

> makes me wonder how exactly Windows containers work

I guess containers do the syscalls through the standard Windows DLLs like any regular userspace application. If it's a Linux container on Windows, probably the WSL syscalls, which I guess, are stable.

sedatk 12 hours ago | parent | prev | next [-]

> NT kernel ABI isn’t even all that stable itself

Can you give an example where a breaking change was introduced in NT kernel ABI?

andrewf 9 hours ago | parent | next [-]

https://j00ru.vexillium.org/syscalls/nt/64/

(One example: hit "Show" on the table header for Win11, then use the form at the top of the page to highlight syscall 8c)

sedatk 8 hours ago | parent [-]

Changes in syscall numbers aren't necessarily breaking changes as you're supposed to use ntdll.dll to call kernel, not direct syscalls.

LeFantome 3 hours ago | parent | next [-]

That was his point exactly.

sedatk 2 hours ago | parent [-]

Nope. https://news.ycombinator.com/item?id=46440942

6 hours ago | parent | prev [-]
[deleted]
mrpippy 9 hours ago | parent | prev [-]

The syscall numbers change with every release: https://j00ru.vexillium.org/syscalls/nt/64/

sedatk 8 hours ago | parent [-]

Syscall numbers shouldn't be a problem if you link against ntdll.dll.

immibis an hour ago | parent | next [-]

So now you're talking about the ntdll.dll ABI instead of the kernel ABI. ntdll.dll is not the kernel.

MangoToupe 8 hours ago | parent | prev [-]

...isn't that the point of this entire subthread? The kernel itself doesn't provide the stable ABI, userland code that the binary links to does.

sedatk 7 hours ago | parent [-]

No. On NT, kernel ABI isn't defined by the syscalls but NTDLL. Win32 and all other APIs are wrappers on top of NTDLL, not syscalls. Syscalls are how NTDLL implements kernel calls behind the scenes, it's an implementation detail. Original point of the thread was about Win32, UWP and other APIs that build a new layer on top of NTDLL.

I argue that NT doesn't break its kernel ABI.
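
A toy model of the layering described here may help. Everything in it is invented for illustration: the syscall names, the numbers, and the two "builds" are hypothetical and are not taken from the j00ru tables. It only shows why a binary that hard-codes numbers breaks when they are renumbered, while one that goes through a per-release user-mode shim (the role ntdll.dll plays) keeps working.

```python
# Hypothetical syscall tables for two OS builds. The numbers and names are
# invented; the only point is that build B renumbers the same operations.
KERNEL_BUILD_A = {0x10: "open_file", 0x11: "close_handle"}
KERNEL_BUILD_B = {0x10: "close_handle", 0x12: "open_file"}


def make_shim(kernel_table):
    """Build the per-release user-mode shim: stable name -> this build's number."""
    return {name: number for number, name in kernel_table.items()}


def call_by_number(kernel_table, number):
    """What a binary with a hard-coded syscall number ends up invoking."""
    return kernel_table.get(number, "invalid syscall")


def call_by_name(kernel_table, name):
    """What a binary linked against the shim does: resolve the number at run time."""
    shim = make_shim(kernel_table)
    return call_by_number(kernel_table, shim[name])
```

Here `call_by_name(table, "open_file")` reaches the same operation on both builds, while `call_by_number(table, 0x10)` silently reaches a different one on build B. Whether the stable layer counts as "the kernel ABI" is exactly what the two sides above disagree on; the sketch takes no position.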

roytam87 3 hours ago | parent | next [-]

NTDLL APIs are very stable[0] and you can even compile and run x86 programs targeting NT 3.1 Build 340[1] which will still work on win11.

[0] as long as you don't use APIs they decided to add and remove in a very short period (longer read: https://virtuallyfun.com/2009/09/28/microsoft-fortran-powers...)

[1] https://github.com/roytam1/ntldd/releases/tag/v250831

KerrAvon 7 hours ago | parent | prev [-]

macOS and iOS too — syscalls aren’t stable at all, you’re expected to link through shared library interfaces.

dist-epoch 13 hours ago | parent | prev | next [-]

Apparently there are 3 kinds of Windows containers, one using HyperV, and the others sharing the kernel (like Linux containers)

https://thomasvanlaere.com/posts/2021/06/exploring-windows-c...

Zardoz84 15 hours ago | parent | prev [-]

Docker on Windows isn't simply a glorified virtual machine running Linux, a.k.a. Linux subsystem v2?

microtonal 16 hours ago | parent | prev | next [-]

At least glibc uses versioned symbols. Hundreds of other widely-used open source libraries don't.

ok123456 15 hours ago | parent | next [-]

Versioned glibc symbols are part of the reason that binaries aren't portable across Linux distributions and time.

ben-schaaf 14 hours ago | parent [-]

Only because people aren't putting in the effort to build their binaries properly. You need to link against the oldest glibc version that has all the symbols you need, and then your binary will actually work everywhere(*).

* Except for non-glibc distributions of course.
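
One way to check what a finished binary actually demands is to collect the GLIBC_x.y version tags from its dynamic symbol table (for example from the output of `objdump -T`) and take the highest one. A minimal sketch follows; the function name `min_glibc_required` and the sample text are made up for illustration, and the sample is only shaped like objdump output, not captured from a real binary.

```python
import re
from typing import Optional, Tuple

# Matches version tags such as "GLIBC_2.14" or "GLIBC_2.2.5".
# Tags without a number (e.g. GLIBC_PRIVATE) are deliberately ignored.
_GLIBC_TAG = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def min_glibc_required(symbol_dump: str) -> Optional[Tuple[int, ...]]:
    """Return the highest GLIBC_x.y tag in the text, or None if there is none.

    The highest tag a binary references is the oldest glibc it can run on.
    """
    versions = [
        tuple(int(part) for part in match.group(1).split("."))
        for match in _GLIBC_TAG.finditer(symbol_dump)
    ]
    return max(versions) if versions else None


# Illustrative input only: three undefined symbols with their version tags.
SAMPLE_DUMP = """
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) printf
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.14)  memcpy
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.4)   __stack_chk_fail
"""
```

Versions are compared as integer tuples rather than strings, so 2.14 correctly sorts above 2.4 and 2.2.5.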

chrismorgan 15 minutes ago | parent | next [-]

I don’t understand why this is the case, and would like to understand. If I want only functions f1 and f2 which were introduced in glibc versions v1 and v2, why do I have to build with v2 rather than v3? Shouldn’t the symbols be named something like glibc_v1_f1 and glibc_v2_f2 regardless of whether you’re compiling against glibc v2 or glibc v3? If it is instead something like “compiling against vN uses symbols glibc_vN_f1 and glibc_vN_f2” combined with glibc v3 providing glibc_v1_f1, glibc_v2_f1, glibc_v3_f1, glibc_v2_f2 and glibc_v3_f2… why would it be that way?
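
A toy model of the usual explanation, using the f1/f2 naming from the question. It assumes a hypothetical scenario in which f1 gained a second, incompatible variant in v3. Each variant is tagged with the release that introduced it, the link step records the default (newest) variant exported by the libc you build against, and the load step requires that exact variant to exist. This is a simplification for illustration, not how ld or ld.so are implemented.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]

# Hypothetical libc releases: function name -> versions it exports.
# The last entry in each list is the default that new builds bind to.
LIBC_V2: Dict[str, List[Version]] = {"f1": [(1,)], "f2": [(2,)]}
LIBC_V3: Dict[str, List[Version]] = {"f1": [(1,), (3,)], "f2": [(2,)]}


def link(calls: List[str], libc: Dict[str, List[Version]]) -> Dict[str, Version]:
    """Record, for each called function, the default version in the build-time libc."""
    return {name: libc[name][-1] for name in calls}


def loads(binary: Dict[str, Version], libc: Dict[str, List[Version]]) -> bool:
    """A binary loads only if the run-time libc exports every version it recorded."""
    return all(version in libc.get(name, []) for name, version in binary.items())


built_against_v2 = link(["f1", "f2"], LIBC_V2)  # records f1@v1, f2@v2
built_against_v3 = link(["f1", "f2"], LIBC_V3)  # records f1@v3, f2@v2
```

In this model f2 is recorded as f2@v2 either way, which matches the naming scheme the question hopes for. The difference comes only from f1: building against v3 picks up f1@v3, which a v2 system does not export, while the binary built against v2 loads on both.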

LegionMammal978 9 hours ago | parent | prev | next [-]

But to link against an old glibc version, you need to compile on an old distro, on a VM. And you'll have a rough time if some part of the build depends on a tool too new for your VM. It would be infinitely simpler if one could simply 'cross-compile' down to older symbol versions, but the tooling does not make this easy at all.

jhasse 2 hours ago | parent | next [-]

It's actually doable without an old glibc as it was done by the Autopackage project: https://github.com/DeaDBeeF-Player/apbuild

That never took off, though; containers are easier. With distrobox and other tools this is quite easy, too.

nineteen999 3 hours ago | parent | prev [-]

Huh? Bullshit. You could totally compile and link in a container.

LeFantome 3 hours ago | parent [-]

Ok, so you agree with him except where he says “in a VM” because you say you can also do it “in a container”.

Of course, you both leave out that you could do it “on real hardware”.

But none of this matters. The real point is that you have to compile on an old distro. If he left out “in a VM”, you would have had nothing to correct.

nineteen999 3 hours ago | parent [-]

I'm not disagreeing that glibc symbol versioning could be better. I raised it because this is probably one of the few valid use cases for containers where they would have a large advantage over a heavyweight VM.

But it's like complaining that you might need a VM or container to compile your software for Win16 or Win32s. Nobody is using those anymore. Nor really old Linux distributions. And if they do, they're not really going to complain about having to use a VM or container.

As a C/C++ programmer, the thing I notice is ... the people who complain about this most loudly are the web dev crowd who don't speak C/C++, when some ancient game doesn't work on their obscure Arch/Gentoo/Ubuntu distribution and they don't know how to fix it. Boo hoo.

But they'll happily take a paycheck for writing a bunch of shit Go/Ruby/PHP code that runs on Linux 24/7 without downtime - not because of the quality of their code, but due to the reliability of the platform at _that_ particular task. Go figure.

Rohansi 39 minutes ago | parent [-]

> But they'll happily take a paycheck for writing a bunch of shit Go/Ruby/PHP code that runs on Linux 24/7 without downtime - not because of the quality of their code, but due to the reliability of the platform at _that_ particular task.

But does the lack of a stable ABI have any (negative) effect on the reliability of the platform?

TUSF 39 minutes ago | parent | prev | next [-]

> You need to link against the oldest glibc version that has all the symbols you need

Or at least the oldest one made before glibc's latest backwards incompatible ABI break.

ok123456 14 hours ago | parent | prev | next [-]

If it requires effort to be correct, that's a bad design.

Why doesn't glibc use the version tag to do the appropriate mapping?

mikkupikku 12 hours ago | parent [-]

I think even calling it a "design" is dubious. It's an attribute of these systems that arose out of the circumstance, nobody ever sat down and said it should be this way. Even Torvalds complaining about it doesn't mean it gets fixed, it's not analogous to Steve Jobs complaining about a thing because Torvalds is only in charge of one piece of the puzzle, and the whole image that emerges from all these different groups only loosely collaborating with each other isn't going to be anybody's ideal.

In other words, the Linux desktop as a whole is a Bazaar, not Cathedral.

forrestthewoods 6 hours ago | parent | prev [-]

> Only because people aren't putting in the effort to build their binaries properly.

Because Linux userland is an unmitigated clusterfuck of bad design that makes this really really really hard.

GCC/Clang and Glibc make it almost impossible to do this on their own. The only ways you can actually do this are:

1. create a userland container from the past
2. use Zig, which moved oceans and mountains to make it somewhat tractable

It's awful.

grishka 11 hours ago | parent | prev | next [-]

Yeah and nothing ever lets you pick which versions to link to. You're going to get the latest ones and you better enjoy that. I found it out the hard way recently when I just wanted to do a perfectly normal thing of distributing precompiled binaries for my project. Ended up using whatever "Amazon Linux" is because it uses an old enough glibc but has a new enough gcc.

jhasse 2 hours ago | parent [-]

You can choose the version. There was apgcc from the (now dead) Autopackage project which did just that: https://github.com/DeaDBeeF-Player/apbuild

afishhh 14 hours ago | parent | prev [-]

> Hundreds of other widely-used open source libraries don't.

Correct me if I'm wrong but I don't think versioned symbols are a thing on Windows (i.e. they are non-portable). This is not a problem for glibc but it is very much a problem for a lot of open source libraries (which instead tend to just provide a stable C ABI if they care).

Const-me 11 hours ago | parent [-]

> versioned symbols are a thing on Windows

There’re quite a few mechanics they use for that. The oldest one: call a special API function on startup, like InitCommonControlsEx, and other API functions will resolve to different DLLs or behave differently. A similar tactic: require an SDK-defined magic number as a parameter to some initialization functions, with different magic numbers switching symbols from the same library; examples are WSAStartup and MFStartup.

Around Win2k they did side-by-side assemblies, or WinSxS. Include a special XML manifest as an embedded resource of your EXE, and you can request a specific version of a dependent API DLL. The OS now keeps multiple versions internally.

Then there’re compatibility mechanics, both OS builtin and user controllable (right click on EXE or LNK, compatibility tab). The compatibility mode is yet another way to control versions of DLLs used by the application.

Pretty sure there’s more and I forgot something.

cesarb 10 hours ago | parent | next [-]

> There’re quite a few mechanics they use for that. The oldest one, call a special API function on startup [...]

Isn't the oldest one... to have the API/ABI version in the name of your DLL? Unlike Linux, which by default uses a flat namespace, in Windows land imports are nearly always identified by a pair of the DLL name and the symbol name (or ordinal). You can even have multiple C runtimes (MSVCR71.DLL, MSVCR80.DLL, etc) linked together but working independently in the same executable.

8 hours ago | parent | prev [-]
[deleted]
bsimpson 9 hours ago | parent | prev | next [-]

I only learned about glibc earlier today, when I was trying to figure out why the Nix version of a game crashes on SteamOS unless you unset some environ vars.

Turns out that Nix is built against a different version of glibc than SteamOS, and for some reason, that matters. You have to make sure none of Steam's libraries are on the path before the Nix code will run. It seems impractical to expect every piece of software on your computer to be built against a specific version of a specific library, but I guess that's Linux for you.
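
A minimal sketch of that workaround as a launcher script. The comment does not say which variables had to be unset, so this assumes the usual dynamic-loader overrides, LD_LIBRARY_PATH and LD_PRELOAD; the helper names `scrubbed_env` and `run_clean` are hypothetical.

```python
import os
import subprocess
from typing import Dict, Iterable, List, Mapping

# Assumption: these are the variables that pull another runtime's libraries
# into the process. The comment above does not name the ones it unset.
LOADER_VARS = ("LD_LIBRARY_PATH", "LD_PRELOAD")


def scrubbed_env(env: Mapping[str, str], drop: Iterable[str] = LOADER_VARS) -> Dict[str, str]:
    """Return a copy of env without the dynamic-loader override variables."""
    dropped = set(drop)
    return {key: value for key, value in env.items() if key not in dropped}


def run_clean(argv: List[str]) -> int:
    """Run argv with the scrubbed environment and return its exit code."""
    return subprocess.run(argv, env=scrubbed_env(os.environ)).returncode
```

Passing an explicit `env` to `subprocess.run` replaces the child's environment entirely, so the parent shell is left untouched and only the launched program sees the cleaned variables.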

Imustaskforhelp 14 hours ago | parent | prev [-]

Ask your friend if he would CC0 the quote or similar (not sure if that's possible). I can imagine this being a quote on T-shirts xD

Honestly I might buy a T-shirt with such a quote.

I think glibc is such a pain that it is the reason we have such vastly different package management approaches. I feel like non-glibc alternatives would really simplify package management on Linux. Although that feels solved, there are definitely still issues with the approach, and I think we should all keep looking for ways to solve the problem.

seba_dos1 10 hours ago | parent [-]

Non-glibc distros (musl, uclibc...) with package managers have been a thing for ages already.

nineteen999 3 hours ago | parent [-]

And they basically hold under 0.01% of Linux marketshare and are completely shit.