Remix.run Logo
jchw 7 hours ago

Me as a kid thought this would be a great idea, and started implementing a PE binfmt. I actually did make a rudimentary PE binfmt, though it started to occur to me how different Windows and Linux really were as I progressed.

For example, with ELF/UNIX, the basic ELF binfmt is barely any more complex than what you'd probably expect the a.out binfmt to be: it maps sections into memory and then executes. Dynamic linking isn't implemented; instead, similar to the interpreter of a shell script, an ELF binary can have an interpreter (PT_INTERP) which is loaded in lieu of the actual binary. This way, the PT_INTERP can be set to the well-known path of the dynamic linker of your libc, which itself is a static ELF binary. It is executed with the appropriate arguments loaded onto the stack and the dynamic linker starts loading the actual binary and its dependencies.

Windows is totally different here. I mean, as far as I know, the dynamic linker is still in userland, known as the Windows Loader. However, the barrier between the userland and kernel land is not stable for Windows NT. Syscall numbers can change during major updates. And, sometimes, implementation details are split between the kernel and userland. Now, in order to be able to contribute to Wine and other projects, I've had to be very careful how I discover how Windows internals works, often by reading other's writings and doing careful black box analysis (for some of this I have work I can show to show how I figured it out.) But for example, the PEB/TIB structures that store information about processes/threads seems to be something that both the userland and kernel components both read and modify. For dynamic linking in particular, there are some linked lists in the PEB that store the modules loaded into the process, and I believe these are used by both the Windows loader and the kernel in some cases.

The Windows NT kernel also just takes on a lot more responsibilities. For example, input. I can literally identify some of the syscalls that go into input handling and observe how they change behavior depending on the last result of PeekMessage. The kernel also appears to be the part of the system that handles event coalescing and priority. It's nothing absurd (the Wine project has already figured out how a lot of this works) but it is a Huge difference from Linux where there's no concept of "messages" and probably shouldn't be.

So the equivalent of the Windows NT kernel services would often be more appropriate to put in userland on Linux anyways, and Wine already does that.

It would still be interesting to attempt to get a Windows XP userland to boot directly on a Linux kernel, but I don't think you'd ever end up with anything that could ever be upstreamed :)

Maybe we should do the PE binfmt though. I am no longer a fan of ELF with it's symbol conflicts and whatnot. Let's make Linux PE-based so we can finally get icons for binaries without needing to append a filesystem to the end of it :)

direwolf20 7 hours ago | parent | next [-]

You can already use binfmt_misc to instruct the kernel to execute PE binaries with Wine.

jchw 6 hours ago | parent [-]

I mean something a bit different. I mean using PE binaries to store Linux programs, no Wine loader.

Of course, this is a little silly. It would require massively rethinking many aspects of the Linux userland, like the libc design. However, I honestly would be OK with this future. I don't really care that much for ELF or its consequences, and there are PE32+ binaries all over the place anyways, so may as well embrace it. Linux itself is often a PE32+ binary, for the sake of EFI stub/UKI.

(You could also implement this with binfmt_misc, sure, but then you'd still need at least an ELF binary for the init binary and/or a loader.)

(edit: But like I said, it's a little silly. It breaks all kinds of shit. Symbol interposition stops working. The libdl API breaks. You can't LD_PRELOAD. The libpthread trick to make errno a thread local breaks. Etc, etc.)

direwolf20 5 hours ago | parent [-]

Wine has no problem loading Linux programs in PE format. It doesn't enforce that you actually call any Windows functions and it doesn't stop you making Linux system calls directly.

jchw 4 hours ago | parent [-]

Well yes, but you'd be spawning a wineserver and running wineboot and all kinds of baggage on top, all for the very simple task of mapping and executing a PE binary, and of course you would still wind up needing ELF... for the Wine loader and all of the dependencies that it has (like a libc, though you could maybe use a statically-linked musl or something to try to minimize it.)

Meanwhile the actual process of loading a PE binary is relatively trivial. It's trivial enough that it has been implemented numerous times in different forms by many people. Hell, I've done it numerous times myself, once for game hacking and once in pure Go[1] as a stubborn workaround for another problem.

Importing an entire Wine install, or even putting the effort into stripping Wine down for this purpose, seems silly.

But I suppose the entire premise is a little silly to begin with, so I guess it's not that unreasonable, it's just not what I am imagining. I'm imagining a Linux userland with simply no ELF at all.

[1]: https://github.com/jchv/go-winloader - though it doesn't do linking recursively, since for this particular problem simply calling LoadLibrary is good enough.

saghm 2 hours ago | parent | prev [-]

I recently learned that Windows binaries contain metadata for what version they are (among other things, presumably). I was discussing in-progress work on making a mod manager for a popular game work on Linux with the author of the tool, and they mentioned that one of the things that surprised them was not being able to rely on inspection of a native library used by most mods to determine what version they had installed on Linux like they could on Windows. It had never occurred to them that this wasn't a first-class feature of Linux binary formats, and I was equally surprised to find out that it was a thing on Windows given that I haven't regularly used Windows since before I really had much of a concept of what "metadata in a binary format" would even mean.

1718627440 14 minutes ago | parent [-]

Are you talking about the "Linux version" it targets or the version of the library? If its the latter, then it is the case, that versioning works per symbol instead of per library, so that a newer library can still contain the old symbols. If you want the latest version a library implements, you could search all symbols and look for the newest symbol version.

If you want it the other way around you could look at the newest symbol the library wants.