Remix.run Logo
recursive 3 days ago

You might have been hearing some of that hate from me. I definitely don't understand COM, but I've had to use it once or twice. It's pretty far outside what I normally work on, which is all high-level garbage collected languages. I don't know if that's even the right dimension to distinguish it. I couldn't figure out how to use COM or what it's purpose was.

The task was some automated jobs doing MS word automation. This all happened about 20 years ago. I never did figure out how to get it to stop leaking memory after a couple days of searching. I think I just had the process restart periodically.

Compared to what I was accustomed to COM seemed weird and just unnecessarily difficult to work with. I was a lot less experienced then, but I haven't touched COM since. I still don't know what the intent of COM is or where it's documented, and nor have I tried to figure it out. But it's colored my impression of COM ever since.

I think there may be a lot of people like me. They had to do some COM thing because it was the only way to accomplish a task, and just didn't understand. They randomly poked it until it kind of worked, and swore never to touch it again.

duped 3 days ago | parent | next [-]

> I still don't know what the intent of COM is

COM is an ABI (application binary interface). You have two programs, compiled in different languages with different memory management strategies, potentially years apart. You want them to communicate. You either

-1 use a Foreign Function Interface (FFI) provided to those languages -2 serialize/deserialize data and send it over some channel like a socket

(2) is how the internet works so we've taken to doing it that way for many different systems, even if they don't need it. (1) is how operating systems work and how the kernel and other subsystems are exposed to user space.

The problem with FFI is that it's pretty barebones. You can move bytes and call functions, but there's no standard way of composing those bytes and function calls into higher level constructs like you use in OOP languages.

COM is a standard for defining that FFI layer using OOP patterns. Programs export objects which have well defined interfaces. There's a root interface all objects implement called "Unknown", and you can find out if an object supports another interface by calling `queryInterface()` with the id of a desired interface (all interfaces have a globally unique ID). You can make sure the object doesn't lose its data out of nowhere by calling `addRef()` to bump its reference count, and `release()` to decrement it (thus removing any ambiguity over memory management, for the most part - see TFA for an example where that fails).

> where it's documented

https://learn.microsoft.com/en-us/windows/win32/com/the-comp...

asveikau 3 days ago | parent [-]

> You have two programs, compiled in different languages with different memory management strategies, potentially years ap

Sometimes they are even the same language. Windows has a few problems that I haven't seen in the Unix world, such as: each DLL potentially having an incompatible implementation of malloc, where allocating using malloc(3) in one DLL then freeing it with free(3) in another being a crash.

snuxoll 11 hours ago | parent | next [-]

> where allocating using malloc(3) in one DLL then freeing it with free(3) in another being a crash.

This can still happen all the time on UNIX systems. glibc's malloc implementation is a fine general purpose allocator, but there's plenty of times where you want to bring in tcmalloc, jemalloc, etc. Of course, you hope that various libraries will resolve to your implementation of choice when the linker wires everything up, but they can opt not to just as easily.

asveikau 9 hours ago | parent [-]

No actually, this doesn't happen the same way on modern Unix. The way symbol resolution works is just not the same. A library asking for an extern called "malloc" will get the same malloc. To use those other allocators, you would typically give them a different symbol name, or make the whole process use the new one.

A dll import on Windows explicitly calls for the DLL by name. You could have some DLLs explicitly ask for a different version of the Visual Studio runtime, or with different threading settings, release vs debug etc., and a C extern asking for simply the name "malloc", no other details, will resolve to that, possibly incompatible with another DLL in the same process despite the compiler's perspective of it just being extern void *malloc(size_t) and no other detail, no other decoration, rename of the symbol etc.. there might be a rarely used symbol versioning pragma to accomplish similar on a modern gcc/clang/elf setup but it's not the way anybody does this.

I would argue that the modern Unix way, with these limitations, is better, by the way. Maybe some older Unix in the early days of shared libraries, early 90s or so, tried what Windows does, I don't know. But it's not common today.

snuxoll 7 hours ago | parent [-]

> No actually, this doesn't happen the same way on modern Unix. The way symbol resolution works is just not the same. A library asking for an extern called "malloc" will get the same malloc. To use those other allocators, you would typically give them a different symbol name, or make the whole process use the new one.

This is, yes, the behavior of both the ELF specification as well as the GNU linker.

I'm not here to get into semantics of symbol namespaces and resolution though, I can just as easily link a static jemalloc into an arbitrary ELF shared object and use it inside for every allocation and not give a damn about what the global malloc() symbol points to. There's a half dozen other ways I can have a local malloc() symbol as well instead of having the linker bind the global one.

Which, is the entire point I'm trying to make. Is this a bigger problem on Windows versus UNIX-like platforms due to the way runtime linker support is handled? Yes. Is it entirely possible to have the same issue, however? Yes, absolutely.

pjmlp 2 days ago | parent | prev [-]

Because C standard library isn't part of the OS.

Outside UNIX, the C standard library is a responsibility of the C compiler vendor, not the OS.

Nowadays Windows might seem the odd one, however 30 years ago the operating system was more diverse.

You will also find similar issues on dynamic libraries in mainframes/micros from IBM and Unisys, still being sold.

asveikau 2 days ago | parent [-]

Yeah I know the reasons for this, I'm just saying it's not usual coming from currently dominant unix-like systems.

asveikau 2 days ago | parent | prev [-]

> I couldn't figure out how to use COM or what it's purpose was.

A shorter version than the other reply:

COM allows you to have a reference counted object with callable virtual methods. You can also ask for different virtual methods at runtime (QueryInterface).

Some of the use cases include: maybe those methods are implemented in a completely different programming language that you are using, for example I think one of the historical ones is JavaScript or vbscript interacting with C++. It standardizes the virtual calls in such a way that you can throw in such an abstraction. And since reference counting happens via a virtual call, memory allocation is also up to the callee. Another historical use case is to have the calls be handled in a different process.