bigstrat2003 3 days ago

> it's easy to blame the use of LLM to find the interface, but really this is a matter of needing to understand the COM calling conventions in order to interact with it.

Sure, but I think that this perfectly illustrates why LLMs are not good at programming (and may well never get good): they don't actually understand anything. An LLM is fundamentally incapable of going "this is COM, so let me make sure that the function signature matches the calling conventions"; it just generates something based on the code it has seen before.
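To make the calling-convention point concrete, here is a minimal sketch in plain Rust (no Windows APIs; the `ICounter` interface and its vtable are simplified stand-ins, not real COM types). COM methods are raw function pointers in a vtable, declared with the `system` ABI, taking the interface pointer as an explicit first argument and returning an `HRESULT`. A generated signature that drops the `this` parameter or the ABI annotation can still compile, which is exactly the kind of error that slips through:

```rust
// Simplified stand-in for a COM vtable. Real COM code would use the
// windows / windows-core crates, but the ABI rules are the same.
type Hresult = i32;
const S_OK: Hresult = 0;

#[repr(C)]
struct ICounterVtbl {
    // Every COM method takes the interface pointer ("this") as its
    // explicit first argument and uses the `system` ABI (stdcall on
    // 32-bit Windows, the plain C ABI elsewhere). Omitting either
    // still compiles, but corrupts the call at runtime on Windows.
    get_count: unsafe extern "system" fn(this: *mut ICounter, out: *mut u32) -> Hresult,
}

#[repr(C)]
struct ICounter {
    vtbl: *const ICounterVtbl, // first field must be the vtable pointer
    count: u32,                // implementation state follows it
}

unsafe extern "system" fn get_count(this: *mut ICounter, out: *mut u32) -> Hresult {
    unsafe { *out = (*this).count };
    S_OK
}

static VTBL: ICounterVtbl = ICounterVtbl { get_count };

fn main() {
    let mut obj = ICounter { vtbl: &VTBL, count: 42 };
    let mut n = 0u32;
    // Call through the vtable, the way a COM client would.
    let hr = unsafe { ((*obj.vtbl).get_count)(&mut obj, &mut n) };
    assert_eq!(hr, S_OK);
    println!("count = {}", n);
}
```

The layout (`#[repr(C)]`, vtable pointer first) is what lets the same object be called from any COM-aware language; nothing in the Rust type system checks that the declared signature matches what the component actually exports, which is why a plausible-looking LLM-generated signature is dangerous.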

I don't blame the authors for reaching for an LLM given that Microsoft has removed the C++ example code (seriously, what's up with that nonsense?). But it does very nicely highlight why LLMs are such a bad tool.

omneity 2 days ago | parent | next [-]

You might actually get that desired behavior through reasoning, or if the model was reinforced on coding workflows involving COM, or trained on enough stack diversity that it needed to develop this capability.

In the case of LLMs with reasoning, they might pull this off because reasoning is in effect a search for extra considerations that improve performance on the task. During reasoning training a verifier scores that search, and the LLM learns to emulate it at inference time, hence the improved performance.

As for RL coding training, the distinction can be blurry since reasoning is also trained with RL, but coding models specifically also discover additional considerations, or even recipes, through self-play against a code execution environment. If that environment includes COM and the training data has COM-related tasks, then the process has a chance to discover the behavior you describe and reinforce it during training, increasing its likelihood during actual coding.

LLMs are not really just autocomplete engines. Perhaps the first few layers, or base models, can be seen that way, but as you introduce instruction and reinforcement tuning, LLMs build progressively higher levels of conceptual abstraction, from words to sentences to tasks, much as CNNs learn basic geometric features and then compose those into face parts and so on.

piker 3 days ago | parent | prev [-]

In defense of the LLM here: learning COM from scratch, given its lack of accessible documentation, would have forced us to reach for C# for this minor project.

The LLM gave us an initial boost of productivity and (false) confidence that enabled us to get at the problem with Rust. While the LLM's output was flawed, using it did actually cause us to learn a lot about COM by letting us get started at all. That somewhat flies in the face of a lot of the "tech debt" criticisms levied at LLMs (including by me). Yes, we accumulated a bit of debt while working on the project, but in this case we were able to pay it off before shipping, and it gave us the leverage we needed to approach this problem in pure Rust.