gertlabs 4 days ago

Since programming is increasingly offloaded to LLMs and English is the main way engineers interact with code, it's interesting to see how LLMs reason in different programming languages.

In our benchmarking, we've found LLMs perform comparably across languages for one-shot coding submissions, slightly favoring more popular languages. But if you give frontier LLMs a harness and let them iterate and fix compilation errors, they actually perform significantly better in Rust. That is, they come up with more insightful ideas when developing in Rust than in, for example, JavaScript.

Scroll down to the language comparison chart: https://gertlabs.com/?agentic=agentic

SIRHAMY 3 days ago | parent [-]

Anecdotally, I think AI is quite good at langs with lots of training data: C#, TypeScript, Rust.

I also think it's much better with languages with more guardrails and clear syntax: think expressive types (sum types), brackets, and linters / compile checks.

Rust has expressive types and lots of compile checks that rule out whole classes of bugs via ownership / lifetimes, which I think makes it a very good tool for agents to use.
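
To make the sum types point concrete, here's a minimal sketch. The `BuildResult` type and `next_action` function are made up for illustration and aren't from any benchmark or harness mentioned in this thread. The idea is that a `match` over an enum has to cover every variant, so if an agent adds a variant and forgets a case, it gets a compile error to fix instead of a silent runtime bug.

```rust
// Hypothetical type for illustration only: possible outcomes of one build attempt.
#[derive(Debug)]
enum BuildResult {
    Passed,
    CompileError { line: u32, message: String },
    TestFailure { failed: u32 },
}

// The match must be exhaustive. Deleting any arm below makes the compiler
// reject the program with a "non-exhaustive patterns" error.
fn next_action(result: &BuildResult) -> String {
    match result {
        BuildResult::Passed => "submit".to_string(),
        BuildResult::CompileError { line, message } => {
            format!("fix line {line}: {message}")
        }
        BuildResult::TestFailure { failed } => {
            format!("debug {failed} failing test(s)")
        }
    }
}

fn main() {
    let result = BuildResult::CompileError {
        line: 12,
        message: "mismatched types".to_string(),
    };
    println!("{:?} -> {}", result, next_action(&result));
}
```

In a language without exhaustiveness checks, the equivalent of a forgotten case usually only shows up when that path runs.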

gertlabs 3 days ago | parent [-]

I partially agree, but C++ is the second-best agentic language (of 6 tested)! LLMs are pretty good at reading machine output. My pet theory is that it has more to do with the training data in lower-level languages being of a more interesting algorithmic variety, on average.