Insanity 4 days ago

100% this. I tried writing Haskell with LLMs and their performance is worse than with Go.

Although in fairness this was a year ago on GPT 3.5 IIRC

diggan 4 days ago | parent | next [-]

> Although in fairness this was a year ago on GPT 3.5 IIRC

GPT-3.5 was impressive at the time, but today's SOTA models (like GPT-5 Pro) are a night-and-day difference, both in producing better code across a wider range of languages (I mostly do Rust and Clojure; it handles those fine now, but was awful with 3.5) and, more importantly, in following the instructions in your user/system prompts. So it's easier to get higher-quality code from it now, as long as you can put into words what "higher quality code" means for you.

ocharles 4 days ago | parent | prev | next [-]

I write Haskell with Claude Code and it's gotten remarkably good recently. We have some code at work that uses STM to build what is essentially a mutable state machine. I needed to split a state transition apart, and it did an admirable job. I had to intervene once or twice when it was going down a valid but undesirable path. This almost-one-shot performance was already a productivity boost, but the result didn't quite build. What I find most impressive now is that the "fix" here is literally to have Claude run the build and see the errors. While GHC errors are verbose and not always the best, it got everything building in a few more iterations. When it later hit a test failure, I suggested we add a bit more logging - so it logged all state transitions, spotted the unexpected transition, and got the test passing. We really are a LONG way away from 3.5 performance.
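For readers unfamiliar with the pattern being described: a minimal sketch of an STM-backed state machine in Haskell might look like the following. The `DoorState` type, its states, and the `step` transition table are invented for illustration - they are not the actual work code from the comment above.

```haskell
import Control.Concurrent.STM

-- A toy set of states; the real code would have domain-specific ones.
data DoorState = Closed | Open | Locked
  deriving (Eq, Show)

-- The allowed transitions: returns the next state, or Nothing
-- if the requested transition is invalid from the current state.
step :: DoorState -> DoorState -> Maybe DoorState
step Closed Open   = Just Open
step Open   Closed = Just Closed
step Closed Locked = Just Locked
step Locked Closed = Just Closed
step _      _      = Nothing

-- Atomically attempt a transition on shared state; reports whether
-- it was applied. Multiple threads can call this safely.
transition :: TVar DoorState -> DoorState -> IO Bool
transition var next = atomically $ do
  cur <- readTVar var
  case step cur next of
    Just s  -> writeTVar var s >> pure True
    Nothing -> pure False

main :: IO ()
main = do
  door <- newTVarIO Closed
  ok1 <- transition door Open    -- Closed -> Open: allowed
  ok2 <- transition door Locked  -- Open -> Locked: rejected
  end <- readTVarIO door
  print (ok1, ok2, end)          -- prints (True,False,Open)
```

Keeping the transition table as a pure function like `step` is what makes "split a state transition apart" a fairly mechanical refactor - the STM wrapper stays unchanged.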

r_lee 4 days ago | parent | prev | next [-]

I'm not sure I'd say "100% this" if I was talking about GPT 3.5...

verelo 4 days ago | parent | next [-]

Yeah, 3.5 was good when it came out, but frankly anyone reviewing AI for coding who isn't using Sonnet 4.1, GPT-5, or an equivalent really isn't aware of what they've missed out on.

Insanity 4 days ago | parent | prev [-]

Yah, that’s a fair point. I had assumed it’d remain relatively similar, given that the training data would be smaller for languages like Haskell versus languages like Python & JavaScript.

danielbln 4 days ago | parent | prev | next [-]

Post-training in all frontier models has improved significantly with respect to programming language support. Take Elixir, which LLMs could barely handle a year ago, but now support has gotten really good.

computerex 4 days ago | parent | prev | next [-]

3.5 was a joke in coding compared to sonnet 4.

Insanity 4 days ago | parent | next [-]

Yup, fair point, it’s been some time. Although vibe coding is still more “miss” than “hit” for me.

pizza 4 days ago | parent | prev [-]

It's so thrilling that this is actually true in just a year

johnisgood 4 days ago | parent | prev [-]

I wrote some Haskell using Claude. It was great.