marliechiller 13 hours ago

One thing I'm wondering about in the LLM age we seem to be entering: is there value in picking up a language like this if there's not going to be a corpus of training data for an LLM to learn from? I'd like to invest the time to learn Gleam, but I treat a language as a tool, a means to an end. More and more I find myself reaching for whatever tool gets the job done most easily, which means the languages LLMs seem to gel with.

thefaux 13 hours ago | parent | next [-]

In the medium to long term, if LLMs are unable to easily learn new languages and remap the knowledge they gained from training on different languages, then they will have failed in their mission of becoming a general intelligence.

victorbjorklund 13 hours ago | parent | prev | next [-]

I feel that was more true 1-2 years ago. These days I find Claude Code writes almost as good (or as bad, depending on your perspective) Elixir code as JavaScript code, and there must be far less Elixir code in the training data.

stanmancan 13 hours ago | parent | next [-]

There's certainly a lot more JS code out there to train on, but the quality of the Elixir code is likely much better overall.

jszymborski 12 hours ago | parent | prev | next [-]

I personally find it much more painful to get valid Rust code that compiles and does what I want than, e.g., valid Python code that runs and does what I want.

dnautics 12 hours ago | parent [-]

I think it's pretty clear that some of the things you'd expect to make an LLM good at a language (like strong typing) actually don't. Other things, like not indirecting your code by jumping to something unexpected, might matter more.

manquer 12 hours ago | parent [-]

If anything, LLMs should be poorer at codegen for static languages, because those languages are more verbose: more tokens to generate, and more of the limited context window spent parsing code.

The real advantage of strongly typed languages for LLMs is that the compiler can catch errors early and give the model fast, automated feedback, so you don't have to.

With weakly typed (and typically interpreted) languages, the model needs to actually run the code, which may be quite slow or simply not realistic.

Simply put, agentic coding loops prefer stronger static analysis capabilities.
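
A toy sketch of that loop in Gleam (deliberately non-compiling; the exact diagnostic wording may differ):

    pub fn double(x: Int) -> Int {
      x * 2
    }

    pub fn main() {
      // gleam build rejects this call before anything runs, with a
      // diagnostic along the lines of "expected Int, found String" that
      // an agent can feed straight back into its next edit.
      double("2")
    }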

dnautics 8 hours ago | parent [-]

Not necessarily. If your refactoring loop requires too many cycles, you'll fall off the attention context window.

Also, some non-static languages have a culture of least surprise in their codebases: it's often possible to guess the types flowing through at the call site. Zero refactoring-feedback cycles beats even one.

agos 12 hours ago | parent | prev [-]

In my daily experience Claude Code writes better Elixir code than JS (React). Surely that has to do with the quality of the training material.

pjm331 10 hours ago | parent [-]

Can't confirm or deny the comparison with JS, but I can second that it writes decent Elixir.

The only problem I've ever had: on maybe three occasions it added a return statement, I assume because of Elixir's syntactic similarity to Ruby.
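
For what it's worth, Gleam sidesteps that exact trap too: there is no return keyword at all, so the last expression of a function body is its value. A minimal sketch:

    pub fn describe(n: Int) -> String {
      // No return keyword exists in Gleam, as in Elixir; the case
      // expression is the final expression, so its value is the
      // function's value.
      case n {
        0 -> "zero"
        _ -> "non-zero"
      }
    }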

aryonoco 7 hours ago | parent [-]

I've found Claude (at least until Opus 4) would routinely fail at writing a bash script. For example, it would end an if block with } instead of fi, or get completely lost with environment variables and subshells.

But those are exactly the mistakes most humans make when writing bash scripts, which is what makes bash scripts inherently flaky.

Ask it to write code in a language with strict types, a “logical” syntax free of tricky gotchas, and a compiler that enforces the rules, and while LLMs struggle to begin with, they eventually produce code that is nearly clean and bug-free. It works much better still if there is an existing codebase where they can observe and learn from established patterns.

On the other hand, ask them to write JavaScript or Python and sure, they fly, but they confidently produce code full of hidden bugs.

The whole “amount of training data” argument is completely overblown. I've seen them do well even with my own made-up DSL: if the rules are logical, and you explain the rules and show the model existing patterns, it can mostly do all right. Conversely, there is so much bad JavaScript and Python code in the training data that I struggle to get them to produce code in my style in those languages.

isodev 10 hours ago | parent | prev | next [-]

Claude reads and writes Gleam just fine. I think that as long as a language's syntax is well documented (with examples) and it has meaningful diagnostics, LLMs can be useful. Gleam has both brilliant docs and diagnostics rivalling Rust's. Gleam is also a very well designed language: not many reserved words, very explicit APIs… all of which also helps LLMs.

Contrast with the likes of Swift: it has been around for years, but it's so bloated and obscure that coding agents (not just humans) have trouble using it fully.
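
As a small illustration of that explicitness, here's a sketch using gleam/string's replace function, whose labelled arguments are part of the stdlib API:

    import gleam/io
    import gleam/string

    pub fn main() {
      // Labelled arguments (in, each, with) make the call site
      // self-describing, which plausibly helps a model as much as
      // it helps a human reader.
      string.replace(in: "www.example.com", each: ".", with: "-")
      |> io.println
    }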

kace91 13 hours ago | parent | prev | next [-]

If you just see a language as a tool, then unless you're self-employed or working in open source, wouldn't the lack of job-market demand for it be the first blocker?

macintux 12 hours ago | parent [-]

If you're fortunate, you can find companies with a passion for good code who use lesser-known languages. Picking Erlang, or Haskell, or OCaml generally filters out candidates who don't share your interest in seeing what can be done outside the mainstream.

epolanski 11 hours ago | parent [-]

It's funny you mention Haskell, because it's one of those languages many people love but can't find a job in, even if they'd halve their salary.

c-hendricks 13 hours ago | parent | prev | next [-]

I hope this isn't the future of "new" languages. Hopefully newer AI tools can actually learn a language, so that they aren't the limiting factor.

bbatha 12 hours ago | parent | next [-]

I'm more interested in what happens when a language is designed specifically for LLMs. When vibe coding, a lot of the code is more verbose than what I'd write normally. Do we drop down the abstraction level because LLMs are just so good at churning out boilerplate?

epolanski 11 hours ago | parent [-]

LLMs are already good at churning out boilerplate, so the next step really is getting them to develop taste and architectural consistency, imho.

dugmartin 12 hours ago | parent | prev | next [-]

I think that as AI tools actually learn languages, functional languages will win out, as they are much easier to reason about.

whimsicalism 13 hours ago | parent | prev | next [-]

It's easy enough for them as soon as there's an RL environment for the language.

positron26 13 hours ago | parent | prev [-]

This is the answer. We need online learning for our own code bases and macros.

perrygeo 6 hours ago | parent | prev | next [-]

The Gleam language, yes all of it, fits in a context window (https://tour.gleam.run/everything/).

I have similar concerns to yours; how well a language works with LLMs is indeed an issue we have to consider. But why assume it's the volume of training data that drives the advantage? Another assumption, equally if not more valid IMO, is that languages with fewer, well-defined, simpler constructs are easier for LLMs to generate.

Languages with sprawling complexity, where edge cases dominate dev time, all but require PBs of training data to be feasible.

Languages that are simple (objectively), with a solid, unwavering mental model, can play to LLMs' strengths and completely leap-frog the competition in accurate code gen.
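
A toy sketch of what that small surface area looks like in practice; nearly everything below is one of a handful of constructs (custom types, case expressions, functions, pipes):

    import gleam/int
    import gleam/io
    import gleam/list

    pub type Shape {
      Circle(radius: Int)
      Square(side: Int)
    }

    fn area(shape: Shape) -> Int {
      // One branching construct: the case expression, which the
      // compiler checks for exhaustiveness.
      case shape {
        Circle(radius) -> 3 * radius * radius  // crude integer pi*r*r
        Square(side) -> side * side
      }
    }

    pub fn main() {
      [Circle(2), Square(3)]
      |> list.map(area)
      |> list.fold(0, int.add)
      |> int.to_string
      |> io.println  // prints "21"
    }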

dragonwriter 13 hours ago | parent | prev | next [-]

It's pretty much the same as in every previous age: not having a community of experience, and the supporting materials such a community produces, has always been a disadvantage for early adopters of a new language. So the first users were people with a particular need the language seemed to address, enough to offset that disadvantage, or people with a particular interest in being in the vanguard.

And those are the people who develop the body of material that later people (and now LLMs) learn from.

armchairhacker 13 hours ago | parent | prev | next [-]

Gleam isn't a very unique language, so what an LLM loses by generalizing from similar languages may be less than what the improved ergonomics gain you, if not now then as LLMs improve.

mikepurvis 13 hours ago | parent [-]

I don't know Gleam at all, so I can't comment on that specifically, but I think everyone has had the experience of a coworker who writes C++ as if it's C, or Python as if it's Java, or whatever else.

A language doesn't have to be unique to have a particular taste associated with its patterns and idioms, and it would be unfortunate if LLM influence suppressed the ability of that new style to develop.

christophilus 11 hours ago | parent | prev | next [-]

I recently built something in Hare (a very niche new language), and Claude Code was helpful. Nowhere near as good as it is with TypeScript, but good enough that I don't see LLMs making the top 5 reasons a language would fail to get adopted.

Hammershaft 13 hours ago | parent | prev | next [-]

This was one of my bigger worries about LLM coding: we might develop path dependence on the largest tools and languages.

kryptiskt 9 hours ago | parent | prev | next [-]

On the other hand, if you write a substantial amount of code in a niche language, the LLMs will pick up your coding style, since your code makes up a sizable chunk of the training corpus for that language.

dnautics 12 hours ago | parent | prev | next [-]

Claude is really good at Elixir. IME it's really, really good with a few "unofficial" tweaks to the language/frameworks, but this could be my bias. The LLM training cutoff was a fear of mine, but I think it's actually the opposite: we know that as few as 250 documents can "poison" an LLM, so I suspect that (for now) a small language with very high quality examples can "poison" LLMs for the better.

epolanski 11 hours ago | parent | prev | next [-]

Of course there is, especially if you believe that LLMs will further improve at reasoning.

timeon 13 hours ago | parent | prev | next [-]

Seems like you're not the target audience for these new languages, and that's OK. But I'd guess there are still many people who want to try new things (even on their own).

ModernMech 13 hours ago | parent | prev | next [-]

Yes, because LLMs don't change the fact that different programming languages have different expressive capabilities. It's easier to say some things in some languages than in others, and that doesn't change when an LLM writes the code, because LLMs have finite context windows and limited attention. If you can express an algorithm in 3,000 lines of code in one language but 30 in another, the more expressive language is still preferable, even if the LLM can spit out the 3,000 lines in a second: a codebase 10-100x larger than it needs to be has real costs that LLMs and agents don't mitigate.

All things being equal, you'd still prefer the right tool for the job. That doesn't mean using Python for everything because it dominates the training set; it means making sure LLMs can write other programming languages equally well before we rely on them too much.
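
A toy version of the point, sketched in Gleam: the whole algorithm (sum the squares of the even numbers in 1..10) fits in a few pipeline steps, so an LLM's limited attention covers all of it at once instead of a page of loop-and-accumulator scaffolding.

    import gleam/int
    import gleam/io
    import gleam/list

    pub fn main() {
      list.range(1, 10)
      |> list.filter(int.is_even)
      |> list.map(fn(n) { n * n })
      |> int.sum
      |> int.to_string
      |> io.println  // prints "220"
    }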

jedbrooke 13 hours ago | parent | prev [-]

Where do you think the corpus of training data comes from?