thorum 6 hours ago
Developed by Jordan Hubbard of NVIDIA (and FreeBSD). My understanding/experience is that LLM performance in a language scales with how well the language is represented in the training data. From that assumption, we might expect LLMs to actually do better with an existing language for which more training code is available, even if that language is more complex and seems like it should be “harder” to understand.
adastra22 5 hours ago
I don’t think that assumption holds. For example, only recently have agents started getting Rust code right on the first try, but that hasn’t mattered in the past, because the Rust compiler and linters give such good feedback that the agent immediately fixes whatever goof it made. This does fill up context a little faster, but (1) not as much as debugging the same problem would in a dynamic language, and (2) better agentic frameworks are coming that “rewrite” context history for dynamic, on-the-fly context compression.
vessenes 6 hours ago
A lot of this depends on your workflow. A language with strong typing, good type checking, and good compiler errors will work better in a loop than one with a large syntactic surface area, even if the latter is well represented in the training data. This is the instinct behind, e.g., https://github.com/toon-format/toon, a JSON-alternative format. They test LLM accuracy with the format against JSON, and it generally comes out slightly ahead. Additionally, just the ability to put an entire language into context for an LLM — a single document explaining everything — is also likely to close the gap. I was skimming some nano files, and while I can’t say I loved how it looked, it did look extremely clear. Likely a benefit.
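For a concrete sense of the tradeoff being described, here is a small sketch comparing a JSON payload against a hand-written TOON-style rendering. The TOON string follows the tabular style shown in the toon-format/toon README; treat the exact syntax as an assumption rather than a spec.

```python
import json

# One small payload rendered two ways. The TOON string is hand-written
# to match the tabular style in the toon-format/toon README; the exact
# syntax here is an assumption, not authoritative.
data = {"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]}

as_json = json.dumps(data)
as_toon = "users[2]{id,name}:\n  1,Alice\n  2,Bob"

# TOON declares the schema once ({id,name}) and drops per-row braces,
# quotes, and keys, which is where the token savings come from.
print(len(as_json), len(as_toon))
```

The character counts are only a proxy for token counts, but the direction of the saving is the same: the repeated per-object keys and punctuation in JSON are exactly what a tabular format amortizes away.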
nl 2 hours ago
> My understanding/experience is that LLM performance in a language scales with how well the language is represented in the training data.

This isn’t really true. LLMs understand grammars really, really well. If you have a grammar for your language, the LLM can one-shot perfect code. What they don’t know is the tooling around the language. But again, that is pretty easily fixed: they are good at exploring CLI tools.
Zigurd 6 hours ago
It's not just how well the language is represented. Obscure-ish APIs can trip up LLMs too. I've been using Antigravity for a Flutter project that uses ATProto. Gemini is very strong at Dart coding, which makes picking up my 17th managed language a breeze. It's also very good at Flutter UI elements. It was noticeably less good at ATProto and its Dart API. The characteristics of the failures have been interesting: as I anticipated, an over-ambitious refactoring was a train wreck, easily reverted. But something as simple as regenerating Android launcher icons in a Flutter project was a total blind spot. I had to Google that like some kind of naked savage running through the jungle.
nxobject 6 hours ago
I think it's depressingly true of any novel language/framework at this point, especially if it has novel ideas.
NewsaHackO 3 hours ago
I wonder if there is a way to create a sort of 'transpilation' layer from existing languages to a new language like this, so that it could benefit from all of the available training data in other languages. Something like AST-to-AST. Though I wonder if it would only work in the initial training or fine-tuning stage.
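An AST-to-AST mapping of that kind can be sketched in a few lines. Here Python's own `ast` module plays the source side, and the target surface syntax (`fn name(args) { ... }`) is purely hypothetical — a stand-in for whatever the new language looks like.

```python
import ast

def emit(node):
    """Re-emit a tiny subset of the Python AST in a hypothetical target syntax."""
    if isinstance(node, ast.Module):
        return "\n".join(emit(stmt) for stmt in node.body)
    if isinstance(node, ast.FunctionDef):
        args = ", ".join(a.arg for a in node.args.args)
        body = "\n".join("  " + emit(stmt) for stmt in node.body)
        return f"fn {node.name}({args}) {{\n{body}\n}}"
    if isinstance(node, ast.Return):
        # Lean on ast.unparse for expressions; a real transpiler would
        # map these node by node as well.
        return f"return {ast.unparse(node.value)}"
    raise NotImplementedError(type(node).__name__)

print(emit(ast.parse("def add(a, b):\n    return a + b")))
# fn add(a, b) {
#   return a + b
# }
```

The hard part, as the comment suggests, is less the syntax mapping than deciding where it plugs in: rewriting a training corpus this way only helps during pretraining or fine-tuning, not at inference time.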
cmrdporcupine 6 hours ago
Not my experience, honestly. With a good code base for it to explore, good tooling, and a really good prompt, I've had excellent results with frankly quite obscure things, including homegrown languages. As others said, the key is feedback and prompting. With a long-context model, it'll figure it out.
whimsicalism 6 hours ago
easy enough to solve with RL, probably