reitzensteinm an hour ago
Coding is a verifiable domain, so I think you actually have it backwards on that first point. We can now synthesize Stack Overflow-sized datasets for an arbitrary new language and use them to train LLMs to understand it. It's expensive, of course, but if a new language is genuinely better for LLMs to write and understand, the cost would not be an issue.