robviren 2 hours ago

For me it is an open question whether the "purity" of coding training data matters. Python has Go beat on volume, but that volume includes a ton of API changes, language changes, etc. Is that free regularization, or does it poison the dataset? As the author points out, Go code is nominal because basically all published Go code looks the same, and the library APIs are frozen in time to some degree.

I actually spent some time trying to get to the bottom of what a logical extension of this would be: an entirely made-up spec for an idealized language that the model has never seen, and therefore has no bad examples of. Go is likely the closest thing to that, for the many reasons people call it boring.