Remix clone Hacker News

	▲	lexicality an hour ago
		A lot of the training data is either for Python 2 or just generally very low quality.
	▲	stuaxo an hour ago
		The quality issue doesn't seem unique to Python. I've seen the versioning issue in many languages, wherever libraries change between versions. I don't tend to hit Python 2 issues when using LLMs with it, but I do hit library issues (e.g. Pydantic likes to make changes between versions, as do loads of the libraries used a lot by AI companies).
	▲	prodigycorp an hour ago
		That could be it. I still see LLMs fail a set of static typing challenges that I created a couple of years ago as a benchmark. Google models still fail it. I wonder if the lack of typing in a lot of the training data makes Python harder to reason about?