gertlabs 2 hours ago
Surprisingly, LLMs are actually much worse at reasoning in Python than in other common programming languages for agentic coding tasks. Data here: https://gertlabs.com/rankings?mode=agentic_coding
BariumBlue an hour ago
Hah, I was just thinking that Python likely has a vast ocean of training data, but it's likely of lower quality, since much of it is written by beginners and by people who aren't primarily programmers.
isityettime 41 minutes ago
I would love to see how they do with functional languages and especially Lisps here. I've noticed pretty good performance with Emacs Lisp relative to overall model strength, but I haven't used LLMs to write application code in any such languages. It would also be interesting to see how Python compares to other languages in its niche (Ruby, Perl, Raku). Thanks for putting this together! It's interesting.
bushbaba an hour ago
Cool to see my hunch backed by data. Python is a scripting language with OOP bolted on. That means there isn't really the stylistic consistency that other languages have, and things tend to look like PHP: a collection of various scripts that invoke one another.
rossjudson an hour ago
My standard joke here: Q: Say, what does this Python code do? A: Nobody f&%^ing knows.
altmanaltman 18 minutes ago
Hey, they said it had a lot of training data, not necessarily high-quality Python code training data.
ricardo_lien 20 minutes ago
This surprised me, but I can understand it - Python sucks in many ways lol.