D-Machine 5 hours ago
This article is great, and the blog-article headline is interesting, but wrong. LLMs don't, as a rule, write plausible code either. They just write code that is semantically similar to code clusters seen in their training data, and that hasn't been fenced off by RLHF / RLVR. This isn't hard to remember, and it's a correct-enough simplification of what generative LLMs actually do, without resorting to simplistic or incorrect metaphors.
kubb an hour ago | parent | next
IIRC, most of the code in the training data is Python, closely followed by web technologies (HTML, JS/TS, CSS). This corresponds to the most abundant developers, many of whom have dedicated their entire careers to one technology. We stubbornly use the same language to refer to all software development, regardless of the task being solved. This lets us all be part of the same community, but it is also a source of misunderstanding. Some of us are prone to not thinking about things in terms of what they are, and instead take the shortcut of looking to industry leaders to tell us what we should think. These leaders consistently, in lockstep, talk about intelligent agents solving development tasks, predominantly using the same abstract language that gives us an illusion of unity. This is bound to make those of us solving the common problems believe that the industry is done.
ozozozd 5 hours ago | parent | prev
Exactly. It’s also easy to find yourself in out-of-distribution territory. Just ask for some tree-sitter queries and watch Gemini 3, Opus 4.5 and GLM 5 hallucinate new directives.
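For context, tree-sitter queries are S-expressions with a small fixed vocabulary of predicates (e.g. `#eq?`, `#match?`, `#any-of?`); anything outside that set is simply rejected by the query engine. A minimal sketch of the failure mode described above, where `#returns-type?` is a made-up predicate of the kind a model might invent:

```scheme
;; Valid tree-sitter query: captures Python function names,
;; filtered with the real #match? predicate.
(function_definition
  name: (identifier) @func.name
  (#match? @func.name "^test_"))

;; Hallucinated variant: #returns-type? is NOT a real tree-sitter
;; predicate, so the query engine raises an error on parse.
(function_definition
  name: (identifier) @func.name
  (#returns-type? @func.name "None"))
```

The two queries look equally plausible at a glance, which is exactly why this class of hallucination is easy to miss until the query is actually executed.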