▲ NitpickLawyer 3 hours ago
It's a bit disappointing that people are still rehashing the same "it's in the training data" line from three years ago. No LLM could regurgitate millions of LoC 1-for-1 from any training set; that's not how this works. A pertinent quote from the article (which is a really nice read; I'd recommend reading it in full at least once):

> Previous Opus 4 models were barely capable of producing a functional compiler. Opus 4.5 was the first to cross a threshold that allowed it to produce a functional compiler which could pass large test suites, but it was still incapable of compiling any real large projects. My goal with Opus 4.6 was to again test the limits.
▲ simonw an hour ago
This is a good rebuttal to the "it was in the training data" argument: if that's how this stuff works, why couldn't Opus 4.5 or any of the other previous models achieve the same thing?
▲ wmf 3 hours ago
In this case it's not reproducing training data verbatim, but it probably is using algorithms and data structures learned from existing C compilers. On one hand it's good to reuse existing knowledge, but such knowledge won't be available if you ask Claude to develop novel software.
| ||||||||||||||||||||||||||||||||
▲ lossolo 2 hours ago
They couldn't do it because they weren't fine-tuned for multi-agent workflows, which basically means they were constrained by their context window. How many agents did they use with the previous Opus? Three? You've chosen an argument that works against you, because they actually could have done it if they'd been trained to. Give them the same post-training (recipes/steering) and the same datasets, and voila, they'll be capable of the same thing. What do you think is happening there? Did Anthropic inject magic ponies?
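(To make "multi-agent workflow" concrete: below is a rough sketch of the orchestration pattern being described, not any actual Anthropic API. An orchestrator farms subtasks out to sub-agents, so each context window only has to hold its own slice of the work plus compact summaries. The names `call_model` and `run_multi_agent` are hypothetical placeholders.)

    def call_model(prompt: str) -> str:
        # Placeholder for a real LLM API call.
        return f"(model output for: {prompt[:40]}...)"

    def run_multi_agent(task: str, subtasks: list[str]) -> str:
        summaries: list[str] = []
        for sub in subtasks:
            # Each sub-agent sees only its own subtask plus compact
            # summaries of earlier work, not every other agent's full
            # transcript. That is what relaxes the context-window limit.
            result = call_model(f"Subtask: {sub}\nPrior summaries:\n" + "\n".join(summaries))
            summaries.append(f"{sub}: {result[:500]}")  # compress before sharing
        # The orchestrator combines the compressed results at the end.
        return call_model(f"Combine these results for: {task}\n" + "\n".join(summaries))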
▲ calebhwin 3 hours ago
[dead]
▲ zephen 3 hours ago
> It's a bit disappointing that people are still rehashing the same "it's in the training data" line from three years ago.

They only have to keep reiterating this because people are still pretending the training data doesn't contain all the information that it does.

> No LLM could regurgitate millions of LoC 1-for-1 from any training set; that's not how this works.

Maybe not any old LLM, but Claude gets really close.
▲ skydhash 3 hours ago
Because for all those projects, the effective solution is to just use the existing implementation rather than launder code through an LLM. We'd rather see a stab at fixing CVEs or implementing features in open-source projects, like the Wi-Fi situation in FreeBSD.
| ||||||||||||||||||||||||||||||||
▲ falloutx 3 hours ago
They can literally print out entire books line by line.
▲ lunar_mycroft 3 hours ago
LLMs can regurgitate almost all of the Harry Potter books, among others [0]. Clearly, these models can actually regurgitate large amounts of their training data, and reconstructing any gaps would be a lot less impressive than implementing the project truly from scratch. (I'm not claiming this is what actually happened here, just pointing out that memorization is a lot more plausible/significant than you say.)

[0] https://www.theregister.com/2026/01/09/boffins_probe_commerc...
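(For reference, the kind of probe behind claims like [0] can be sketched in a few lines: prompt the model with a prefix from a known text and measure how much of the true continuation it reproduces under greedy decoding. This is a minimal illustration using `gpt2` from Hugging Face as a stand-in; the cited study probed commercial models with its own, more careful methodology.)

    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "gpt2"  # stand-in model, not one of those probed in the study
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL)

    def verbatim_overlap(passage: str, prefix_len: int = 50, gen_len: int = 50) -> float:
        """Fraction of the true continuation the model reproduces token-for-token."""
        ids = tok(passage, return_tensors="pt").input_ids[0]
        prefix = ids[:prefix_len]
        truth = ids[prefix_len:prefix_len + gen_len]
        # Greedy decoding: no sampling, so the output is the model's single best guess.
        out = model.generate(prefix.unsqueeze(0), max_new_tokens=gen_len, do_sample=False)
        gen = out[0][prefix_len:prefix_len + len(truth)]
        return (gen == truth).float().mean().item()

A score near 1.0 on text the model was never prompted with before is strong evidence of memorization rather than reconstruction.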
| ||||||||||||||||||||||||||||||||