magicalhippo 5 days ago
As I understand it, the point of the article isn't to train an LLM from scratch; it's to teach a non-reasoning model to reason without additional explicit training data.
YeGoblynQueenne 5 days ago
The abstract does use the term "from scratch":
>> To overcome this limitation, we introduce R-Zero, a fully autonomous framework that generates its own training data from scratch.
Giving the benefit of the doubt, they're just using it wrong, but the way they use it sure reads like they claim they found a way to initialise LLMs with 0 data. Only the absurdity of the claim protects the reader from such misunderstanding, and that's never a good thing in a research paper.