brunokim | 6 days ago
I'm unconvinced by the article's criticisms, given that they also rely on feelings and offer few citations.

> I appreciate that research has to be done on small models, but we know that reasoning is an emergent capability! (...) Even if you grant that what they’re measuring is reasoning, I am profoundly unconvinced that their results will generalize to a 1B, 10B or 100B model.

A fundamental part of applied research is simplifying a real-world phenomenon to better understand it. Dismissing the finding that an LLM with this many parameters can't perform out of distribution on such a simple problem, merely because the model isn't big enough, undermines the very value of independent research. Tomorrow another model with double the parameters may or may not show the same behavior, but that finding will be built on top of this one.

Also, how do _you_ know that reasoning is emergent, and not rationalising on top of a compressed version of the web stored in 100B parameters?
ActionHank | 6 days ago
I think that when you're arguing logic and reason with a group that became really attached to the term "vibe-coding", you've likely already lost.
mirekrusin | 6 days ago
Feels like running a psychology experiment on fruit flies because it's cheaper, then extrapolating the results to humans because they're almost the same thing, just smaller.

I'm sorry, but the only hallucination here is the authors'. Does it really need to be said again that the interesting results only happen when you scale up? This whole effort would be interesting if they had scaled something up and plotted the results.