| ▲ | stonemetal12 7 days ago |
| > it was tested on a 4 layer deep toy model
How do you see that impacting the results? It's the same algorithm, just at a smaller scale. I would assume a 4-layer model would not be very good, but does reasoning improve it? Is there a reason scale would change whether reasoning helps? |
|
| ▲ | azrazalea_debt 7 days ago | parent | next [-] |
A lot of current LLM work is basically emergent behavior. They use a really simple core algorithm and scale it up, and interesting things happen. You can read some of Anthropic's recent papers to see this: for example, they didn't expect LLMs could "look ahead" when writing poetry, but when they actually went in and watched what was happening (there are details on how this "watching" works on their blog and in their studies), they found the LLM really was planning ahead. That's emergent behavior: they didn't design it to do that, it just started doing it as a consequence of the model's complexity. If (BIG if) we ever do see actual AGI, it is likely to work like this. It's unlikely we're going to make AGI by designing some grand cathedral of perfect software; it is more likely we are going to find the right simple principles and scale them big enough for AGI to emerge. This is similar. |
| |
▲ | mrspuratic 6 days ago | parent | next [-] | | On that topic, it seems backwards to me: intelligence is not an emergent behaviour of language; rather the opposite. | | |
▲ | danans 6 days ago | parent | prev [-] | | Perception and interpretation can very much be influenced by language (the Sapir-Whorf hypothesis), so to the extent that perception and interpretation influence intelligence, it's not clear that the relationship runs in only one direction. | | |
| ▲ | archaeans 6 days ago | parent | next [-] | | "It would be naïve to imagine that any analysis of experience is dependent on pattern expressed in language." - Sapir It's hard to take these discussions on cognition and intelligence seriously when there is so much lossy compression going on. | | |
▲ | danans 6 days ago | parent [-] | | The Sapir-Whorf hypothesis is named after Sapir and Whorf, but neither of them postulated it as a single theory. It's just a colloquial name for linguistic relativity (vs. universality). In its weak form, there are many examples of linguistic relativity. |
| |
▲ | zekica 6 days ago | parent | prev [-] | | Am I an exception? When thinking, I don't conceptualize things in words - the compression would be too lossy. Maybe it's because I'm fluent in three languages (one Germanic, one Romance, one Slavic)? | | |
▲ | danans 6 days ago | parent [-] | | Our brains reason in many domains, depending on the situation. For domains built primarily on linguistic primitives (e.g. legal writing), we do often reason through language. In other domains (e.g. spatial ones) we reason through vision or sound. We experience this distinction when we study the formula vs. the graph of a mathematical function: the former is linguistic, the latter visual-spatial. And learning multiple spoken languages is a great way to break out of particularly rigid reasoning patterns and, just as important, to counter biases that are shaped by your native language. |
|
|
|
|
| ▲ | NitpickLawyer 7 days ago | parent | prev | next [-] |
There's prior research that finds a connection between model depth and "reasoning" ability: https://arxiv.org/abs/2503.03961
A depth of 4 is very small; it is very much a toy model. It's fine to research this, and maybe someone will try it out on larger models, but it's not ok to lead with a conclusion drawn from this toy model, IMO. |
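To make the depth gap concrete, here's a rough sketch (my own illustration, not from the linked paper or the article); the toy model's hidden size and head count below are assumed values, while the GPT-2 small numbers come from its published configuration:

    # Minimal sketch: "depth" here means the number of stacked transformer blocks.
    from dataclasses import dataclass

    @dataclass
    class TransformerConfig:
        n_layers: int  # depth: number of stacked transformer blocks
        d_model: int   # hidden size per token
        n_heads: int   # attention heads per block

    # Hypothetical 4-layer toy model (d_model and n_heads are assumptions).
    toy = TransformerConfig(n_layers=4, d_model=256, n_heads=4)

    # GPT-2 small, the smallest widely used GPT-2 variant, for comparison.
    gpt2_small = TransformerConfig(n_layers=12, d_model=768, n_heads=12)

    for name, cfg in [("toy", toy), ("gpt2-small", gpt2_small)]:
        print(f"{name}: {cfg.n_layers} layers, d_model={cfg.d_model}, heads={cfg.n_heads}")

Production models go far deeper still (GPT-2 XL has 48 layers, Llama-2-70B has 80), which is part of why results at depth 4 may not transfer.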
|
| ▲ | okasaki 7 days ago | parent | prev [-] |
| Human babies are the same algorithm as adults. |
| |
▲ | mirekrusin 6 days ago | parent [-] | | That analogy would describe a very large model that hasn't finished training yet. A tiny model like this is more like doing a study on fruit flies and extrapolating the results to humans. | | |
▲ | archaeans 6 days ago | parent | next [-] | | Every argument about LLMs that amounts to "humans are the same" is self-defeating, because it assumes a level of understanding of human cognition and the human brain that doesn't really exist outside the imagination of people with a poor grasp of neuroscience. | |
| ▲ | okasaki 6 days ago | parent | prev [-] | | Some humans never attain intelligence beyond early childhood. |
|
|