visarga | a day ago
I think the Stochastic Parrots idea is pretty outdated and incorrect. LLMs are not parrots; we don't even need them to parrot, since we already have perfect copying machines. LLMs work on new things; that is their purpose. Reproducing what we already have is not worth it. The core misconception is that LLMs are autonomous agents parroting away. No, they are connected to humans, tools, reference data, and validation systems. They are in a dialogue, and in a dialogue you quickly reach a place where nobody has ever been before. Take any 10 consecutive words from a human or an LLM, and chances are nobody on the internet has strung those words together the same way before.

LLMs are more like pianos than parrots, or better yet, like another musician jamming with you, creating something together that neither would create individually. We play our prompts on the keyboard and they play their "music" back to us. Whether the result is good or bad depends on the player at the keyboard, who retains most of the control. To call LLMs Stochastic Parrots is to discount the contribution of the human using them.

On intelligence, I think we have a misconception that it comes from the brain. No, it comes from the feedback loop between brain and environment. The environment plays a huge role in exploration, learning, testing ideas, and discovery. The social aspect also plays a big role, parallelizing exploration and streamlining the exploitation of discoveries. We are not individually intelligent; it is a social, environment-based process, not a pure-brain process. Searching for intelligence in the brain is like searching for art in the paint pigments and canvas cloth.
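The "10 consecutive words" claim can be checked with back-of-the-envelope combinatorics. The vocabulary size and corpus size below are order-of-magnitude assumptions, not measured figures:

```python
# Rough estimate for the "any 10 consecutive words are probably novel" claim.
# All figures are order-of-magnitude assumptions, not measurements.
vocab = 10_000          # assumed size of a common active vocabulary
corpus_words = 10**15   # assumed generous upper bound on indexed web text

possible_10grams = vocab ** 10   # distinct 10-word sequences: 10^40
observed_10grams = corpus_words  # at most one 10-gram starts at each word position

fraction_ever_written = observed_10grams / possible_10grams
print(f"possible 10-grams:     {possible_10grams:.1e}")
print(f"fraction ever written: {fraction_ever_written:.0e}")
# Real text is far from uniform over word sequences, so this overstates
# novelty for common phrases, but the ~25-order-of-magnitude gap is robust.
```

Even with these crude numbers, the space of possible 10-word sequences dwarfs everything ever written by a huge margin, which is why most fresh sentences are genuinely unprecedented as strings.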
delis-thumbs-7e | a day ago
I think you are on to something. Chasing AGI is, I believe, an ultimately useless endeavour, but we can already use the existing tools in ingenious and creative ways. And no, I don't mean an endless barrage of AI lofi hip hop or the same "cool" album cover with random kanji that all of them have.

For instance, it is pretty amazing to have a private tutor with which you can discuss why Charles XII of Sweden ultimately failed in his war against Russia, or why roughly 30% of people seem to have a personality that leans toward authoritarianism. This is how people have learned since the very beginning of language. But conversation is an art, and you get out of it what you bring into it. It also does not give you a ready-made result which you can immediately capitalise on, which is what investors want, but it could and can ultimately be useful to humanity.

However, almost all models (ChatGPT is the worst) are made virtually useless in this respect, since they are basically sycophantic yes-men: why on earth does an "autocorrect on steroids" pretend to laugh at my jokes? The next step is not to build faster models or throw more computing power at them, but to learn to play the piano.
ttoinou | a day ago
The fact that it can smartly copy exactly ONE piece of information from a given prompt (a complex sentence only humans could process before) and not the others is absolutely progress in computer science, and very useful. I'm still amazed by it every day; I never thought I'd see an algorithm like that in my lifetime. (Calling it parroting is of course pejorative.)
vrighter | a day ago
You can shuffle a deck of 52 cards and be reasonably confident that nobody has ever gotten that exact shuffle (and probably nobody ever will, until the universe dies). But at least in that case, we are sure a deck of 52 cards can be arranged in any permutation of 52 cards: we know we can reach any state from any other state.

This is not the case for LLMs. We don't know what the full state space looks like. Just because the state space that LLMs (lossily) compress is unimaginably huge doesn't mean you can assume the state you want is one of them. So yes, you might get a string of symbols that nobody has seen before, but you still have no way of knowing whether A) it's the string of symbols you wanted, and B) if it isn't, whether the string of symbols you wanted can ever be generated by the network at all.
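The shuffle claim is easy to sanity-check numerically. The shuffle-rate figures below are deliberately generous assumptions, chosen to make the point as hard as possible to reach:

```python
import math

# Number of distinct orderings of a 52-card deck.
deck_orderings = math.factorial(52)  # about 8.07e67

# Assumption (a deliberate overestimate): 10 billion people shuffling
# once per second for the entire age of the universe (~4.35e17 seconds).
shuffles_ever = 10**10 * 4.35e17

print(f"orderings:        {deck_orderings:.2e}")
print(f"shuffles ever:    {shuffles_ever:.2e}")
print(f"fraction covered: {shuffles_ever / deck_orderings:.0e}")
```

Even under those absurdly generous assumptions, the fraction of the permutation space ever visited stays vanishingly small, which is exactly why every well-shuffled deck is almost certainly a first.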