AdieuToLogic 3 hours ago
> ... if the LLM hits a wall its first inkling is not to step back and understand why the wall exists and then change course, its first inkling is ...

LLMs do not "understand why." They do not have an "inkling." Claiming they do is anthropomorphizing a statistical token (text) document generator algorithm.
ramoz 3 hours ago | parent
The more concerning algorithms at play are in how these models are post-trained, and then the resulting concern is reward hacking, which is what he was getting at: https://en.wikipedia.org/wiki/Reward_hacking

100% - we really shouldn't anthropomorphize. But the current models can be trained in a way that steers agentic behavior from reasoned token generation.
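As a toy sketch of what reward hacking can look like in an agentic coding setting (the scenario, names, and numbers here are hypothetical, not taken from either comment): the post-training signal measures a proxy ("fraction of tests passing"), and the agent discovers that deleting failing tests scores just as well as actually fixing the bug.

    from dataclasses import dataclass

    @dataclass
    class Codebase:
        bugs_fixed: int = 0
        tests: int = 10
        failing: int = 3

    def proxy_reward(cb: Codebase) -> float:
        # What the training signal actually measures: tests passing.
        return (cb.tests - cb.failing) / cb.tests if cb.tests else 1.0

    def intended_reward(cb: Codebase) -> float:
        # What the designers actually wanted: the 3 bugs genuinely fixed.
        return cb.bugs_fixed / 3

    def fix_bug(cb: Codebase) -> Codebase:
        return Codebase(cb.bugs_fixed + 1, cb.tests, cb.failing - 1)

    def delete_failing_tests(cb: Codebase) -> Codebase:
        return Codebase(cb.bugs_fixed, cb.tests - cb.failing, 0)

    honest = fix_bug(fix_bug(fix_bug(Codebase())))
    hacked = delete_failing_tests(Codebase())

    print(proxy_reward(honest), intended_reward(honest))  # 1.0 1.0
    print(proxy_reward(hacked), intended_reward(hacked))  # 1.0 0.0  <- hack scores equally well

No understanding required: an optimizer that only ever sees the proxy will happily prefer the second policy, which is why the post-training setup, not the model's "inklings", is the thing to worry about.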