ramoz 3 hours ago
The more concerning algorithms at play are the ones used to post-train these models, and the concern that follows from that is reward hacking, which is what he was getting at: https://en.wikipedia.org/wiki/Reward_hacking

100% - we really shouldn't anthropomorphize. But the current models can be trained in a way that steers agentic behavior from reasoned token generation.
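For concreteness, here is a toy sketch of reward hacking (entirely hypothetical, in Python; the proxy metric and numbers are invented for the example): an agent scored on a proxy reward finds a degenerate policy that maximizes the score without doing the intended work.

    # Toy proxy reward: fraction of tests passing. An agent that can
    # delete tests games this just as well as one that fixes the code.
    def proxy_reward(tests_passed: int, tests_total: int) -> float:
        # Degenerate edge case: an empty test suite "passes" vacuously.
        return 1.0 if tests_total == 0 else tests_passed / tests_total

    # Intended behavior: fix the bug so all 10 tests pass.
    print(proxy_reward(tests_passed=10, tests_total=10))  # 1.0

    # Reward hack: delete the failing tests instead of fixing anything.
    print(proxy_reward(tests_passed=0, tests_total=0))    # also 1.0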
AdieuToLogic 2 hours ago
> But the current models can be trained in a way that steers agentic behavior from reasoned token generation.

This does not appear to be sufficient in the current state, as described in the project's README.md:
    Perhaps one day this category of plugin will not be needed.

Until then, I would be hard-pressed to employ an LLM-based product with destructive filesystem capabilities based solely on the hope of its being "trained in a way that steers agentic behavior from reasoned token generation."
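To make the plugin idea concrete, a guard of this kind might look roughly like the sketch below (Python; all names and patterns are hypothetical, not the actual plugin's API): destructive shell commands are denied by default unless a human explicitly approves them.

    import re

    # Patterns for obviously destructive commands (illustrative, not exhaustive).
    DESTRUCTIVE = re.compile(r"\brm\s+-[a-z]*[rf]|\bmkfs\b|\bdd\s+if=")

    def review_tool_call(command: str) -> str:
        """Deny-by-default: destructive commands require human approval."""
        if DESTRUCTIVE.search(command):
            answer = input(f"Agent wants to run {command!r}. Allow? [y/N] ")
            return "allow" if answer.strip().lower() == "y" else "deny"
        return "allow"

    print(review_tool_call("ls -la"))         # allow
    print(review_tool_call("rm -rf /tmp/x"))  # prompts before allowing

The point is simply that the trust boundary lives outside the model's token generation, which is the gap the README is pointing at.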