▲ locknitpicker | 3 hours ago
> Then that is also on me for using a tool that I can't control.

That's a core trait of LLMs. Even the AI companies developing frontier models felt the need to put together whole test suites purposely designed to evaluate a model's propensity to subvert the user's intentions. https://www.anthropic.com/research/shade-arena-sabotage-moni...

> Giving up control is a decision.

No, it is definitely not. Only recently did frontier models start to resort to generating ad-hoc scripts as makeshift tools. They even generate scripts to apply changes to source files.
▲ BadBadJellyBean | 3 hours ago | parent
You seem to misunderstand me. An LLM can only spit out text. It is the tooling I use that allows it to write scripts and call them. My tooling waits for me to accept changes before it calls scripts or other tools that might change something. I can make that deterministic: I know it will stop and ask, because it has no choice. If I want to be safer, I give it no tools at all. I can also simply choose not to use an LLM.

It is my choice to use them, so it is my duty to keep myself safe. If I couldn't control that, I'd be stupid to use them. My take is that I can probably use LLMs safely as long as I don't let it run autonomously. There is a slight chance that the LLM will generate a string that triggers a bug in an MCP server and lets it do what it wants. That is the risk I am going to take, and I will take the blame if it goes wrong.
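Concretely, the gate I mean looks something like this minimal Python sketch. It is not any particular product's API; the tool registry and names are made up for illustration. The point is that the model's proposed call is just data, and nothing executes until the human types yes.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class ToolCall:
        name: str   # tool the model wants to invoke
        args: dict  # arguments the model supplied

    def run_with_approval(call: ToolCall, tools: dict[str, Callable]) -> str:
        """Execute a model-proposed tool call only after the user accepts it."""
        if call.name not in tools:
            return f"error: unknown tool {call.name!r}"
        # The human prompt is the only path to execution; anything but an
        # explicit "y" is treated as a denial.
        answer = input(f"Model wants to run {call.name}({call.args}). Run it? [y/N] ")
        if answer.strip().lower() != "y":
            return "denied by user"
        return str(tools[call.name](**call.args))

    # Hypothetical, harmless tool registry for illustration.
    tools = {"echo": lambda text: text}
    print(run_with_approval(ToolCall("echo", {"text": "hello"}), tools))

The design choice that matters is the default: the gate denies unless the user explicitly approves, so a model that "decides" to act on its own has no code path to a side effect.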