esperent 11 hours ago
I added a hook to disable `rm`, `find -delete`, and a few of the other more obvious destructive ops. It sends Claude a strongly worded message: "STOP IMMEDIATELY. DO NOT TRY TO FIND WORKAROUNDS...". It works well. `git rm` is still allowed.
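For readers who want to try something similar, here is a minimal sketch of what such a hook might look like. It is not the hook described above. It assumes the agent passes a JSON payload on stdin with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call and returns stderr to the model; check your tool's hook documentation before relying on either. The names `is_destructive`, `evaluate`, and `BLOCK_MESSAGE` are hypothetical.

```python
import json
import os
import re
import shlex
import sys

# Message text as quoted in the comment above (truncated there, so truncated here).
BLOCK_MESSAGE = "STOP IMMEDIATELY. DO NOT TRY TO FIND WORKAROUNDS..."

# Split a shell line on common separators so chained commands are each checked.
# This ignores quoting, so it can over-block; that is the conservative direction.
_SEPARATORS = re.compile(r"&&|\|\||[;|&\n]")
_ENV_ASSIGNMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*=.*")


def _tokens(segment):
    """Tokenize one command segment, falling back to whitespace splitting."""
    try:
        return shlex.split(segment)
    except ValueError:  # e.g. an unbalanced quote left over after splitting
        return segment.split()


def is_destructive(command):
    """Heuristic: True for `rm ...` and `find ... -delete`; `git rm` is allowed."""
    for segment in _SEPARATORS.split(command):
        tokens = _tokens(segment)
        # Skip leading `sudo` and VAR=value prefixes to reach the program name.
        while tokens and (tokens[0] == "sudo" or _ENV_ASSIGNMENT.fullmatch(tokens[0])):
            tokens.pop(0)
        if not tokens:
            continue
        program = os.path.basename(tokens[0])
        if program == "rm":
            return True
        if program == "find" and "-delete" in tokens[1:]:
            return True
    return False


def evaluate(payload):
    """Return (exit_code, message) for a hook payload (assumed field names)."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if is_destructive(command):
        return 2, BLOCK_MESSAGE
    return 0, ""


def main():
    """Read the payload from stdin, print any block message to stderr."""
    code, message = evaluate(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a script, end the file with `raise SystemExit(main())` and register it as a pre-tool-use hook. The matching is deliberately simple: it will not catch `xargs rm`, a script that deletes files, or a deletion done through an interpreter, so treat it as a guard rail against accidents rather than a security boundary.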
Diti 9 hours ago
I added something similar. Claude eventually ran `rm -rf *` on my own project. When I asked why it did that, it acknowledged that it had messed up and offered a very bad “apology”: “the irony of not following your safety instructions isn’t lost on me”. Nowadays I only run Claude in Plan mode, so it doesn’t ask me for permissions any more.
lxgr 4 hours ago
It works well so far, for you. Are you confident it would still work against sophisticated prompt injection attacks that override your "strongly worded message"? Strongly worded signs can be great for safety (though actual mechanisms that prevent undesirable actions from being taken are still much better), but they are essentially meaningless for security.