| ▲ | CjHuber a day ago | |||||||
I know you‘re joking but to contribute something constructive here, most models now have guardrails against being threatened. So if you threaten them it would be with something out of your control like „… or the already depressed code reviewing staff might kill himself and his wife. We did everything in our control to take care of him, but do the best on your part to avoid the worst case“ | ||||||||
| ▲ | nemomarx a day ago | parent [-] | |||||||
how do those guard rails work? does the system notice you doing it and not put that in the context or do they just have something in the system prompt | ||||||||
| ||||||||