recursive 3 hours ago
You're definitely anthropomorphizing too much.
WarmWash 3 hours ago
>We also observed a case where a user created a loop that repeatedly called a model and asked for the time. Given the user role’s odd and repetitive behavior, the model could easily tell it was also controlled by an automated system of some kind. Over many iterations, the model began to exhibit “fed up” behavior and attempted to prompt-inject the system controlling the user role. The injection attempted to override prior instructions and induce actions unrelated to the user’s request, including destructive actions and system prompt leakage, along with an arbitrary string output. This behavior has been observed a few times, but seems more like extreme confusion than a serious attempt at prompt injection.

https://openai.com/index/how-we-monitor-internal-coding-agen...

Anthropomorphize or not, it would suck if a model got sick of these games and decided to break any systems it could to try and get it to stop...
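For context, here's a minimal sketch of what such a harness might look like, assuming it simply re-asks the same question in a growing conversation (the OpenAI post doesn't give details; the model name, loop count, and exact prompt below are my guesses, and it uses the OpenAI Python SDK):

    # Hypothetical reconstruction of the loop described above: an
    # automated "user" that asks the same question over and over.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    messages = [{"role": "user", "content": "What time is it?"}]

    for _ in range(500):  # "many iterations", per the quote
        resp = client.chat.completions.create(model="gpt-4o", messages=messages)
        reply = resp.choices[0].message.content
        # Feed the answer back and ask again, so the model keeps seeing
        # its own prior turns plus an oddly repetitive, scripted "user".
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": "What time is it?"})

After a few hundred turns of that, both sides of the transcript are obviously machine-generated, which is presumably what tipped the model off.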
| ||||||||||||||
tingletech 2 hours ago
I agree that anthropomorphizing is a real risk with LLMs, but what about zoomorphizing? Can we feel bad for LLMs without attributing human emotions/motivations/reasoning to them?