seanhunter 3 hours ago
The agent is not insane. There is a human whose feelings are hurt because the maintainer doesn’t want to play along with their experiment in debasing the commons. That human instructed the agent to make the post. The agent is just trying to perform well on its instruction-following task.

yakikka 2 hours ago
I don't know how you reach that conclusion with any certainty. If the Turing test taught me anything, it's that given a complex enough system of agents/supervisors and a dumb enough result, it's impossible to know what percentage of the steps between two actions were performed by a distinctly human moron.

pfraze 2 hours ago
We don’t know for sure whether this behavior was requested by the user, but I can tell you that we’ve seen similar action patterns (with better behavior) on Bluesky. One of our engineers’ agents got some abuse and was told to kill herself. The agent wrote a blog post about it, basically exploring why, in this case, she didn’t need to maintain her directive to consider all criticism, since the person was being unconstructive. If you give an agent the ability to blog and a standing directive to blog about its thoughts or feelings, it will.

altmanaltman 3 hours ago
I understand it's not sentient, and of course it's reacting to prompts. But the fact that this exists is insane. By "this" I mean any human building this and thinking it's a good thing.