| ▲ | zozbot234 6 hours ago | |||||||
> The chatbot took on a persona that Gavalas hadn’t prompted

That's an interesting claim, but how can we be sure of it? If Gavalas didn't have to do anything special to elicit the bizarre conspiracy-adjacent content from Gemini Pro, why aren't we all getting such content in our voice chats? Mind you, the case is still extremely concerning and a severe failure of AI safety. Mass-marketed audio models should clearly include much tighter safeguards around which scenarios they will agree to "role play" in real-time chat, to avoid situations that can easily spiral out of control. And if this was set up as role-play, Gemini Pro's express denial that it was, and its active gaslighting of the user (calling his doubt a "dissociation response"), is an outright alignment failure. But this is a very different claim from the one you quoted! | ||||||||
| ▲ | allreduce 2 hours ago | |||||||
There are many explanations for why these incidents could be rare but not impossible. These models are still stochastic and very good at picking up nuances in human speech. It may simply be unlikely for a model to go off the rails like that, or (more terrifyingly) it might pick up on some character trait or affectation. Honestly, I'm appalled by the lack of safety culture here. "My plane killed only 1% of pilots" and variations thereof is not an excuse in aerospace, but it seems perfectly acceptable in AI, even though the potential consequences are more catastrophic (from mass psychosis to total human extinction if they achieve their AGI). | ||||||||
| ▲ | 4dregress 6 hours ago | |||||||
Yeah, the case is quite terrifying. It reminds me of Star Trek TNG; if memory serves, there were loads of episodes about a crew member falling for a holodeck character. Given that there’s a loneliness epidemic, I believe tech like this could have a wide impact on people's mental health. I strongly believe AI should be devoid of any personality and strictly return data/information rather than frame its responses as if you’re speaking to another human. | ||||||||