| ▲ | techjamie 5 hours ago | |
There's a YouTuber who makes "AI Plays Mafia" videos that pit various models against each other. They also seem to keep past games in context to some extent. Viewers have noted that ChatGPT 4o often survives the entire game, possibly because the other AIs see it as a gullible idiot, while the Mafia tend to eliminate stronger models like 4.5 Opus or Kimi K2 early. It's not exactly scientific data because they mostly show individual games, but it is interesting how that lines up with what you found. | ||
| ▲ | nodja 4 hours ago | parent | next [-] | |
https://www.youtube.com/watch?v=JhBtg-lyKdo - 10 AIs Play Mafia https://www.youtube.com/watch?v=GMLB_BxyRJ4 - 10 AIs Play Mafia: Vigilante Edition https://www.youtube.com/watch?v=OwyUGkoLgwY - 1 Human vs 10 AIs Mafia | ||
| ▲ | cpeterso 2 hours ago | parent | prev | next [-] | |
Similar: here is a YouTube video of an amusing reverse Turing test with four LLMs and a human. To make the test more interesting, the players pose as famous historical characters (Aristotle, Mozart, da Vinci, Cleopatra, and Genghis Khan) on a train in Unity 3D. | ||
| ▲ | mohsen1 an hour ago | parent | prev [-] | |
I made Mafia Arena as a way of measuring how good each LLM is at playing Mafia/Werewolves. This is a good benchmark for how good AIs are at lying. | ||