zone411 4 hours ago
For people interested in these kinds of benchmarks, I have two multiplayer, multi-round games:

- Elimination Game Benchmark: Social Reasoning, Strategy, and Deception in Multi-Agent LLM Dynamics at https://github.com/lechmazur/elimination_game/
- Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure at https://github.com/lechmazur/step_game/