clmnt 10 hours ago

Introducing Gaia2, the follow-up to the agentic benchmark GAIA, which allows analysis of considerably more complex behaviors. Gaia2 is released with the open Meta Agents Research Environments (ARE) framework for running, debugging, and evaluating agents. ARE simulates complex, real-world-like conditions and can be customized to further study agent behaviors. The Gaia2 dataset is released under the CC BY 4.0 license, and ARE under the MIT license.