wasabi991011 5 days ago

This should be easy to test: pick two similar models with different release dates (one before ARC v2, one after), then compare each with and without the new reasoning technique from the article.
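A rough sketch of how that 2x2 comparison could be tabulated, assuming you already have an accuracy score for each of the four runs. The names `Scores` and `compare`, and the numbers in the usage example, are hypothetical placeholders, not results from the article or from any real model.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Accuracy of one model on the benchmark (hypothetical container)."""
    baseline: float        # without the reasoning technique
    with_technique: float  # with the reasoning technique


def technique_gain(s: Scores) -> float:
    """How much the technique adds for a single model."""
    return s.with_technique - s.baseline


def compare(pre: Scores, post: Scores) -> dict:
    """Summarize the 2x2 design: release date x technique on/off.

    pre  = model released before ARC v2
    post = similar model released after ARC v2
    """
    return {
        # Effect of release date alone (technique off in both runs).
        "release_gap_baseline": post.baseline - pre.baseline,
        # Effect of the technique within each model.
        "gain_pre": technique_gain(pre),
        "gain_post": technique_gain(post),
        # Does the technique help one model more than the other?
        "interaction": technique_gain(post) - technique_gain(pre),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    summary = compare(Scores(0.25, 0.5), Scores(0.5, 0.75))
    for name, value in summary.items():
        print(f"{name}: {value:+.2f}")
```

An interaction near zero would suggest the technique helps both models about equally, regardless of release date. A single pair of models is a small sample, so repeated runs or several model pairs would be needed before reading much into the numbers.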