Remix clone Hacker News

	jbellis 2 hours ago
		M2 was one of the most benchmaxxed models we've seen. Huge gap between its SWE-bench results and its performance on tasks it hasn't been trained on. We'll put 2.5 on the list. https://brokk.ai/power-ranking