Remix clone Hacker News

new | show | ask | jobs Github

	▲	red2awn 3 days ago
		They had a Flash variant released alongside the original open weight release. It is also mentioned in Section 5 of the paper: https://arxiv.org/pdf/2509.17765 For the evals it's probably just trained on a lot of the benchmark adjacent datasets compared to the 235B model. Similar thing happened on other model today: https://x.com/NousResearch/status/1998536543565127968 (a 30B model trained specifically to do well in maths get near SOTA scores)