Remix clone Hacker News

	▲	roenxi 7 hours ago
		Have the open models been tried? When I look at the leaderboard [0], the only Qwen model I see is 235B-A22B. I wouldn't expect an MoE model to do particularly well. From what I've seen (thinking mainly of a leaderboard that tries to measure EQ [1]), MoE models are at a distinct disadvantage to regular models when it comes to complex tasks that aren't software benchmark targets.
		[0] https://arcprize.org/leaderboard
		[1] https://eqbench.com/index.html
	▲	WarmWash 5 hours ago
		There are GLM 5 and Kimi 2.5 (which gets 11.8%, but I digress).