	whymauri 3 days ago
		The most direct, non-marketing, non-aesthetic summary is that this model trades a few points on 'fundamental benchmarks' (GPQA, MATH/AIME, MMLU) for being a 'more steerable' (fewer refusals) scaffold for downstream tuning. Within that framing, I think it's easier to see where and how the model fits into the larger ecosystem. But, of course, the best benchmark will always be just using the model.