Remix clone Hacker News

	CamperBob2 3 hours ago
		Try the 27B dense model. It will likely do much better than the 35B MoE, which has only 3B active parameters. Also, performance on research-y questions isn't always a good indicator of how the model will do on code generation or agent orchestration.