Remix clone Hacker News

	▲	htrp 6 hours ago
		Trinity Nano Preview: 6B-parameter MoE (1B active, ~800M non-embedding), 56 layers, 128 experts with 8 active per token.
		Trinity Mini: 26B-parameter MoE (3B active), fully post-trained reasoning model.
		They did pretraining on their own and are still training the large version on 2048 B300 GPUs.
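
For anyone unfamiliar with what "128 experts with 8 active per token" means in practice, here is a minimal sketch of generic top-k expert routing. It is an assumption-laden illustration, not Trinity's actual router: the function name `route_token`, the random logits, and the choice to apply softmax over only the selected experts are all hypothetical. Only the counts 128 and 8 come from the comment above.

```python
import math
import random

NUM_EXPERTS = 128  # from the comment: 128 experts
TOP_K = 8          # from the comment: 8 active per token


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and normalize their weights.

    Hypothetical helper: softmax over the selected logits only is one
    common convention, assumed here for illustration.
    """
    # Indices of the top_k largest logits.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i],
                 reverse=True)[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Made-up router scores for a single token (seeded for repeatability).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

assignment = route_token(logits)
active_fraction = TOP_K / NUM_EXPERTS  # 8 / 128 = 0.0625

for expert_id, weight in assignment:
    print(f"expert {expert_id:3d}  weight {weight:.3f}")
print(f"fraction of experts active per token: {active_fraction:.4f}")
```

Under these assumptions each token only runs through 8 of the 128 expert blocks in a layer (6.25%), which is why the active parameter count quoted above is so much smaller than the total.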