woadwarrior01 | 2 days ago
The most salient thing about these models is that they're non-reasoning models. This makes them very token efficient and particularly well suited for local inference, where decoding is usually slower than on datacenter GPUs. Link to HF collection: https://huggingface.co/collections/ibm-granite/granite-41-la...
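The token-efficiency point can be put in rough numbers. A minimal back-of-envelope sketch, where every throughput and token count below is an illustrative assumption rather than a measurement:

```python
# Back-of-envelope: why skipping reasoning tokens matters more locally.
# All numbers below are illustrative assumptions, not benchmarks.
def decode_seconds(tokens: int, tok_per_sec: float) -> float:
    """Wall-clock decode time for a given token budget and throughput."""
    return tokens / tok_per_sec

ANSWER_TOKENS = 300      # tokens in the visible answer (assumed)
THINKING_TOKENS = 2000   # extra hidden reasoning tokens (assumed)

LOCAL_TPS = 20.0         # assumed consumer-hardware decode speed
DATACENTER_TPS = 150.0   # assumed server-GPU decode speed

for name, tps in [("local", LOCAL_TPS), ("datacenter", DATACENTER_TPS)]:
    plain = decode_seconds(ANSWER_TOKENS, tps)
    reasoning = decode_seconds(ANSWER_TOKENS + THINKING_TOKENS, tps)
    print(f"{name}: non-reasoning {plain:.0f}s vs reasoning "
          f"{reasoning:.0f}s (+{reasoning - plain:.0f}s)")
```

Under these assumed numbers the reasoning overhead costs ~100 extra seconds locally but only ~13 seconds in a datacenter, which is the asymmetry the comment is pointing at.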
lostmsu | 20 hours ago
Probably worse than Gemma 4 or Qwen 3.6 with thinking off.