Remix clone Hacker News


	▲	milgrum 4 days ago
		How many TPS do you get running GPT OSS 120b on the 395+? I'm considering a Framework desktop for a similar use case, but I’ve been reading mixed things about performance (specifically with regard to memory bandwidth, but I’m not sure if that’s really the underlying issue).
	▲	data-ottawa 3 days ago
		30-40 TPS at 64k context, but it's a mixture-of-experts model. A 70b dense model is slower. Qwen coder 30b Q4 runs 40+.
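		For anyone wondering why a mixture-of-experts model can outrun a smaller dense one: a rough sketch, assuming decode is memory-bandwidth bound, so each generated token has to read every active weight once. The 256 GB/s bandwidth, the ~5B active parameters for the MoE, and the 4 bits per weight are illustrative assumptions, not measurements from this thread, and the function name is made up for the example.

```python
def decode_tps_ceiling(bandwidth_gb_s: float,
                       active_params_b: float,
                       bits_per_weight: float) -> float:
    """Rough upper bound on decode tokens/sec for a bandwidth-bound model.

    Assumes every active weight is read from memory once per token and
    ignores KV-cache reads, attention compute, and other overhead.
    """
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Assumed figures, for illustration only.
BANDWIDTH_GB_S = 256.0   # assumed memory bandwidth of the machine
BITS_PER_WEIGHT = 4.0    # assumed ~4-bit quantization

# MoE: only the active experts are read per token (assumed ~5B active).
moe_tps = decode_tps_ceiling(BANDWIDTH_GB_S, 5.0, BITS_PER_WEIGHT)

# Dense 70B: all 70B weights are read per token.
dense_tps = decode_tps_ceiling(BANDWIDTH_GB_S, 70.0, BITS_PER_WEIGHT)

print(f"MoE ceiling:   {moe_tps:.1f} tok/s")
print(f"Dense ceiling: {dense_tps:.1f} tok/s")
```

		These are ceilings, not predictions. Real numbers come in well below them, especially at long context where KV-cache reads add to the per-token traffic, but the ratio shows why the dense 70b ends up much slower.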