Remix clone Hacker News

	▲	Flere-Imsaho 6 hours ago
		Yeah the future is probably a number of highly specialised small models you can run on your own hardware rather than massive frontier models in the cloud. That's what I'm betting on anyway.
	▲	girvo 3 hours ago
		Step 3.7 Flash on my Asus GB10-based mini PC is incredibly close to that today. I'm very impressed, and that's without MTP to boost performance.
	▲	thewebguyd 6 hours ago
		That seems to be what Microsoft is betting on too, based on what was shown at the Build keynote today, plus the new Surface Ultra and the Surface mini PC with the new Nvidia chip. Nadella really played up local AI as the main use case they have in mind.
	▲	search_facility 6 hours ago
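		MoE models basically work that way already. Qwen etc. with low active params (the A-number in the name) let you run inference on big models locally: only the active params are read for each token, so speed tracks the active count. The full set of weights still has to be stored somewhere, though (VRAM, or system RAM with offloading).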
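
To put rough numbers on the total versus active distinction, here is a minimal sketch. It assumes the naming pattern seen in models such as Qwen3-30B-A3B (total size, then an A-prefixed active size, both in billions); the helper names `parse_moe_name` and `weight_gib` are made up for illustration, and the estimate covers weights only, ignoring KV cache, activations and quantization overhead.

```python
import re


def parse_moe_name(name: str) -> tuple[float, float]:
    """Pull (total, active) parameter counts in billions out of a model
    name that follows the '<total>B-A<active>B' pattern (an assumption;
    not every MoE model is named this way)."""
    match = re.search(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", name)
    if match is None:
        raise ValueError(f"no '<total>B-A<active>B' pattern in {name!r}")
    return float(match.group(1)), float(match.group(2))


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    total_b, active_b = parse_moe_name("Qwen3-30B-A3B")
    bits = 4  # hypothetical 4-bit quantization
    # All experts must be stored; only the active ones are read per token.
    print(f"stored weights:     {weight_gib(total_b, bits):.1f} GiB")
    print(f"read per token:     {weight_gib(active_b, bits):.1f} GiB")
```

The first figure is what has to live somewhere on the machine; the second is roughly what gets touched per generated token, which is why a low A-number helps speed more than it helps storage.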