	▲	namnnumbr 5 hours ago
		oMLX (https://github.com/jundot/omlx) makes running the MLX inference server quite easy for those interested in UI-based hosting. oMLX also supports MTP or DFlash drafting.
	▲	w10-1 5 hours ago
		Agreed (though I'm not sure what you mean by UI-based hosting). oMLX does the caching I need to fit models whose size is close to total memory, and it handles most of the work of finding usable models. After cobbling together various solutions over months, I now just use oMLX, often from Xcode. I can tell the difference between Gemma-4 (local/free) and Claude (paid) only on the largest tasks.