Remix clone Hacker News


	▲	karimf 2 hours ago
		> Do you think the models you’re using could be quantized further so that they could be downloaded on first run using Background Assets?
		I first tried Qwen 3.5 0.8B at Q4_K_S, and the model couldn't hold a basic conversation. I haven't tried lower quants on the 2B, though.
		I'm also interested in the Apple Foundation models, and they're what I plan to try next. AFAIK they're on par with Qwen-3-4B [0]. The biggest upside, as you alluded to, is that you don't need to download them, which is huge for user onboarding.
		[0] https://machinelearning.apple.com/research/apple-foundation-...
	▲	Patrick_Devine 4 minutes ago
		Try it with mxfp8 or bf16. It's a decent model for tool calling, but I wouldn't recommend using it with 4-bit quantization.