karimf 2 hours ago

> Do you think the models you’re using could be quantized further, so that they could be downloaded on first run using Background Assets?

I first tried Qwen 3.5 0.8B at Q4_K_S, and the model couldn't hold a basic conversation. I haven't tried lower quants on the 2B, though.

I'm also interested in the Apple Foundation models, and that's something I plan to try next. AFAIK it's on par with Qwen-3-4B [0]. The biggest upside, as you alluded to, is that you don't need to download it, which is huge for user onboarding.

[0] https://machinelearning.apple.com/research/apple-foundation-...

Patrick_Devine 4 minutes ago

Try it with mxfp8 or bf16. It's a decent model for tool calling, but I wouldn't recommend using it with 4-bit quantization.