sunshine-o 6 days ago

This is brilliant!

I am wondering: how powerful does the AI model need to be to power this app?

Would a self-hosted Llama-3.2-1B, Qwen2.5-0.5B, or Qwen2.5-1.5B on a phone be enough?

n_ary 5 days ago | parent

From my experience with weaker models, you need at least 1.5B-3B parameters to get proper prompt adherence, fewer hallucinations, and better memory.

Models also have subtle differences. For example, I found Qwen2.5:0.5B to be more obedient (prompt-respecting) and smarter than Llama3.2:1B. Gemma3:1B seems more efficient, but despite heavy prompting it tends to be verbose and fails at formatted responses by injecting an odd emoji or remark before/after the desired output.
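
For what it's worth, a quick way to test that kind of format adherence is to ask for JSON-only output and check whether the raw reply parses. A minimal sketch against Ollama's /api/chat endpoint, assuming a local server on the default port (the model tag and prompt are just placeholders):

    import json
    import requests

    # Ask the model for strict JSON and check whether the raw reply parses.
    # Small models often wrap it in chatter or emoji, which breaks json.loads.
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "gemma3:1b",  # placeholder tag; swap in qwen2.5:0.5b etc. to compare
            "stream": False,
            "messages": [{
                "role": "user",
                "content": 'Reply with only a JSON object of the form {"city": string, "temp_c": number} for Oslo.',
            }],
        },
        timeout=120,
    )
    raw = resp.json()["message"]["content"]
    try:
        print("clean JSON:", json.loads(raw))
    except json.JSONDecodeError:
        print("extra text around the output:", raw)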

In summary, Qwen2.5:1.5B and Llama3.2:3B were the weakest models that were still genuinely useful, and both include tool support (Gemma does not understand tools yet).
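
On the tool-support point, here is a minimal sketch of how one might probe it through Ollama's OpenAI-compatible endpoint (the get_weather tool is invented purely for the test, and the model tags are placeholders to swap around):

    from openai import OpenAI

    # Ollama exposes an OpenAI-compatible API; the api_key value is unused locally.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="qwen2.5:1.5b",  # try llama3.2:3b, gemma3:1b, ... to compare
        messages=[{"role": "user", "content": "What is the weather in Oslo right now?"}],
        tools=tools,
    )

    # Models without tool support tend to answer in plain text instead of
    # emitting a structured tool call, so this is a quick way to compare.
    msg = resp.choices[0].message
    print(msg.tool_calls or msg.content)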