usagisushi 3 hours ago

If by "loop" you mean the endless reasoning cycle ("Wait, actually... On second thought..."), try setting a reasoning budget. With llama.cpp, `--reasoning-budget 1024 --reasoning-budget-message "Proceed to final answer."` caps the thinking phase and forces the model to commit to a conclusion.
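For context, a full server invocation might look like the sketch below. The flags are the ones mentioned above; the model filename and port are placeholders, not a recommendation.

```shell
# Serve a local model with a capped reasoning budget (llama.cpp's llama-server).
# Once ~1024 thinking tokens are spent, the budget message is injected to
# nudge the model out of the "Wait, actually..." cycle.
llama-server \
  -m ./models/your-model.gguf \
  --port 8080 \
  --reasoning-budget 1024 \
  --reasoning-budget-message "Proceed to final answer."
```

The budget message matters as much as the number: a blunt instruction like "Proceed to final answer." tends to work better than a polite one, since the model treats it as part of its own chain of thought.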

I admit I sometimes get caught up in the tooling for its own sake, but I find local models useful for specific tasks like migrating configuration schemas, writing homelab scripts, or exploring financial data.

It might sound a bit paranoid, but privacy is another major driver for me. Keeping credentials and private information off cloud services is worth the extra friction.