Remix clone Hacker News

	basscodes 9 hours ago
		Hey everyone! At eno we've been using local models in our evals to benchmark our harness. With the recent releases of open-weight models like Qwen3.6, we've found that they are only a couple of points behind the cloud SOTA models we use in accuracy. We thought that was pretty incredible, so we decided to expose the provider configuration so users can bring their own local models to our application. I've been running this on an RTX 5090, which has worked quite well with Qwen models, but I've been struggling to get Gemma 4 to run reliably with this setup. Would love to hear about any other models we should try, or suggestions for which models might be best for an agentic search workflow.