Remix clone Hacker News

new | show | ask | jobs Github

	▲	bravetraveler 4 hours ago
		I'm largely 'all natural', any of my little LLM usage is local. 128G Strix system, a not-super-dense Qwen or Gemma variant will get 50-80 tok/s output. Not subscribing to Anthropic/OpenAI/etc even in the unlikely event these are the last local models released; simply not needed. Entirely fine without and in-model tool usage covers my currency concerns.