Remix clone Hacker News

new | show | ask | jobs Github

	▲	hn_acc1 8 hours ago
		What about RTX 3080? Too little VRAM?
	▲	roadside_picnic 8 hours ago \| parent [-]
		In addition to models getting better, the quantization methods have also got much better. If you already have an RTX 3080 it's absolutely worth the time to just mess around and see how it does, experiment with different quants that fit in your VRAM. If you're purchasing I would recommend coughing up the extra cash for the 3090. If you are experimenting it's worth mentioning that the harness/tooling is very important to getting a solid experience. Herme's agent is great for running helpful agents and OpenWeb UI can get really make the experience feel on par with paid chat interfaced. A reasonable halfway step is to pay for an open model through the provider or open router. You'll get many of the benefits (especially around pricing) without needing to shell out on hardware before deciding if you like the way these models work.