accrual 16 hours ago

I wondered the same. Perhaps a local model cached on a 16GB or 24GB graphics card would perform well too. It would have to be a quantized/distilled model, but it might be sufficient, especially with some additional training as you mentioned.

jszymborski 16 hours ago | parent | next [-]

If Qwen 0.6B is suitable, then it could fit in 576MB of VRAM[0].

https://huggingface.co/unsloth/Qwen3-0.6B-unsloth-bnb-4bit
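For a rough sanity check of that figure, here is a back-of-envelope sketch of weight memory at different precisions. It assumes weights dominate VRAM and ignores KV cache, activations, and quantizer overhead (scales/zero-points), which is why the real footprint of the bnb-4bit checkpoint comes out somewhat higher than the raw weight estimate:

```python
def weight_mem_mib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory (MiB) needed just for model weights.

    n_params: total parameter count
    bits_per_param: storage precision (16 for fp16, 4 for 4-bit quant)
    """
    return n_params * bits_per_param / 8 / (1024 ** 2)

# Qwen3-0.6B has ~0.6B parameters (approximation)
fp16 = weight_mem_mib(0.6e9, 16)   # ~1144 MiB in half precision
q4 = weight_mem_mib(0.6e9, 4)      # ~286 MiB at 4-bit

print(f"fp16: {fp16:.0f} MiB, 4-bit: {q4:.0f} MiB")
```

So ~286 MiB of raw 4-bit weights plus runtime overhead lands in the same ballpark as the 576MB figure, and either way it is a tiny fraction of a 16GB card.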

otabdeveloper4 14 hours ago | parent | prev [-]

16GB is way overkill for this.