jszymborski 16 hours ago

I wonder how well suited some of the smaller LLMs like Qwen3 0.6B would be to this... it doesn't sound like a super complicated task.

I also feel like you could train a model on this task by using the zero-shot outputs of a larger model to create a dataset, making something very zippy.
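
Something like this, roughly (just a sketch, untested; assumes an OpenAI-compatible endpoint, and "tasks.txt", the teacher model name, and the output path are all placeholders):

    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    with open("tasks.txt") as f, open("distill.jsonl", "w") as out:
        for line in f:
            prompt = line.strip()
            if not prompt:
                continue
            # ask the larger "teacher" model zero-shot
            resp = client.chat.completions.create(
                model="gpt-4o",  # placeholder; any strong model works
                messages=[{"role": "user", "content": prompt}],
            )
            # store (prompt, completion) pairs to fine-tune the small "student" on
            out.write(json.dumps({
                "prompt": prompt,
                "completion": resp.choices[0].message.content,
            }) + "\n")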

accrual 16 hours ago | parent

I wondered the same. Perhaps a local model loaded on a 16GB or 24GB graphics card would perform well too. It would have to be a quantized/distilled model, but maybe sufficient, especially with some additional training as you mentioned.

jszymborski 16 hours ago | parent | next

If Qwen3 0.6B is suitable, then it could fit in 576MB of VRAM[0].

https://huggingface.co/unsloth/Qwen3-0.6B-unsloth-bnb-4bit
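
As a rough sketch of what loading it looks like (untested; assumes transformers + bitsandbytes on a CUDA GPU, and the prompt is just a placeholder):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "unsloth/Qwen3-0.6B-unsloth-bnb-4bit"
    tok = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" picks up the 4-bit quantization config baked into the repo
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tok("Placeholder prompt for the task", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))
    print(f"{torch.cuda.memory_allocated() / 2**20:.0f} MiB allocated")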

otabdeveloper4 13 hours ago | parent | prev

16GB is way overkill for this.