jszymborski | 16 hours ago
I wonder how well suited some of the smaller LLMs like Qwen 0.6B would be to this... it doesn't sound like a super complicated task. I also feel like you could train a model on this task by using the zero-shot output of larger models to create a dataset, making something very zippy.
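The distillation idea above can be sketched in a few lines. This is a toy illustration, not a real pipeline: `large_model_label` is a hypothetical stand-in for whatever larger model you'd query zero-shot, and the task/label names are made up.

```python
import json

# Hypothetical stand-in for the larger "teacher" model's zero-shot call;
# in practice this would be an API request or a local inference call.
def large_model_label(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"

def build_distillation_dataset(texts, path):
    """Label raw texts with the teacher model and write them out as
    JSONL, ready for fine-tuning a small student model like Qwen 0.6B."""
    records = [{"text": t, "label": large_model_label(t)} for t in texts]
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
    return records

records = build_distillation_dataset(
    ["The tool works really good", "It crashed twice"],
    "distill_train.jsonl",
)
```

The student never sees the teacher at inference time, which is where the "zippy" part comes from.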
accrual | 16 hours ago | parent
I wondered the same. Perhaps a local model loaded onto a 16 GB or 24 GB graphics card would perform well too. It would have to be a quantized/distilled model, but maybe sufficient, especially with some additional training as you mentioned.
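A quick back-of-the-envelope check that quantized models fit in that much VRAM. This only counts weights plus a flat overhead factor for KV cache and activations; the 20% overhead figure is an assumption, and real usage varies with context length and runtime.

```python
def vram_gb(params_billion: float, bits_per_weight: int,
            overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight bytes times an assumed 20% overhead
    for KV cache and activations."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at 4-bit quantization: ~3.5 GB of weights, ~4.2 GB total,
# comfortably inside a 16 GB card.
print(round(vram_gb(7, 4), 1))   # 4.2

# The same model unquantized at 16-bit needs ~16.8 GB, which is why
# quantization matters at this card size.
print(round(vram_gb(7, 16), 1))  # 16.8
```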