dakolli · 6 hours ago
Genuinely curious: what are you "fine-tuning" these smaller models to do reliably? I hear this talked about a lot, but very few people actually cough up examples, and I'd love to hear one.
disiplus · 5 hours ago
It depends. One example: a super small model fine-tuned to do function calling, so instead of sending the request to a big model and waiting, you ask for revenue in the last month and the small LLM emits a function call, then I show the results. Some bigger ones handle analysis, summarization, and classification. What's great with the smaller ones (I'm looking at 2B and 4B) is that you can get huge throughput with just vLLM and a couple of consumer GPUs. What I usually do is basically distillation of a big model onto a smaller one.
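The "revenue last month" flow described above can be sketched roughly as follows: the fine-tuned small model emits a structured function-call JSON rather than free-form text, and application code parses it and dispatches to a local handler. This is a minimal sketch, not the commenter's actual code; the `get_revenue` tool, the JSON schema, and the simulated model output are all assumptions for illustration.

```python
import json

# Hypothetical tool the small model is fine-tuned to call. A real
# implementation would query a database; this stub returns dummy data.
def get_revenue(period: str) -> float:
    return {"last_month": 12345.67}.get(period, 0.0)

# Registry mapping function names (as the model emits them) to handlers.
TOOLS = {"get_revenue": get_revenue}

def dispatch(model_output: str):
    """Parse the model's function-call JSON and run the matching tool locally."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated output from the fine-tuned 2B/4B model (assumed schema:
# {"name": ..., "arguments": {...}}), standing in for a real vLLM call.
raw = '{"name": "get_revenue", "arguments": {"period": "last_month"}}'
result = dispatch(raw)
print(result)
```

The point of fine-tuning here is to make the small model emit that JSON schema reliably; everything after parsing is ordinary application code, so no round trip to a large hosted model is needed.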