prettyblocks 8 hours ago
I think the biggest case for fine-tuning is probably that you can take small models, fine-tune them for applications that require structured output, and then run cheap inference at scale. "Frontier LLMs can do it with enough context" is not really a strong argument against fine-tuning, because they're expensive to run.
andriy_koval 2 hours ago
> "Frontier LLMs can do it with enough context" is not really a strong argument against fine-tuning, because they're expensive to run. I am not expert in this topic, but I am wondering if large cached context is actually cheap to run and frontier models would be cost efficient too in such setting? | ||||||||||||||||||||||||||||||||
_the_inflator 3 hours ago
I agree. Also, for certain use cases there are constraints like embedded hardware systems with no internet access. These LLMs have to be trained to specialize in clearly defined use cases under hardware constraints. Frontier LLMs also rarely function in isolation; instead they orchestrate a system of specialized units, aka subsystems and agents. While costs and effort are one thing, being able to downsize these monster LLMs through fine-tuning in the first place is extremely valuable.
faxmeyourcode 6 hours ago
Especially for super constrained applications. I don't care if the language model that I use for my extremely specific business domain can solve PhD-level math or remember the works of Shakespeare. I'd trade all of that for pure task-specific accuracy.
derwiki 8 hours ago
Exactly, inference cost is a very good reason to fine-tune with something like Qwen.
Me1000 7 hours ago
Wouldn't it be better to use a grammar in the token sampler? Tuning is fine, but it doesn't guarantee syntactically correct structured output. If the sampler is grammar-aware, it can.
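For illustration, here is a minimal sketch of what grammar-aware sampling does at the logit level, with a toy vocabulary and a made-up grammar (output must look like {"level": <digits>}). This is not any particular library's API; real implementations (e.g. GBNF grammars in llama.cpp, or JSON-schema-constrained decoding) apply the same masking over the model's full vocabulary.

    # Toy sketch of grammar-constrained sampling: tokens that would break the
    # target structure get their logits masked to -inf, so only syntactically
    # valid continuations can ever be sampled.
    import math
    import random
    import re

    # Made-up mini vocabulary; a real sampler works over the model's full vocab.
    VOCAB = ['{"level": ', "1", "2", "3", "}", "hello", " world"]

    def is_valid_prefix(text: str) -> bool:
        """True if `text` can still be extended into {"level": <digits>}."""
        head = '{"level": '
        if len(text) <= len(head):
            return head.startswith(text)
        return text.startswith(head) and re.fullmatch(r"\d+\}?", text[len(head):]) is not None

    def constrained_sample(partial: str, logits: list[float]) -> int:
        """Mask grammar-violating tokens, softmax the rest, sample one token id."""
        masked = [
            logit if is_valid_prefix(partial + tok) else float("-inf")
            for tok, logit in zip(VOCAB, logits)
        ]
        weights = [math.exp(x) for x in masked]   # exp(-inf) == 0.0
        return random.choices(range(len(VOCAB)), weights=weights, k=1)[0]

    # Usage: pretend these logits came from a small (possibly fine-tuned) model.
    partial = '{"level": 4'
    fake_logits = [0.1, 2.0, 1.5, 0.3, 1.0, 3.0, 2.5]
    print(VOCAB[constrained_sample(partial, fake_logits)])  # a digit or "}", never "hello"

Because forbidden tokens get probability zero before sampling, the output is syntactically valid by construction, regardless of how the model was tuned.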
butILoveLife 8 hours ago
This is literally what I'm waiting for. I want a ~8B model that works well with OpenClaw.
throwaway6977 8 hours ago
I agree. I'm currently trying to learn how I can embed a fine-tuned tiny model into my C++ game so it can provide a prose narrative of certain game-event logs. It needs to be as small as possible so it doesn't take resources away from the running game.