Jackobrien 9 hours ago
I see a world soon where there’s an extremely wide variety of small models for speculative decoding, unique to use cases, companies, and even individuals.
nicce 9 hours ago
Hopefully that is the case and hardware does not become impossible to get.
pydry 9 hours ago
Yes, heavily constrained by sophisticated guardrails. This is definitely where things are going. The enormous "eat the world" models have extreme diminishing returns by comparison.
Der_Einzige 4 hours ago
You clearly didn't read the recent speculative decoding papers, because it's been possible to use any model to speculate for any other model for a while. They solved the tokenization problems that prevented this in the past.