| ▲ | WhitneyLand 6 hours ago | |||||||
Yeah important conceptually to remember MTP is kind of just more weights, but speculative decoding is the runtime algorithm that’s a significant add to whatever code is serving the model. | ||||||||
| ▲ | HumanOstrich 5 hours ago | parent [-] | |||||||
That is.. inaccurate. | ||||||||
| ||||||||