hasperdi 5 hours ago
…and can be faster if you can get an MoE model of that…
dormento 4 hours ago
"Mixture-of-experts", AKA "running several small models and activating only a few at a time". Thanks for introducing me to that concept. Fascinating. (commentary: things are really moving too fast for the layperson to keep up) | |||||||||||||||||||||||
miohtama 3 hours ago
All modern models are MoE already, no? | |||||||||||||||||||||||
bigyabai 4 hours ago
>90% of inference hardware is faster if you run an MOE model. | |||||||||||||||||||||||