zmmmmm 4 hours ago
What can it actually run? The fact that their benchmark plot refers to Llama 3.1 8B signals to me that it's hand-implemented for that model and likely can't run newer or larger models. Why else would you benchmark such an outdated model? Show me a benchmark for gpt-oss-120b or something similar.
sanxiyn 4 hours ago
Looking at their blog, they did in fact run gpt-oss-120b: https://furiosa.ai/blog/serving-gpt-oss-120b-at-5-8-ms-tpot-... I think the Llama 3 focus mostly reflects demand. It may be hard to believe, but many people aren't even aware gpt-oss exists.
| ||||||||||||||||||||||||||||||||||||||||||||
rjzzleep 3 hours ago
The fact that so many people focus solely on massive LLM models is an oversight by people narrowly focused on a tiny (but very lucrative) subdomain of AI applications.