acters 9 days ago
Personally, I think bigger companies should be more proactive and work with the popular inference engine devs to get their special snowflake LLMs working before release. I guess it is all very much experimental at the end of the day. Those devs are doing God's work so the rest of us can run these models on budget-friendly hardware.

mutkach 9 days ago
This is a good take, actually. GPT-OSS is not much of a snowflake (judging by the model's architecture card, at least), but TRT-LLM treats every model like one: there is too much hardcoding, which makes it very difficult to use out of the box for the hottest SotA thing.

diggan 9 days ago
This is literally what they did for GPT-OSS: it seems there was coordination with OpenAI to get it supported on day 1.

eric-burel 9 days ago
SMEs are starting to want local LLMs, and it's a nightmare to figure out what hardware will work for which models. I am asking devs in my hometown to literally visit their installs to figure out combos that work.
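
A rough first-pass filter before anyone drives out to an install: weights times bytes per parameter, plus an overhead allowance for KV cache and activations. A minimal Python sketch, where the per-quant byte counts and the flat overhead fraction are rough assumptions on my part, not vendor-published numbers:

    # Back-of-envelope VRAM estimate for a dense decoder-only model.
    # Quant byte counts and overhead fraction are rough assumptions.

    def estimate_vram_gb(params_b: float, bytes_per_param: float,
                         overhead_frac: float = 0.2) -> float:
        """VRAM to hold the weights plus runtime overhead, in GB.

        params_b        -- parameter count in billions (7 for a 7B model)
        bytes_per_param -- 2.0 for fp16/bf16, ~0.55 for 4-bit quants
        overhead_frac   -- allowance for KV cache, activations, buffers
        """
        weights_gb = params_b * bytes_per_param  # 1e9 params * bytes / 1e9
        return weights_gb * (1.0 + overhead_frac)

    # A 7B model: ~16.8 GB at fp16, ~4.6 GB at a 4-bit quant.
    for label, bpp in [("fp16", 2.0), ("4-bit", 0.55)]:
        print(label, round(estimate_vram_gb(7.0, bpp), 1), "GB")

That only tells you whether the weights fit at all; long contexts blow the KV cache well past a flat 20% overhead, which is exactly the kind of thing that still has to be tested on the actual box.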