milgrum | 4 days ago |
How many TPS do you get running GPT-OSS 120B on the 395+? I'm considering a Framework Desktop for a similar use case, but I've been reading mixed things about performance (specifically with regard to memory bandwidth, though I'm not sure that's really the underlying issue).
data-ottawa | 3 days ago | parent |
30-40 tok/s at 64k context, but it's a mixture-of-experts model, so only a fraction of the weights are read per token; a 70B dense model is slower. Qwen Coder 30B at Q4 runs at 40+.
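The MoE-vs-dense distinction above matters because decode speed is largely memory-bandwidth bound: each generated token requires reading (roughly) the active weights once. A minimal back-of-the-envelope sketch, where the bandwidth figure, active-parameter count, and bytes-per-parameter values are all illustrative assumptions, not measured specs:

```python
# Rough decode-speed ceiling for a bandwidth-bound LLM:
# TPS <= memory bandwidth / bytes read per token (active weights only for MoE).
# Ignores KV-cache reads, activations, and compute overhead, so real numbers
# come in well below this bound.
def max_decode_tps(bandwidth_gbs: float,
                   active_params_b: float,
                   bytes_per_param: float) -> float:
    """Upper-bound tokens/sec given GB/s, billions of active params, bytes/param."""
    return (bandwidth_gbs * 1e9) / (active_params_b * 1e9 * bytes_per_param)

# Assumed figures (hypothetical, for illustration only):
# ~256 GB/s for a Strix Halo class APU, ~5B active params for a 120B MoE
# at ~0.6 bytes/param (4-bit-ish), vs a dense 70B at Q4 (~0.5 bytes/param).
moe_ceiling = max_decode_tps(256, 5.0, 0.6)    # ~85 tok/s ceiling
dense_ceiling = max_decode_tps(256, 70.0, 0.5)  # ~7 tok/s ceiling
print(round(moe_ceiling, 1), round(dense_ceiling, 1))
```

The gap between the two ceilings is why a 120B MoE can decode faster than a 70B dense model on the same machine, and why observed numbers (30-40 tok/s) sit comfortably under the MoE bound once KV-cache traffic at 64k context is accounted for.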