Remix.run Logo
milgrum 4 days ago

How many TPS do you get running GPT OSS 120b on the 395+? Considering a Framework desktop for a similar use case, but I’ve been reading mixed things about performance (specifically with regards to memory bandwidth, but I’m not sure if that’s really the underlying issue)

data-ottawa 3 days ago | parent [-]

30-40 at 64k context, but it's a mixture of experts model.

A 70b dense model is slower

Qwen coder 30b Q4 runs 40+.