kristianp 2 hours ago

I know it's just a quick test, but Llama 3.1 is getting a bit old. I would have liked to see a newer model that can fit, such as gpt-oss-120b (gpt-oss-120b-mxfp4.gguf), which is about 60 GB of weights (1).

(1) https://github.com/ggml-org/llama.cpp/discussions/15396

eurekin an hour ago | parent [-]

Correct: most of r/LocalLlama has moved on to next-gen MoE models. DeepSeek introduced a few good optimizations that every new model now seems to use. Llama 4 was generally seen as a fiasco, and Meta hasn't made a release since.

fragmede 18 minutes ago | parent [-]

What are some of the models people are actually using? (Rather than just naming the ones they aren't.)