andyyyy64 2 hours ago

Fair question. llmfit answers "will this model fit in my memory?" — it's a fit/size calculator, and a good one. whichllm answers a different question: "of the models that fit, which is actually best?" It pulls candidates, then ranks them by merged real benchmarks (LiveBench / Artificial Analysis / Aider / Arena ELO / Open LLM Leaderboard) with a recency penalty, so a newer 27B beats an older 32B even though both fit — on a 24GB card it puts Qwen3.6-27B above Qwen3-32B on benchmarks, not size.
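(For the curious, a minimal sketch of that kind of ranking — merged, normalized benchmark scores discounted by model age. The field names, half-life, and decay shape here are my assumptions, not whichllm's actual formula:)

```python
# Hypothetical sketch: average normalized benchmark scores, then apply an
# exponential recency penalty so a newer model can outrank a slightly
# higher-scoring older one. All names/parameters are illustrative.
from dataclasses import dataclass
from datetime import date


@dataclass
class Model:
    name: str
    scores: dict[str, float]  # benchmark name -> normalized 0-100 score
    released: date


def rank_score(m: Model, today: date, half_life_days: float = 365.0) -> float:
    base = sum(m.scores.values()) / len(m.scores)  # merged benchmark score
    age_days = (today - m.released).days
    penalty = 0.5 ** (age_days / half_life_days)   # halves per year of age
    return base * penalty


if __name__ == "__main__":
    today = date(2025, 6, 1)
    newer = Model("newer-27B", {"livebench": 70, "aider": 70}, date(2025, 4, 1))
    older = Model("older-32B", {"livebench": 75, "aider": 75}, date(2023, 6, 1))
    ranked = sorted([newer, older], key=lambda m: rank_score(m, today), reverse=True)
    print(ranked[0].name)
```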

If "biggest that fits" is the answer you want, llmfit is the simpler tool and Python won't matter to you. If you want "which fitting model is worth running," that ranking layer is the whole reason whichllm exists. Different jobs — I'd genuinely send fit-only users to llmfit.

entrope an hour ago | parent | next

Your LLM should have bothered to notice that llmfit also has quality scores (and defaults to sorting by them). One might quibble about weighting of factors -- llmfit favors Qwen3.6-35B-A3B over Qwen3.6-27B, whereas I found the quality of the latter to be worth waiting for -- but it absolutely ranks models by quality.

karmakaze an hour ago | parent

I thought that Qwen3.6-35B-A3B was notably missing from whichllm output too.

peterdsharpe an hour ago | parent | prev | next

Seriously, going to use AI to write a reply that would take you 30 seconds?

slices an hour ago | parent | prev | next

AI response

karmakaze an hour ago | parent | prev

[flagged]