diggan 9 days ago
Maybe I'm spoiled by having a great internet connection, but I usually download the weights and try to run them with various tools (typically llama.cpp, LM Studio, vLLM and SGLang) and see what works. There seem to be so many variables involved (runners, architectures, implementations, hardware and so on) that none of the calculators I've tried so far have been accurate: they've both over-estimated and under-estimated what I could run. So in the end, actually trying to run them seems to be the only foolproof way of knowing for sure :)