ekidd | 2 days ago
One easy way to test different models is to purchase $20 worth of tokens from one of the OpenRouter-like sites. This will let you ask tons of questions and try out lots of models.

Realistically, the biggest models you can run locally at a reasonable price right now are quantized versions of things like the Qwen3 30B A3B family. A 4-bit quantized version fits in roughly 15GB of RAM, which will run very nicely on something like an Nvidia 3090. But you can also use your regular RAM (though it will be slower).

These models aren't competitive with GPT-5 or Opus 4.5! But they're mostly all noticeably better than GPT-4o, some by quite a bit. Some of the 30B models will run as basic agentic coders. There are also some great 4B to 8B models from various organizations that will fit on smaller systems. An 8B model, for example, can be a great translator.

(If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)
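The 15GB figure can be sanity-checked with back-of-the-envelope arithmetic: parameter count times bits per weight. A minimal sketch (the 20% overhead factor is an assumption to account for KV cache and runtime buffers, not an exact figure):

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 0.2) -> float:
    """Rough memory footprint of a quantized model's weights.

    params_billion * 1e9 weights, each stored in bits_per_weight bits,
    inflated by an assumed overhead fraction for KV cache etc.
    """
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total * (1 + overhead) / 1e9

# A 30B model at 4 bits: 30e9 * 4 / 8 = 15e9 bytes, i.e. ~15 GB of weights.
print(round(quantized_size_gb(30, 4, overhead=0.0), 1))  # → 15.0
```

With a modest overhead allowance the same model wants closer to 18GB, which is why these quants are usually paired with a 24GB card like the 3090.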
nl | 2 days ago
I wrote https://tools.nicklothian.com/llm_comparator.html so you can compare different models. OpenRouter gives you $10 of credit when you sign up; stick your API key in and compare as many models as you want. It all lives in browser local storage.
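OpenRouter exposes an OpenAI-compatible chat completions endpoint, so a side-by-side comparison can also be scripted. A minimal stdlib-only sketch (the model identifiers are examples; check openrouter.ai for current names, and supply your own key via `OPENROUTER_API_KEY`):

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str) -> dict:
    # Same payload shape as the OpenAI chat API.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def ask(model: str, prompt: str, api_key: str) -> str:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(model, prompt)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        # Example model names -- substitute whatever you want to compare.
        for model in ["qwen/qwen3-30b-a3b", "openai/gpt-oss-120b"]:
            print(model, "->", ask(model, "Say hello in French.", key))
```

Because every model sits behind the same endpoint, swapping the `model` string is all it takes to compare them on the same prompt.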
kouteiheika | 2 days ago
> (If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)

Don't need patience for these, just money. A single RTX 6000 Pro runs those great and super fast.
cmrdporcupine | 2 days ago
This is the answer. There are a half dozen sites that let you run these models by the token, and honestly $20 is excessive: $5 will get you a long, long way.