| ▲ | MarsIronPI 9 hours ago | |
If you want to still use APIs, I like OpenRouter because I can use the same credits across various models, so I'm not stuck with a single family of models. (Actually, you can even use the proprietary models on OpenRouter, but they're eye-wateringly expensive.) Otherwise you should look into running e.g. Qwen3.5-35B-A3B or Qwen3.5-27B on your own computer. They're not Opus-level but from what I've heard they're capable for smaller tasks. llama.cpp works well for inference; it works well on both CPU and GPUs and even split across both if you want. | ||