mashygpig 2 hours ago

It's fun to run a model locally, but I don't think the economics make sense for anyone just trying to use models atm. It's absurdly cheap to use the same model via openrouter in comparison.

Seriously, just put $10 into openrouter and play with models that are cheap but bigger than what you'd reasonably be able to run locally, like deepseek v4 flash (unquantized). You'll be surprised by how far that $10 goes for a model better than what you'd be able to run. It goes even further on the model you would be able to run locally. Then think of how long it would take for API usage to match the cost of spend + power on doing it locally...
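To make that comparison concrete, here's a rough break-even sketch. Every number in it is a made-up placeholder, not a real price: plug in your own hardware cost, monthly API spend, and power bill. The function name and parameters are just for illustration.

```python
def breakeven_months(hardware_cost, api_spend_per_month, local_power_per_month):
    """Months until cumulative API spend catches up with hardware cost plus local power.

    Returns infinity when running locally costs at least as much per month
    in power as the API costs in credits, since the hardware never pays off.
    """
    monthly_saving = api_spend_per_month - local_power_per_month
    if monthly_saving <= 0:
        return float("inf")
    return hardware_cost / monthly_saving


# Hypothetical inputs: a $2000 machine, $10/month of API credits,
# and $5/month of extra electricity when running locally.
months = breakeven_months(
    hardware_cost=2000.0,
    api_spend_per_month=10.0,
    local_power_per_month=5.0,
)
print(f"break-even after {months:.0f} months")
```

This ignores resale value, hardware you already own, and the fact that API prices change, so treat it as a back-of-the-envelope estimate only.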

Saris an hour ago

Even with deepseek v4 flash I burned through $5 in credits in a day just playing around with Hermes, and qwen 3.6 35B is significantly more expensive.

I can run qwen 3.6 35B on my gaming PC at around 50 tok/s, and other than a tiny bit of extra power cost per month, it's hardware I already owned from years ago.
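For a rough sense of what "a tiny bit extra" in power means per token, here's a sketch that turns generation speed into electricity cost per million tokens. The 50 tok/s is from above; the 300 W draw and $0.15/kWh rate are assumptions for illustration, not measurements.

```python
def local_cost_per_million_tokens(tokens_per_second, watts, price_per_kwh):
    """Electricity cost of generating one million tokens on local hardware."""
    seconds = 1_000_000 / tokens_per_second   # time to generate 1M tokens
    kwh = watts * seconds / 3600 / 1000       # watt-seconds -> kWh
    return kwh * price_per_kwh


# 50 tok/s from the comment above; 300 W and $0.15/kWh are assumed values.
cost = local_cost_per_million_tokens(
    tokens_per_second=50, watts=300, price_per_kwh=0.15
)
print(f"${cost:.2f} per million tokens")  # $0.25 with these assumed inputs
```

That number is only the electricity for generation, so it leaves out idle draw and prompt processing time, and it's directly comparable to an API's per-million-token price only if the hardware is already paid for.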

I'm not really sure why qwen 3.6 35B is so expensive on openrouter; the price seems abnormally high for the hardware it takes to run it.

SchemaLoad an hour ago

Agreed, I'm waiting for the time when 48GB+ of RAM is just the standard that computers come with rather than being the absolute top tier option. It just doesn't make sense to spend extra on a local AI computer right now when the same money would cover a decade of API usage.