rybosome 3 days ago

That’s funny, but open weight, local models are pretty usable depending on the task.

TeMPOraL 3 days ago | parent [-]

You're right, but that's also subject to compute costs and the time value of money. The calculus is different for companies trying to exploit language models in some way than for individuals like me, who have to feed the family before splurging on a new GPU or setting up servers in the cloud, when I can get better value by paying OpenAI or Anthropic a few dollars and using their SOTA models until those dollars run out.

FWIW, I am a strong supporter of local models, and play with them often. It's just that for practical use, the models I can run locally (RTX 4070 TI) mostly suck, and the models I could run in the cloud don't seem worth the effort (and cost).

alwayslikethis 3 days ago | parent | next [-]

For the money for a 4070 Ti, you could have bought a 3090, which, although less efficient, can run bigger models like Qwen2.5-Coder 32B. Apparently it performs quite well for code.

rjh29 3 days ago | parent | prev [-]

I guess the cost model doesn't work because you're buying a GPU that you use for about 0.1% of the day.
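A rough back-of-envelope sketch of that utilization argument, with all dollar figures as illustrative assumptions (only the "0.1% of the day" figure comes from the comment above):

```python
# Back-of-envelope: how long does a GPU take to pay for itself at ~0.1% daily use?
# All prices below are assumptions for illustration, not figures from the thread.
gpu_cost = 800.0                # assumed up-front price of a used 3090-class GPU, USD
daily_use_hours = 24 * 0.001    # ~0.1% of the day, per the comment (~1.4 minutes)
api_cost_per_hour = 2.0         # assumed effective API spend per hour of active use, USD

api_cost_per_day = daily_use_hours * api_cost_per_hour
breakeven_days = gpu_cost / api_cost_per_day

print(f"API spend per day at that usage: ${api_cost_per_day:.3f}")
print(f"Days for the GPU to break even:  {breakeven_days:,.0f}")
```

At those assumed numbers the break-even stretches into decades, which is the commenter's point: the economics only flip if usage (or API pricing) is orders of magnitude higher.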