▲ icedchai | 2 hours ago
The quality of local models is still abysmal compared to commercial SOTA models. You're not going to run something like Gemini or Claude locally. I have some "serious" hardware with 128G of VRAM, and the results are still laughable. Even moving up to 512G wouldn't be enough. You need serious hardware to get both quality and speed, and if I can only get "quality" at a couple of tokens per second, it's not worth bothering. They're getting better, but that doesn't mean they're good.
▲ _aavaa_ | 2 hours ago | parent
Good by what standard? Compared to today's SOTA? No, they're not. But they're better than the SOTA of 2020, and likely 2023. We have a magical pseudo-thinking machine that we can run locally, completely under our control, and instead the goalposts have moved to "but it's not as fast as the proprietary cloud".