thorum 5 days ago

The long-term picture is unlimited access to local LLMs that are better than 2025’s best cloud models and good enough for 99% of your needs, plus limited access to cloud models for when you need to bring more intelligence to bear on a problem.

LLMs will become more efficient, and GPUs, memory, and storage will keep getting cheaper and more commonplace. We’re just in the awkward early days where things are still being figured out.

pakitan 5 days ago | parent

I'm often using LLMs for stuff that requires recent data, and there's no way I'm running a web crawler in addition to my local LLM. For coding it could theoretically work, since you don't always need the latest and greatest, but it would still make me anxious.

data-ottawa 5 days ago | parent

That’s a perfect use case for MCP, though.
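
For the recent-data case, the idea is that the model doesn't crawl anything itself: the client exposes an MCP tool that fetches a page on demand, and the model calls it only when a question needs fresh data. A minimal sketch, assuming the `mcp` Python SDK (FastMCP) and `httpx` are installed; the server name, tool name, and truncation limit here are illustrative, not from any real setup:

```python
# Minimal MCP server exposing one "fetch a page" tool for a local model.
# Assumptions: `pip install mcp httpx`; "recent-data" and fetch_page are
# made-up names for illustration.
from mcp.server.fastmcp import FastMCP
import httpx

mcp = FastMCP("recent-data")

@mcp.tool()
def fetch_page(url: str) -> str:
    """Fetch a web page and return its text so the model can read recent content."""
    resp = httpx.get(url, timeout=10.0, follow_redirects=True)
    resp.raise_for_status()
    # Truncate so the result fits comfortably in a small local context window.
    return resp.text[:20_000]

if __name__ == "__main__":
    mcp.run()  # stdio transport; an MCP-aware local client launches and connects to this
```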

My biggest issue is that the local models I can run on my M1/M4 MacBook Pro aren’t smart enough to use tools consistently, and their context windows are too small for iterative use.
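
To make "use tools consistently" concrete: the app hands the model a tool schema, the model has to emit a well-formed call, and the result gets fed back for a final answer. A rough sketch of that round trip, assuming the `ollama` Python package and a locally pulled model; the model name and `get_current_time` tool are placeholders:

```python
# One tool-call round trip against a local model served by Ollama.
# Assumptions: `pip install ollama`, Ollama running, a model pulled as
# "llama3.1"; get_current_time is a made-up demo tool.
import datetime
import ollama

def get_current_time() -> str:
    """Demo tool: current local time as an ISO-8601 string."""
    return datetime.datetime.now().isoformat(timespec="seconds")

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Return the current local date and time",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [{"role": "user", "content": "What time is it right now?"}]
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# Small models often fail right here: they answer in prose or emit a
# malformed call instead of requesting the tool.
for call in response.message.tool_calls or []:
    if call.function.name == "get_current_time":
        messages.append(response.message)
        messages.append({"role": "tool", "name": call.function.name,
                         "content": get_current_time()})

final = ollama.chat(model="llama3.1", messages=messages)
print(final.message.content)
```

The context-window complaint is this same loop run many times over: every tool result lands back in the prompt, so a small window fills up quickly.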

The last year has seen a lot of improvement in small models, though (Gemma 3n is fantastic), so hopefully it’s only a matter of time.