Ask HN: Not much interest in offline CPU LLMs that may kill SaaS?
2 points by adinhitlore 12 hours ago | 2 comments

Hypothetically speaking, a model file under 4 GB can be very competitive; the rest is just interface, like a desktop app. This could be:
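For a sense of scale, the sub-4 GB figure is plausible: a 7B-parameter model quantized to roughly 4 bits per weight lands around 3.5 GB. This is illustrative back-of-the-envelope arithmetic (the parameter count and bit width are assumptions, not a measurement of any specific model):

```python
# Rough on-disk size estimate for a quantized LLM checkpoint.
# Assumed numbers: 7 billion parameters, 4-bit quantization.
params = 7_000_000_000   # 7B-parameter model (assumption)
bits_per_weight = 4      # 4-bit quantization (assumption)

# bits -> bytes -> gigabytes (decimal GB, as disk sizes are usually quoted)
size_gb = params * bits_per_weight / 8 / 1e9
print(f"{size_gb:.1f} GB")  # 3.5 GB
```

Real quantized checkpoints add some overhead (embeddings, scales, metadata), but this is why 7B-class models fit comfortably under the 4 GB mark.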

1. Free. 2. Solves the privacy issue, since you communicate with your chatbot offline; lack of privacy is a huge issue among end users. 3. Crazy, but even in 2026 people don't always have internet everywhere.

This turns the LLM industry from a SaaS subscription into a "download GTA 4" kind of business model, like a Steam video game business.

I know there are many open source models, but they're designed as "build it yourself and train it on your corpus" kinds of solutions... they're for professionals, not ready-to-use products.

Or even better... you can follow the money (Bill Gates's playbook back in the day): contact big companies like Intel, HP, Lenovo, etc., to embed your AI into their hardware, so they market it as a "free offline AI assistant" while your contract with the company brings you millions. The vast majority of people have no idea what "45% on Humanity's Last Exam" means, so even if your model isn't Gemini 3.1 Pro it will be considered a plus; and if it is, even better, since people get something better without the hassle of paying online to sites they didn't know existed yesterday.

It's also an option for IoT devices like watches, scooters, cars, even the fridge. Again, I'm sure this is a thing, but almost everyone these days relies on SaaS: even if you download the ChatGPT app or use Copilot on Windows, an internet connection is needed and the model is server-side, not on your machine.

GeoSys a few seconds ago | parent | next [-]

I think there are a few problems, unfortunately:

- It wouldn't quite work on mobile devices; many people wouldn't want to download 4 GB onto their iPhone.

- Many use cases involve real-time data, e.g. summaries of the latest news and events. That would require the LLM to be updated often and to perform actions like Google searches.

- Switching devices loses the history, created artefacts, etc.

While I think there are use cases, IMHO they would be mostly for tech experts and not the wider audience.

KetoManx64 5 hours ago | parent | prev [-]

I think there's not much interest currently because of how inexpensive online LLMs are. If it costs me $0.001 per message, what's the point of running a model locally, other than working on something I don't want logged? Once the AI companies start running out of money and raise their prices, there will be a large migration of users and companies wanting to self-host.