hrmtst93837 3 days ago
Most privacy talk folds on contact with a quote. Latency and convenience beat philosophy fast once someone wants a dashboard next week, and a lot of "data sensitivity" talk is just the corporate version of buying "organic" food until the price tag shows up. If private inference is actually non-negotiable, then sure, put GPUs in your colo and enjoy the infra pain, vendor weirdness, and the meeting where finance learns what those power numbers meant.
zozbot234 2 days ago | parent
The real case for private inference is not "organic", it's "slow food". Offering slow-but-cheap inference is an afterthought for the big model providers; e.g. OpenRouter doesn't support it, not even as a way of redirecting to existing "batched inference" offerings. This is a natural opening for local AI.