| ▲ | josalhor 2 hours ago | |
We need LLM query routing at the OS level like Mobile data. I know it will sound crazy but hear me out. I think about this AI inference as infrastructure. I do not want to pay for it on every app I use it on. I do not think "I have to pay the mobile data of youtube, and the mobile data of whatsapp etc.". I pay Mobile data infrastructure and let my device route it appropiately. In fact, if we ever go the local llm route, you could have LLM capabilities without having access to the internet (or local LAN), and your OS/computer is the only one capable of doing that routing for you. | ||
| ▲ | solenoid0937 41 minutes ago | parent [-] | |
It doesn't sound crazy at all, this seems almost obvious. The OS should provide a chat completions server and the user should be able to select the underlying LLM's server. This should be just like selecting a default search engine or browser. Hopefully the EU forces US tech giants to do this. God knows Apple and Google won't do this on their own. They gotta get that sweet default provider revenue. | ||