bheadmaster 11 hours ago

> And if you have a local app, how do you take a dependency on whatever random model is installed?

Why not ship your own model? In the age of Electron apps, 10GB+ apps are not unheard of.

_heimdall 11 hours ago | parent | next [-]

Personally I wouldn't want a couple dozen apps installed all with their own model.

It seems easier to have industry specs that define a common interface for local models.

I also assume the OS can, or would need to, be involved in providing the models. That may not be a good thing depending on your views of OS vendors, but sharing a single local model does seem more like an OS concern.

alex7o 11 hours ago | parent [-]

I mean the OpenAI API is the industry standard for allowing apps to communicate with models: llama-server has it, MLX has it, Ollama has it, vLLM has it, and LM Studio as well. I don't think this is such a hard thing to do, but it requires people to set it up.
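To make the claim concrete: the OpenAI-compatible chat completions surface those servers share is small enough to sketch. This is a minimal example assuming a local server (llama-server, Ollama, vLLM, or LM Studio) is listening on localhost; the port and model name here are placeholders, not real defaults for any particular app.

```python
import json
import urllib.request

# Placeholder endpoint -- llama-server, Ollama, vLLM, and LM Studio all
# expose a /v1/chat/completions route, but on different default ports.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-compatible chat completion payload.

    The same request body shape works against any of the servers named
    above; many local servers ignore or remap the "model" field.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

payload = build_request("Summarize this document in one sentence.")
req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would return a JSON body whose reply text
# lives at choices[0].message.content, the same shape as OpenAI's API.
print(json.dumps(payload, indent=2))
```

Because the request and response shapes are identical across implementations, an app only needs the base URL configured; which engine actually serves the model is invisible to it.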

_heimdall 11 hours ago | parent [-]

I don't know enough about that API surface to know if it's a particularly good one for the use cases we'd have. But yes, defining a universal spec for all implementers to support wouldn't be a big lift, and it's done in plenty of other areas already.

alex7o 11 hours ago | parent | prev [-]

There is no other way than shipping your own model, because you will want an abstracted API over the inference, and you don't know what the user has installed. Also, you could ship a 9B FP4 model, but it all just depends.

_heimdall 11 hours ago | parent | next [-]

Knowing what's installed would have to be an OS API, with LLMs providing a standard API surface to the OS, likely including metadata related to feature support.

LPisGood 11 hours ago | parent | prev [-]

You can know what the user has installed if the OS developer offers something.
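No OS ships the discovery API the last two comments describe; this is purely a hypothetical sketch of what per-model feature metadata might look like if one did. Every name here (`InstalledModel`, `query_os_models`, the example model) is invented for illustration.

```python
# Entirely hypothetical: a stand-in for an OS-level registry that tells
# an app which local models are installed and what they can do.
from dataclasses import dataclass

@dataclass
class InstalledModel:
    name: str             # hypothetical identifier
    context_length: int   # feature metadata an app could key off
    quantization: str     # e.g. "fp4", matching the 9B FP4 example above
    supports_tools: bool

def query_os_models() -> list[InstalledModel]:
    """Stand-in for the hypothetical OS call; returns canned data."""
    return [
        InstalledModel("example-9b", 8192, "fp4", supports_tools=False),
    ]

# With metadata like this, an app selects by required features rather
# than by depending on whatever random model happens to be installed.
capable = [m for m in query_os_models() if m.context_length >= 4096]
print([m.name for m in capable])
```

The point of the sketch is the selection step at the end: feature metadata lets the app degrade gracefully (or fall back to its own bundled model) instead of hard-coding a dependency on one specific model.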