mattnewton 5 days ago
There are many different inference libraries, and it's not yet clear which ones a small company like Mistral should back, IMO. They do release high-quality inference code, e.g. https://github.com/mistralai/mistral-inference
bastawhiz 5 days ago
There's more to it, though. The inference code you linked to is Python. Unless my software is also Python, I have to ship a CPython runtime to run the inference code, then wire it up (or port it, if you're feeling spicy).

Ollama brings value by exposing an API (literally over sockets) with many client SDKs, and you don't even need the SDKs to use it effectively. If you're writing Node or PHP or Elixir or ClojureScript or whatever else you enjoy, you're probably covered; see the sketch below for what that looks like. It also means you can swap models trivially, since you're using essentially the same API for each one, and you never need to worry about dependency hell or the issues involved in hosting more than one model at a time.

As far as I know, Ollama is really the only solution that does this. Or at the very least, it's the most mature.
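To make the "no SDK needed" point concrete, here's a minimal sketch in TypeScript (Node 18+, where fetch is built in). It assumes Ollama is running locally on its default port 11434 and uses its documented /api/generate route; the model name "mistral" is just an example of whatever you've pulled.

    // Talk to a locally running Ollama server over plain HTTP -- no client SDK.
    async function generate(model: string, prompt: string): Promise<string> {
      const res = await fetch("http://localhost:11434/api/generate", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        // stream: false returns one JSON object instead of a JSON-lines stream
        body: JSON.stringify({ model, prompt, stream: false }),
      });
      const data = await res.json();
      return data.response; // the generated text
    }

    // Swapping models is just a different string; the call shape doesn't change.
    generate("mistral", "Why is the sky blue?").then(console.log);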