▲ | dcreater 7 days ago
But Ollama is a toy: it's meaningful for hobbyists and individuals like myself to use locally. Why would it be the right choice for anything more? AWS, vLLM, SGLang, etc. are the solutions for enterprise. I knew a startup that deployed Ollama on a customer's premises, and when I asked them why, they had absolutely no good reason. Likely they did it because it was easy. That's not the "easy to use" case you want to solve for.
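And it's not like those alternatives are hard to use: vLLM exposes the same OpenAI-compatible API everything else speaks. Rough sketch of what a client looks like (assumes a server started locally with "vllm serve"; the model name here is just a placeholder):

    # Sketch of a client against a vLLM OpenAI-compatible endpoint.
    # Assumes: pip install openai vllm, and a server started with
    #   vllm serve Qwen/Qwen2.5-7B-Instruct
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",  # vLLM's default port
        api_key="EMPTY",  # vLLM doesn't check the key by default
    )

    resp = client.chat.completions.create(
        model="Qwen/Qwen2.5-7B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(resp.choices[0].message.content)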
▲ | mchiang 7 days ago
Having tried many inference tools after the launch, I can say that many do not have the models implemented well, especially OpenAI's harmony format. Why does this matter? For this specific release, we benchmarked against OpenAI's reference implementation to make sure Ollama is on par. We also spent a significant amount of time getting harmony implemented the way it was intended. I know vLLM also worked hard to implement against the reference and has shared its benchmarks publicly.
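For context on what "implemented the way intended" involves: harmony is a structured chat format that has to be rendered and parsed exactly, including the channel handling. A rough sketch using OpenAI's openai-harmony bindings (API names taken from their repo's README; treat the exact signatures as approximate):

    # Sketch: rendering a conversation into harmony format.
    # Assumes: pip install openai-harmony
    from openai_harmony import (
        Conversation,
        HarmonyEncodingName,
        Message,
        Role,
        load_harmony_encoding,
    )

    enc = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)

    convo = Conversation.from_messages([
        Message.from_role_and_content(Role.USER, "Why does harmony matter?"),
    ])

    # Token ids to prefill the model with for a completion.
    prefill = enc.render_conversation_for_completion(convo, Role.ASSISTANT)

    # After sampling, completion tokens parse back into structured
    # messages (analysis/final channels) -- the step implementations
    # often get subtly wrong:
    # parsed = enc.parse_messages_from_completion_tokens(
    #     completion_tokens, Role.ASSISTANT)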
▲ | jnmandal 7 days ago
Honestly, I think it just depends. A few hours ago I wrote that I would never want it in a production setting, but actually, if I were standing something up myself, I could just download headless ollama and know it would work. Hey, that would most likely be fine too. Maybe later on I'd revisit it from a devops perspective and refactor the deployment methodology/stack, etc. Maybe I'd benchmark it and realize it's actually fine. Sometimes you just need to make your whole system work. We can obviously disagree with their priorities, their roadmap, the fact that the client isn't FOSS (I wish it were!), etc., but no one can say that ollama doesn't work. It works. And like mchiang said above: it's dead simple, on purpose.
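To be concrete, "know it would work" means something like this against the stock REST API (sketch; assumes "ollama serve" is running on the default port and a model has already been pulled, model name is a placeholder):

    # Minimal sketch against a headless Ollama server's REST API.
    # Assumes: ollama serve is running, and `ollama pull llama3.1`
    # has been done; swap in whatever model you actually use.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",  # placeholder model name
            "prompt": "Why is the sky blue?",
            "stream": False,  # one JSON object instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])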
| |||||||||||||||||||||||