leopoldj 6 days ago
For production use of open-weight models I'd use something like Amazon Bedrock, Google Vertex AI (which uses vLLM), or on-prem vLLM/SGLang. But for a quick assessment of a model as a developer, Ollama Turbo looks appealing. I find GCP incredibly user-hostile and a nightmare to navigate, especially around quotas.
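
To illustrate the quick-assessment path: vLLM's server and Ollama both expose OpenAI-compatible endpoints, so the same few lines of client code cover on-prem vLLM and a local Ollama install (Ollama Turbo would just swap in the hosted URL plus an API key). This is only a minimal sketch; the ports, base URLs, and model names below are my own assumptions.

    # pip install openai -- talks to the OpenAI-compatible APIs both servers expose
    from openai import OpenAI

    # On-prem vLLM, e.g. started with: vllm serve Qwen/Qwen2.5-7B-Instruct
    vllm = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

    # Local Ollama; Ollama Turbo would use the hosted endpoint and a real API key
    ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    for name, client, model in [
        ("vllm", vllm, "Qwen/Qwen2.5-7B-Instruct"),
        ("ollama", ollama, "qwen2.5:7b"),
    ]:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Say hello in one sentence."}],
        )
        print(name, resp.choices[0].message.content)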