Ollama runs a local web server that exposes an HTTP API for interacting with models: https://docs.ollama.com/quickstart
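For example, here is a minimal sketch of querying that API from Python, assuming the server is running on the default port 11434 and a model named "llama3" has already been pulled (swap in whatever model you have locally):

```python
import json
import urllib.request

# Minimal sketch: call Ollama's local HTTP API on the default port.
# Assumes the server is running and the "llama3" model has been pulled.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3",          # any locally pulled model name works here
        "prompt": "Why is the sky blue?",
        "stream": False,            # return one JSON object instead of a stream
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```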
You can also use the Kubernetes operator to run models on a cluster: https://ollama-operator.ayaka.io/pages/en/
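The operator works by watching a custom resource that declares which model to serve. A hedged sketch of such a manifest is below; the apiVersion, kind, and field names are assumptions to verify against the linked docs:

```yaml
# Sketch of an ollama-operator Model resource.
# apiVersion/kind/spec fields are assumptions -- confirm against the operator docs.
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi   # the Ollama model for the operator to pull and serve
```

Applying a manifest like this with `kubectl apply -f model.yaml` would have the operator pull the model and serve it inside the cluster.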