llama.cpp and llama-swap do this better than Ollama and with far more control.
You don't even need llama-swap anymore, now that llama-server supports the same functionality.