Nice one! Let's say I'm serving local models via vllm (because ollama comes with huge performance hits), how would I implement that in gomodel?

▲ devmor 3 hours ago | parent [-]

This is way more interesting to me as well. I have projects that use small limited-purpose language models that run on local network servers and something like this project would be a lot simpler than manually configuring API clients for each model in each project.

	▲	santiago-pl 3 hours ago \| parent [-]
		Thanks for raising it! Since vLLM has an OpenAI-compatible API, this should work for now: `docker run --rm -p 8080:8080 \ -e OPENAI_API_KEY="some-vllm-key-if-needed" \ -e OPENAI_BASE_URL="http://host.docker.internal:11434/v1" \ ... enterpilot/gomodel` I'll add a more convenient way to configure it in the coming days.