check on same port, there is an OpenAI API https://github.com/ggml-org/llama.cpp/tree/master/tools/serv...
Good stuff, thanx!