Remix.run Logo
tarruda 7 days ago

Download llama-server from llama.cpp Github and install it some PATH directory. AFAIK they don't have an automated installer, so that can be intimidating to some people

Assuming you have llama-server installed, you can download + run a hugging face model with something like

    llama-server -hf ggml-org/gpt-oss-20b-GGUF -c 0 -fa --jinja

And access http://localhost:8080