AndroTux 5 hours ago

Exactly. The blog post states that the alternatives listed are similarly intuitive. They are not. If you just need a chat app, then sure, there are plenty of options. But if you want an OpenAI-compatible API with model management, accessibility breaks down fast.

I’m open to suggestions, but the alternatives outlined in the blog post ain’t it.

mentalgear 5 hours ago | parent | next [-]

The reported alternatives seem pretty user-friendly to me:

> LM Studio gives you a GUI if that’s what you want. It uses llama.cpp under the hood, exposes all the knobs, and supports any GGUF model without lock-in.

> Jan (https://www.jan.ai/) is another open-source desktop app with a clean chat interface and local-first design.

> Msty (https://msty.ai/) offers a polished GUI with multi-model support and built-in RAG. koboldcpp is another option with a web UI and extensive configuration options.

API-wise: LM Studio provides a REST API and an OpenAI-compatible API, among others.
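To make "OpenAI-compatible" concrete, here is a minimal sketch of talking to a local server that speaks the OpenAI chat completions protocol. It assumes LM Studio's local server is running on its default base URL (http://localhost:1234/v1; the port is configurable in the app) and that a model is already loaded; the model name is a placeholder:

```python
import json
import urllib.request

# Assumption: LM Studio's local server on its default port 1234.
BASE_URL = "http://localhost:1234/v1"

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request without sending it."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",  # standard OpenAI-style endpoint
        data=body,                       # POST is implied by the data payload
        headers={"Content-Type": "application/json"},
    )

def ask(base_url: str, model: str, prompt: str) -> str:
    """Send the request and extract the assistant's reply."""
    req = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires the server to be running with a model loaded):
# print(ask(BASE_URL, "local-model", "Say hello in one word."))
```

Because the protocol is the same, pointing the base URL at any other OpenAI-compatible server (llama-server, for instance) should work unchanged.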

shantnutiwari 2 hours ago | parent [-]

All of those options were either too slow or didn't work for me (Intel Mac). I could have spent hours googling, but I downloaded Ollama and it just worked.

So no, they are not alternatives to Ollama.

adrian_b an hour ago | parent | prev | next [-]

What you say was true in the past.

As other posters report, llama-server now implements an OpenAI-compatible API, and you can also connect to it with any web browser.

I have not yet tried the OpenAI API, but it should have eliminated Ollama's last advantage.

I do not believe that the Ollama "curated" models are significantly easier for a newbie to use than models downloaded directly from Huggingface.

On Huggingface you get far more detail about each model, which helps you navigate the jungle of countless model variants and find what is most suitable for you.

The point criticized in TFA, that the Ollama "curated" list can be misleading about the characteristics of the models, is a very serious one from my point of view, and it is enough for me not to use such "curated" models.

I am not aware of any way to choose and download the right model for local inference that is superior to using the Huggingface site directly.

I believe that choosing a model is the most intimidating part for a newbie who wants to run inference locally.

If a good choice is made, downloading the model, installing llama.cpp, and running llama-server are trivial actions that require minimal skill.

homarp 4 hours ago | parent | prev | next [-]

like someone said above: brew install llama.cpp

llama-server -hf ggml-org/gemma-3n-E4B-it-GGUF --port 8000 (with MCP support and web chat interface)

and you have OpenAI API on the same 8000 port. (https://github.com/ggml-org/llama.cpp/tree/master/tools/serv... lists the endpoints)
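As a quick sanity check after starting the server, you can probe it from a few lines of Python. This is a sketch, assuming llama-server from the command above is listening on port 8000; /health and /v1/models are among the endpoints llama-server exposes, though the exact response shapes may vary by version:

```python
import json
import urllib.request

# Assumption: llama-server started with --port 8000 as in the command above.
BASE = "http://localhost:8000"

def endpoint(path: str) -> str:
    """Join the server base URL with an endpoint path."""
    return BASE.rstrip("/") + "/" + path.lstrip("/")

def get_json(path: str):
    """GET an endpoint and decode the JSON response."""
    with urllib.request.urlopen(endpoint(path)) as resp:
        return json.load(resp)

# Examples (require llama-server to be running):
# print(get_json("/health"))     # readiness check
# print(get_json("/v1/models"))  # OpenAI-style model listing
```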

Philip-J-Fry 3 hours ago | parent | prev [-]

What do you mean?

LM Studio is listed as an alternative. It offers a chat UI and a model server supporting OpenAI, Anthropic, and LM Studio API interfaces. It supports loading models on demand or picking which models you want loaded. And you can tweak every parameter.

And it uses llama.cpp which is the whole point of the blog post.