refulgentis 5 days ago
Nah, llama.cpp is stable. llama.cpp also got GPT-OSS early, like Ollama. There's a lot of extremely subtle politics going on in the link. Suffice it to say, as a commercial entity, there's a very clever way to put your thumb on the scale of what works and what doesn't without it being obvious to anyone involved, even the thumb.
hodgehog11 5 days ago
Stable for a power user, or stable for everyone? I don't have links on hand, but I could swear there have been recent instances where support for certain models regressed during llama.cpp development. Also, llama.cpp adds features and support on a near-daily basis; how can that be LTS? Don't get me wrong, llama.cpp is an amazing tool. But its development is nowhere near as cautious as something like the Linux kernel's, so there is room for a more stable alternative. Not saying Ollama will be that, but llama.cpp won't be everything to everyone.
mhitza 5 days ago
llama.cpp still doesn't support gpt-oss tool calling. https://github.com/ggml-org/llama.cpp/pull/15158 (among other similar PRs) But I also couldn't get vLLM, or transformers serve, or Ollama (400 response on /v1/chat/completions) working today with gpt-oss. OpenAI's cookbooks aren't really copy-paste instructions. They probably tested on a single platform with preinstalled Python packages which they forgot to mention :))
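For anyone wanting to reproduce the tool-calling failure against a local OpenAI-compatible server: a minimal sketch of the kind of /v1/chat/completions payload involved, assuming a hypothetical local endpoint and model name (both are assumptions, adjust for your setup).

```python
import json

# Hypothetical local OpenAI-compatible endpoint; llama.cpp's server, Ollama,
# and vLLM all expose this path, but the port and model name here are assumptions.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_tool_call_request(model="gpt-oss-20b"):
    """Build a minimal chat-completions payload that exercises tool calling.

    The 'tools' array follows the OpenAI function-calling schema; servers that
    don't support tool calling for a given model tend to 400 on this shape.
    """
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": "What's the weather in Paris?"}
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool for illustration
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_tool_call_request()
print(json.dumps(payload, indent=2))
# POST this JSON to BASE_URL (e.g. with requests or curl) and check whether
# the response contains a tool_calls entry or an error.
```

If the server handles it, the assistant message in the response should carry a `tool_calls` entry naming `get_weather`; a 400 at this step is the failure mode described above.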