More than one year in and Ollama still doesn't support Vulkan inference. Vulkan is essential for consumer hardware. Ollama is a failed project at this point: https://news.ycombinator.com/item?id=42886680

▲

zozbot234 6 days ago | parent [-]

There's an open pull request https://github.com/ollama/ollama/pull/9650 but it needs to be forward ported/rebased to the current version before the maintainers can even consider merging it.

Also realistically, Vulkan Compute support mostly helps iGPU's and older/lower-end dGPU's, which can only bring a modest performance speed up in the compute-bound preprocessing phase (because modern CPU inference wins in the text-generation phase due to better memory bandwidth). There are exceptions such as modern Intel dGPU's or perhaps Macs running Asahi where Vulkan Compute can be more broadly useful, but these are also quite rare.

▲

buyucu 6 days ago | parent [-]

That pull request has been open for more than a year. The owner rebased multiple times but eventually gave up because Ollama devs just don't care.

▲

zozbot234 6 days ago | parent [-]

That's not a helpful point of view. It's the contributors' job to keep a pull request up to date as the codebase evolves, a maintainer is under no obligation to accept a PR that has long become out of date and unmergeable.

	▲	buyucu 5 days ago \| parent [-]
		The PR was in good shape. Ollama devs ignored it, and the original author rebased it multiple times. Since Ollama devs don't care, he just gave up after a while. Ollama is in a very sad state. The project is dysfunctional.