Wrong layer. Vulkan is a graphics and compute API, while Lemonade is an LLM server, so comparing them makes about as much sense as comparing sockets to nginx. If your goal is to run local models without writing half the stack yourself, compare Lemonade to Ollama or vLLM.

metalliqaz 7 hours ago

I was talking about ROCm vs Vulkan. On AMD GPUs, Vulkan has been commonly recognized as the faster API for some time. Both have been slower than CUDA because most of the hosting projects focus entirely on Nvidia. The parent post seemed to indicate that newer ROCm releases are better.

	naasking 6 hours ago
		Yes, Vulkan is currently faster due to some ROCm regressions: https://github.com/ROCm/ROCm/issues/5805#issuecomment-414161... ROCm should be faster in the end, if they ever fix those issues.