metalliqaz | 8 hours ago

Perhaps I should just google it, but I'm under the impression that ollama uses llama.cpp internally, not the other way around. Thanks for that data point; I should experiment with ROCm.
cpburns2009 | 7 hours ago

I meant ollama uses llama.cpp internally. Sorry for the confusion.
naasking | 6 hours ago

From what I understand, the ROCm 7.x series is a lot buggier and has performance regressions on many GPUs. Vulkan performance for LLMs is apparently not far behind ROCm, and it is far more stable and predictable at this time.
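If you want to compare the two backends directly, llama.cpp can be built against either. A minimal sketch; the flag names have changed across llama.cpp versions (older builds used LLAMA_HIPBLAS/LLAMA_VULKAN), so check the current build docs, and the gfx target below is just an example:

  # Vulkan backend
  cmake -B build-vulkan -DGGML_VULKAN=ON
  cmake --build build-vulkan --config Release -j

  # ROCm/HIP backend (set AMDGPU_TARGETS to your GPU's arch, e.g. gfx1100 for RDNA3)
  cmake -B build-rocm -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100
  cmake --build build-rocm --config Release -j

Running llama-bench from each build against the same .gguf model should give a reasonably apples-to-apples tokens/sec comparison.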