Filligree 2 days ago

It sometimes works.

wkat4242 2 days ago | parent [-]

How so? It's rock solid for me. I use ollama, but it's based on llama.cpp.

It's quite fast too, probably because that card has fast HBM2 memory (it has the same memory bandwidth as a 4090). And it was really cheap, since it was deeply discounted as an outgoing model.
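
For anyone wondering what "using ollama" amounts to in practice, here's a rough sketch of querying the local Ollama HTTP API from Python. It assumes the server is running on the default port 11434 and that a model has already been pulled; the model name is just an example, not necessarily what I run:

    # Rough sketch: one-shot generation against a local Ollama server.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",              # example model name (assumption)
            "prompt": "Why is HBM2 memory fast?",
            "stream": False,                # single JSON reply instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])          # the generated text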

Filligree 2 days ago | parent | next [-]

"Sometimes" as in "on some cards". You're having luck with yours, but that doesn't mean it's a good place to build a community.

wkat4242 2 days ago | parent [-]

Ah, I see. Yes, but you pick the card for the purpose, of course. I also don't like how limited their ROCm card support is. But when it works, it works well.

I have Nvidia cards too, by the way: a 4090 and a 3060 (I use the latter for AI as well, mainly for Whisper, because faster-whisper doesn't support ROCm right now).
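
For reference, running Whisper through faster-whisper on the CUDA card looks roughly like this (the model size, file name, and options are placeholders, not necessarily what I use):

    # Rough sketch: transcription with faster-whisper on a CUDA GPU.
    from faster_whisper import WhisperModel

    model = WhisperModel("small", device="cuda", compute_type="float16")
    segments, info = model.transcribe("meeting.wav", beam_size=5)

    print(f"Detected language: {info.language}")
    for seg in segments:                    # segments is a lazy generator
        print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")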

halJordan 2 days ago | parent | prev [-]

Aside from the fact that gfx906 is one of the blessed architectures mentioned (so why would it not work), how do you look at your specific instance and then turn around and say "All of you are lying, it works perfectly"? How do you square that circle in your head?

wkat4242 a day ago | parent [-]

No, I was just a bit thrown by the "sometimes"; I thought they were referring to a reliability issue. I am aware of the limited card support in ROCm, and I complained about that elsewhere in the thread too.

Also, I didn't accuse anyone of lying; no need to be so confrontational. And my remark to the original poster at the top was from before they clarified their post.

I just don't really see what AMD can do to make ollama work better, other than porting ROCm to all their cards, which is definitely something they should do.

And no, I'm not an AMD fanboi. I have no loyalty to anyone, any company, or any country.