| ▲ | thrtythreeforty 3 days ago |
| This ticket, finally closed after being open for two years, is a pretty good microcosm of the problem: https://github.com/ROCm/ROCm/issues/1714. Users complain that the docs don't even specify which cards work. But it goes deeper: a valid complaint is "this only supports one or two consumer cards!" A common rebuttal is that it works fine on lots of AMD cards if you set an environment flag to force the GPU architecture selection. The fact that this is so close to working on a wide variety of hardware, and yet doesn't, is exactly the vibe you get from the whole ecosystem. |
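(Editor's note: the "environment flag" usually cited in these threads is `HSA_OVERRIDE_GFX_VERSION`, which tells ROCm's HSA runtime to report a different GPU ISA than the card actually advertises. A minimal sketch of the workaround, assuming an RDNA2 consumer card; the right value depends on your GPU generation:)

```shell
# Unofficial workaround: make ROCm treat an unsupported consumer card
# as a blessed gfx target. 10.3.0 (gfx1030) is the value commonly
# suggested for RDNA2 cards; pick the target closest to your own chip.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
echo "HSA_OVERRIDE_GFX_VERSION=${HSA_OVERRIDE_GFX_VERSION}"
# Then run the ROCm-enabled application in this shell, e.g.:
#   ./llama-cli -m model.gguf -ngl 99
```

This is a spoofing hack, not official support: it works when the real ISA is close enough to the forced one, and can crash or silently miscompute otherwise.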
|
| ▲ | iforgotpassword 3 days ago | parent | next [-] |
| What I don't get is why they don't at least assign a dev or two to make the poster child of this space work: llama.cpp. It's the first thing anyone tries when dabbling in AI or GPU compute, yet it's a clusterfuck to get working. A few blessed cards work, with the proper drivers and kernel; others just crash, perform horribly slowly, or output GGGGGGGGGGGGGG to every input (I'm not making this up!). Then you LOL, dump it, go buy Nvidia, et voila, stuff works on the first try. |
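(Editor's note: for anyone hitting this, a hedged sketch of a ROCm build of llama.cpp. The flag names have moved around between versions; older Makefile builds used `LLAMA_HIPBLAS=1`, newer CMake builds use `-DGGML_HIP=ON`, so treat this as a starting point rather than gospel:)

```shell
# Your card's ISA target; `rocminfo` reports it (e.g. gfx906, gfx1030).
GPU_TARGET=gfx1030
echo "building llama.cpp for ${GPU_TARGET}"
# Recent llama.cpp, CMake HIP build (commands commented out here,
# since they require a ROCm toolchain to actually run):
#   cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=${GPU_TARGET}
#   cmake --build build -j
```

If the binary then emits garbage (the GGGG... symptom above), a mismatched or overridden gfx target is one of the usual suspects.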
| |
| ▲ | wkat4242 3 days ago | parent [-] | | It does work; I have it running on my Radeon VII Pro. | | |
| ▲ | Filligree 2 days ago | parent [-] | | It sometimes works. | | |
| ▲ | wkat4242 2 days ago | parent [-] | | How so? It's rock solid for me. I use ollama, but it's based on llama.cpp. It's quite fast too, probably because that card has fast HBM2 memory (about the same memory bandwidth as a 4090). And it was really cheap, since it was on deep sale as an outgoing model. | | |
| ▲ | Filligree 2 days ago | parent | next [-] | | "Sometimes" as in "on some cards". You're having luck with yours, but that doesn't mean it's a good place to build a community. | | |
| ▲ | wkat4242 2 days ago | parent [-] | | Ah, I see. Yes, but you pick the card for the purpose, of course. I also don't like how limited the card support is on ROCm. But when it works, it works well. I have Nvidia cards too, by the way: a 4090 and a 3060 (the latter I also use for AI, but more for Whisper, because faster-whisper doesn't do ROCm right now). |
| |
| ▲ | halJordan 2 days ago | parent | prev [-] | | Aside from the fact that gfx906 is one of the blessed architectures mentioned (so of course it works for you): how do you look at your specific instance and then turn around and say "all of you are lying, it works perfectly"? How do you square that circle in your head? | | |
| ▲ | wkat4242 2 days ago | parent [-] | | No, I was just a bit thrown by the "sometimes"; I thought they were referring to a reliability issue. I am aware of the limited card support in ROCm, and I complained about it elsewhere in the thread too. Also, I didn't accuse anyone of lying; no need to be so confrontational. And my remark to the original poster at the top was from before they clarified their post. I just don't really see what AMD can do to make ollama work better, other than porting ROCm to all their cards, which is definitely something they should do. And no, I'm not an AMD fanboi. I have no loyalty to anyone: not to any company or any country. |
|
| ▲ | mook 3 days ago | parent | prev | next [-] |
| I suspect part of it is also that Nvidia actually does a lot of things in firmware that can be upgraded. The new Nvidia Linux drivers (the "open" ones) support Turing cards from 2018, which means chips that old already do much of the processing in firmware. AMD keeps having issues because their drivers talk to the hardware directly, so the drivers are massive, bloated messes, famous for pages of auto-generated register definitions. That likely makes anything much harder to fix. |
| |
| ▲ | Evil_Saint 3 days ago | parent | next [-] | | Having worked at both Nvidia and AMD, I can assure you that both feature lots of generated header files. | |
| ▲ | bgnn 3 days ago | parent | prev [-] | | Hmm, that is interesting. Can you elaborate on what exactly is different between them? I'm asking because I think firmware has to talk to hardware directly through the lower HAL (hardware abstraction layer), while customer-facing parts should be fairly isolated in the upper HAL. Some companies like to add direct HW access to the customer interface via more complex functions (often a recipe made out of lower-HAL functions), which I have always disliked; I prefer to isolate the lower-level functions and memory space from the user. In any case, both Nvidia and AMD should have very similar FW capabilities, so I don't know what I'm missing here. | | |
| ▲ | Evil_Saint 3 days ago | parent | next [-] | | I worked on drivers at both companies. The programming models are quite different: both make GPUs, but they were designed by different groups of people who made different decisions. For example, Nvidia cards are much easier to program in the user-mode driver. You cannot hang an Nvidia GPU with a bad memory access (you can hang the display engine with one, though), at least when I was there. You can hang an AMD GPU with a bad memory access, at least up to Navi 3x. | |
| ▲ | raxxorraxor 3 days ago | parent | prev [-] | | Why isolate those functions? That always cripples capabilities. With well-designed interfaces it doesn't lead to a mess, and you get a more powerful device. Of course these lower-level functions shouldn't be essential, but especially in these times you almost have to provide an interface at that level or be left behind by other environments. |
|
|
|
| ▲ | citizenpaul 3 days ago | parent | prev | next [-] |
| I've thought about this myself and come to a conclusion that your link reinforces. As I understand it, most companies doing (EE) hardware design and production treat (CS) software as a second-class citizen at the company. It looks like AMD, after all this time competing with Nvidia, still has not learned the lesson. That said, I have never worked in hardware, so I'm going on what I've heard from other people. Nvidia, while far from perfect, has kept its software quality well ahead of AMD's for over 20 years, while AMD repeatedly falls on its face and gets egg all over itself, again and again, as far as software goes. My guess is that Nvidia has internally found a way to keep the software people from feeling "less than" the people designing the hardware. Sounds easy, but apparently it isn't. AKA a management problem. |
| |
| ▲ | bgnn 3 days ago | parent [-] | | This is correct, but one of the reasons is that the SWEs at HW companies live in their own bubble; they somehow don't follow the rest of the SW world. I'm a chip design engineer, and I get frustrated with the garbage the SW/FW team comes up with, to the extent that I write my own FW library for my blocks. While doing that I try to learn the best practices and do quite a bit of research. Another reason is that, until not long ago, SW was only FW serving the HW, so there was almost no input from SW into HW development. This is clearly changing, but some companies, like Nvidia, are ahead of the pack. Even Apple's SoC team is quite HW-centric compared to Nvidia's. |
|
|
| ▲ | Covzire 3 days ago | parent | prev | next [-] |
| That reeks of gross incompetence somewhere in the organization. It's like a hosting company with a customer suffering very poor performance, who greatly overpays to work around it, while the whole time nobody even thinks to check what the Linux swap file is doing. |
| |
|
| ▲ | CoastalCoder 3 days ago | parent | prev | next [-] |
| I had a similar (I think) experience when building LLVM from source a few years ago. I kept running into some problem with LLVM's support for HIP code, even though I had no interest in that functionality. I realize this isn't exactly an AMD problem, but IIRC it was AMD who contributed the troublesome code to LLVM, and it remained unfixed. Apologies if there's something unfair or uninformed in what I wrote; it's been a while. |
|
| ▲ | tomrod 3 days ago | parent | prev [-] |
| Geez. If I were Berkshire Hathaway looking to invest in the GPU market, this would be a major red flag in my fundamentals analysis. |