▲	bigyabai 3 months ago
		I should point out here, if nobody has already; Nvidia's GPU designs are extremely complicated compared to what AMD and Apple ship. The "standard" is to ship a PCIe card with display handling drivers and some streaming multiprocessor hardware to process your framebuffers. Nvidia goes even further by adding additional accelerators (ALUs by way of CUDA core and tensor cores), onboard RTOS management hardware (what Nvidia calls GPU System Processor), and more complex userland drivers that very well might be able to manage atomics without any PCIe standards. This is also one of the reasons AMD and Apple can't simply turn their ship around right now. They've both invested heavily in simplifying their GPU and removing a lot of the creature-comforts people pay Nvidia for. 10 years ago we could at least all standardize on OpenCL, but these days it's all about proprietary frameworks and throwing competitors under the bus.
	▲	kimixa 3 months ago \| parent [-]
		FYI AMD also has similar "accelerators", with the 9070 having separate ALU paths for wmma ("tensor") operations much like Nvidia's model - older RDNA2/3 architectures had accelerated instructions but used the "normal" shader ALUs, if a bit beefed up and tweaked to support multiple smaller data types. And CUDA cores are just what Nvidia call their normal shader cores. Pretty much every subunit on a geforce has a direct equivalent on a radeon - they might be faster/slower or more/less capable, but they're there and often at an extremely similar level of the design. AMD also have on-die microcontrollers (multiple, actually) that do things like scheduling or pipeline management, again just like Nvidia's GSP. It's been able to schedule new work on-GPU with zero host system involvement since the original GCN, something that Nvidia advertise as "new" with them introducing their GSP (which just replaced a slightly older, slightly less capable controller rather than being /completely/ new too) The problem is that AMD are a software follower right now - after decades of under-investment they're behind on the treadmill just trying to keep up, so when the Next Big Thing inevitably pops up they're still busy polishing off the Last Big Thing. I've always seen AMD as a hardware company, with the "build it and they will come" approach - which seems to have worked for the big supercomputers who likely find it worth investing in their own modified stack to get that last few %, but clearly falls down selling to "mere" professionals. Nvidia, however, support the same software APIs on even the lowest end hardware, while nobody is likely running much on their laptop's 3050m in anger, it offers a super easy on-ramp for developers - and it's easy to mistake familiarity with superiority - you already know to avoid the warts so you don't get burned by them. And believe me, CUDA has plenty of warts. And marketing - remember "Technical Marketing" is still marketing - and to this day lots of people believe that the marketing name for something, or branding a feature, implies anything about the underlying architecture design - go to an "enthusiast" forum and you'll easily find people claiming that because Nvidia call their accelerator a "core" means it's somehow superior/better/"more accelerated" than the direct equivalent on a competitor, or actually believe that it just doesn't support hardware video encoding as it "Doesn't Have NVENC" (again, GCN with video encoding was released before a Geforce with NVENC). Same with branding - AMD hardware can already read the display block's state and timestamp in-shader, but Everyone Knows Nvidia Introduced "Flip Metering" With Blackwell!